
Dong L, Cai X, Ge H, Sun L, Pan X, Sun F, Meng Q

PubMed · Aug 25, 2025
To develop and evaluate a Dual-modality Complementary Feature Attention Network (DCFAN) that integrates spatial and stiffness information from B-mode ultrasound and shear wave elastography (SWE) for improved breast tumor classification and axillary lymph node (ALN) metastasis prediction. A total of 387 paired B-mode and SWE images from 218 patients were retrospectively analyzed. The proposed DCFAN incorporates attention mechanisms to effectively fuse structural features from B-mode ultrasound with stiffness features from SWE. Two classification tasks were performed: (1) differentiating benign from malignant tumors, and (2) classifying benign tumors, malignant tumors without ALN metastasis, and malignant tumors with ALN metastasis. Model performance was assessed using accuracy, sensitivity, specificity, and AUC, and compared with conventional CNN-based models and two radiologists with varying experience. In Task 1, DCFAN achieved an accuracy of 94.36% ± 1.45% and the highest AUC of 0.97. In Task 2, it attained 91.70% ± 3.77% accuracy and an average AUC of 0.83. The multimodal approach significantly outperformed the single-modality models in both tasks. Notably, in Task 1, DCFAN demonstrated higher specificity (94.9%) compared to the experienced radiologist (p = 0.002), and yielded higher F1-scores than both radiologists. It also outperformed several state-of-the-art deep learning models in diagnostic accuracy. DCFAN demonstrated robust and superior performance over existing CNN-based methods and radiologists in both breast tumor classification and ALN metastasis prediction. This approach may serve as a valuable assistive tool to enhance diagnostic accuracy in breast ultrasound.
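The abstract does not include implementation details; the following is a minimal PyTorch sketch of the general idea of attention-weighted fusion of a structural (B-mode) branch and a stiffness (SWE) branch. All layer sizes, channel counts, and the fusion scheme are illustrative assumptions, not the published DCFAN architecture.

```python
# Minimal sketch: attention-weighted fusion of B-mode and SWE features.
# Layer sizes and the fusion scheme are assumptions, not the published DCFAN.
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    def __init__(self, num_classes: int = 2, channels: int = 64):
        super().__init__()
        # One lightweight convolutional branch per modality.
        self.bmode_branch = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.swe_branch = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),  # SWE assumed as a 3-channel color map
            nn.AdaptiveAvgPool2d(1))
        # Channel attention decides how much each modality contributes.
        self.attention = nn.Sequential(
            nn.Linear(2 * channels, 2), nn.Softmax(dim=1))
        self.classifier = nn.Linear(channels, num_classes)

    def forward(self, bmode: torch.Tensor, swe: torch.Tensor) -> torch.Tensor:
        fb = self.bmode_branch(bmode).flatten(1)            # (N, C) structural features
        fs = self.swe_branch(swe).flatten(1)                 # (N, C) stiffness features
        w = self.attention(torch.cat([fb, fs], dim=1))       # (N, 2) modality weights
        fused = w[:, :1] * fb + w[:, 1:] * fs                # attention-weighted fusion
        return self.classifier(fused)

# Task 2 setting: benign / malignant without ALN / malignant with ALN.
model = DualBranchFusion(num_classes=3)
logits = model(torch.randn(2, 1, 224, 224), torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 3])
```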

Guo L, Liu C, Soultanidis G

PubMed · Aug 25, 2025
Motion in clinical positron emission tomography (PET) examinations degrades image quality and quantification, requiring tailored correction strategies. Recent advancements integrate external devices and/or data-driven motion tracking with image registration and motion modeling, particularly deep learning-based methods, to address complex motion scenarios. The development of total-body PET systems with long axial field-of-view enables advanced motion correction by leveraging extended coverage and continuous acquisition. These innovations enhance the accuracy of motion estimation and correction across various clinical applications, improve quantitative reliability in static and dynamic imaging, and enable more precise assessments in oncology, neurology, and cardiovascular PET studies.

Wang X, Li P, Li Y, Zhang R, Duan F, Wang D

PubMed · Aug 25, 2025
To develop and validate predictive models based on ¹⁸F-fluorodeoxyglucose positron emission tomography/computed tomography (¹⁸F-FDG PET/CT) radiomics and a clinical model for differentiating invasive adenocarcinoma (IAC) from non-invasive ground-glass nodules (GGNs) in early-stage lung cancer. A total of 164 patients with GGNs histologically confirmed as part of the lung adenocarcinoma spectrum (including both invasive and non-invasive subtypes) underwent preoperative ¹⁸F-FDG PET/CT and surgery and were included in the analysis. Radiomic features were extracted from PET and CT images. Models were constructed using support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost) classifiers. Five predictive models (CT, PET, PET/CT, Clinical, Combined) were evaluated using receiver operating characteristic (ROC) curves, decision curve analysis (DCA), and calibration curves. Statistical comparisons were performed using DeLong's test, net reclassification improvement (NRI), and integrated discrimination improvement (IDI). The Combined model, integrating PET/CT radiomic features with the clinical model, achieved the highest diagnostic performance (AUC: 0.950 in training, 0.911 in testing). It consistently showed superior IDI and NRI across both cohorts and significantly outperformed the clinical model (DeLong p = 0.027), confirming its enhanced predictive power through multimodal integration. A clinical nomogram was constructed from the final model to support individualized risk stratification. Integrating PET/CT radiomic features with a clinical model significantly enhances the preoperative prediction of GGN invasiveness. This multimodal approach may assist in preoperative risk stratification and support personalized surgical decision-making in early-stage lung adenocarcinoma.
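As a rough illustration of the modeling step, the sketch below fits the three classifiers named in the abstract (SVM, RF, XGBoost) on a placeholder radiomic feature matrix and reports test AUC. The feature extraction, the actual cohort, and the clinical model are not reproduced; it assumes scikit-learn and xgboost are installed, and the data here are random stand-ins.

```python
# Hedged sketch of the classifier comparison on placeholder radiomic features.
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(164, 50))        # placeholder radiomic feature matrix
y = rng.integers(0, 2, size=164)      # 1 = invasive adenocarcinoma (IAC), 0 = non-invasive

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "SVM": SVC(probability=True),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "XGBoost": XGBClassifier(n_estimators=200, eval_metric="logloss"),
}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```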

Utsav Ratna Tuladhar, Richard Simon, Doran Mix, Michael Richards

arXiv preprint · Aug 25, 2025
Abdominal aortic aneurysms (AAA) pose a significant clinical risk due to their potential for rupture, which is often asymptomatic but can be fatal. Although maximum diameter is commonly used for risk assessment, diameter alone is insufficient because it does not capture the material properties of the vessel wall, which play a critical role in determining rupture risk. To overcome this limitation, we propose a deep learning-based framework for elasticity imaging of AAAs with 2D ultrasound. Leveraging finite element simulations, we generate a diverse dataset of displacement fields with their corresponding modulus distributions. We train a model with a U-Net architecture and a normalized mean squared error (NMSE) loss to infer the spatial modulus distribution from the axial and lateral components of the displacement fields. This model is evaluated across three experimental domains: digital phantom data from 3D COMSOL simulations, physical phantom experiments using biomechanically distinct vessel models, and clinical ultrasound exams from AAA patients. Our simulated results demonstrate that the proposed deep learning model is able to reconstruct modulus distributions, achieving an NMSE score of 0.73%. Similarly, in the physical phantom experiments, the predicted modulus ratio closely matches the expected values, affirming the model's ability to generalize beyond simulation. We compare our approach with an iterative method, which shows comparable performance but higher computation time. In contrast, the deep learning method provides quick and effective estimates of tissue stiffness from ultrasound images, which could help assess the risk of AAA rupture without invasive procedures.
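The NMSE objective mentioned in the abstract can be written compactly; the sketch below is a hedged PyTorch version, with a single convolution standing in for the U-Net and random tensors standing in for the displacement and modulus data.

```python
# Sketch of the normalized mean squared error (NMSE) objective:
# NMSE = ||pred - target||^2 / ||target||^2, averaged over the batch.
import torch

def nmse_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    err = (pred - target).flatten(1).pow(2).sum(dim=1)   # squared error per sample
    ref = target.flatten(1).pow(2).sum(dim=1)            # target energy per sample
    return (err / (ref + eps)).mean()

# Toy usage: 2-channel input (axial + lateral displacement), 1-channel modulus map.
disp = torch.randn(4, 2, 128, 128)
modulus_true = torch.randn(4, 1, 128, 128)
net = torch.nn.Conv2d(2, 1, kernel_size=3, padding=1)    # stand-in for the U-Net
loss = nmse_loss(net(disp), modulus_true)
loss.backward()
print(float(loss))
```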

Le Zhang, Fuping Wu, Arun Thirunavukarasu, Kevin Bronik, Thomas Nichols, Bartlomiej W. Papiez

arXiv preprint · Aug 25, 2025
Large annotated datasets are vital for training segmentation models, but pixel-level labeling is time-consuming, error-prone, and often requires scarce expert annotators, especially in medical imaging. In contrast, coarse annotations are quicker, cheaper, and easier to produce, even by non-experts. In this paper, we propose to use coarse drawings of both positive (target) and negative (background) classes in the image, even with noisy pixels, to train a convolutional neural network (CNN) for semantic segmentation. We present a method for learning the true segmentation label distributions from purely noisy coarse annotations using two coupled CNNs; the two networks are kept separate by enforcing high fidelity to the characteristics of the noisy training annotations. We further add a complementary label-learning term that encourages estimation of the negative label distribution. To illustrate the properties of our method, we first use a toy segmentation dataset based on MNIST. We then present quantitative results on publicly available datasets: the Cityscapes dataset for multi-class segmentation, and retinal images for medical applications. In all experiments, our method outperforms state-of-the-art methods, particularly when the proportion of coarse annotations is small relative to the available dense annotations.
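To make the complementary-label idea concrete, here is a small PyTorch sketch of one plausible form of the term: positive scribbles receive ordinary cross-entropy, while negative scribbles only assert "not the target class". The coupled-CNN training and label-distribution estimation from the paper are not reproduced, and this exact loss form is an assumption.

```python
# Hedged sketch of scribble supervision with a complementary (negative) term.
import torch
import torch.nn.functional as F

def scribble_loss(logits, pos_mask, neg_mask):
    """logits: (N, 2, H, W); pos_mask / neg_mask: (N, H, W) boolean scribbles."""
    probs = F.softmax(logits, dim=1)
    p_target = probs[:, 1]                                   # probability of the target class
    pos_term = -torch.log(p_target[pos_mask] + 1e-8)         # standard positive supervision
    neg_term = -torch.log(1.0 - p_target[neg_mask] + 1e-8)   # complementary: "not target here"
    return pos_term.mean() + neg_term.mean()

logits = torch.randn(1, 2, 64, 64, requires_grad=True)
pos = torch.zeros(1, 64, 64, dtype=torch.bool); pos[:, 20:30, 20:30] = True
neg = torch.zeros(1, 64, 64, dtype=torch.bool); neg[:, :5, :] = True
scribble_loss(logits, pos, neg).backward()
```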

Ravi Shankar Prasad, Dinesh Singh

arXiv preprint · Aug 25, 2025
Craniofacial reconstruction in forensics is one of the processes used to identify victims of crime and natural disasters. Identifying an individual from their remains plays a crucial role when all other identification methods fail. Traditional methods for this task, such as clay-based craniofacial reconstruction, require expert domain knowledge and are time-consuming. At the same time, other probabilistic generative models, such as the statistical shape model or the Basel face model, fail to capture cross-domain attributes of the skull and face. Given these limitations, we propose a generic framework for craniofacial reconstruction from 2D X-ray images. We use various generative models (e.g., CycleGANs and cGANs) and fine-tune the generator and discriminator to produce more realistic images in two distinct domains: the skull and the face of an individual. This is the first time 2D X-rays have been used as a representation of the skull by generative models for craniofacial reconstruction. We evaluate the quality of the generated faces using FID, IS, and SSIM scores. Finally, we propose a retrieval framework in which the query is the generated face image and the gallery is a database of real faces. Experimental results indicate that this can be an effective tool for forensic science.
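The retrieval step lends itself to a short sketch: rank the real-face gallery by cosine similarity to the embedding of the generated face. The embedding network (e.g., a pretrained face encoder) is assumed and not shown, and the vectors below are random placeholders.

```python
# Hedged sketch of the retrieval step: query = generated face, gallery = real faces.
import numpy as np

def rank_gallery(query_emb: np.ndarray, gallery_embs: np.ndarray) -> np.ndarray:
    """Return gallery indices sorted from most to least similar (cosine similarity)."""
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = g @ q                           # similarity of every gallery face to the query
    return np.argsort(-sims)

rng = np.random.default_rng(0)
query = rng.normal(size=512)               # embedding of the GAN-generated face (placeholder)
gallery = rng.normal(size=(100, 512))      # embeddings of 100 real faces (placeholder)
print(rank_gallery(query, gallery)[:5])    # top-5 candidate identities
```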

Fatemeh Ziaeetabar

arXiv preprint · Aug 25, 2025
Vision foundation models (FMs) have become the predominant architecture in computer vision, providing highly transferable representations learned from large-scale, multimodal corpora. Nonetheless, they exhibit persistent limitations on tasks that require explicit reasoning over entities, roles, and spatio-temporal relations. Such relational competence is indispensable for fine-grained human activity recognition, egocentric video understanding, and multimodal medical image analysis, where spatial, temporal, and semantic dependencies are decisive for performance. We advance the position that next-generation FMs should incorporate explicit relational interfaces, instantiated as dynamic relational graphs (graphs whose topology and edge semantics are inferred from the input and task context). We illustrate this position with cross-domain evidence from recent systems in human manipulation action recognition and brain tumor segmentation, showing that augmenting FMs with lightweight, context-adaptive graph-reasoning modules improves fine-grained semantic fidelity, out-of-distribution robustness, interpretability, and computational efficiency relative to FM-only baselines. Importantly, by reasoning sparsely over semantic nodes, such hybrids also achieve favorable memory and hardware efficiency, enabling deployment under practical resource constraints. We conclude with a targeted research agenda for FM-graph hybrids, prioritizing learned dynamic graph construction, multi-level relational reasoning (e.g., part-object-scene in activity understanding, or region-organ in medical imaging), cross-modal fusion, and evaluation protocols that directly probe relational competence in structured vision tasks.
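As a toy illustration of the proposed relational interface, the sketch below builds a dynamic relational graph whose edge weights are inferred from the node features themselves and performs one round of message passing. Dimensions and the scoring function are illustrative assumptions, not a specification from the cited systems.

```python
# Hedged sketch of a "dynamic relational graph" layer: edges are inferred from
# the input nodes, then used for one round of relational message passing.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicGraphLayer(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, nodes: torch.Tensor) -> torch.Tensor:
        # nodes: (N, num_nodes, dim) semantic entities pooled from an FM backbone.
        scores = self.q(nodes) @ self.k(nodes).transpose(1, 2) / nodes.shape[-1] ** 0.5
        adj = F.softmax(scores, dim=-1)        # input-dependent edge weights (the dynamic graph)
        return nodes + adj @ self.v(nodes)     # message passing over the inferred graph

layer = DynamicGraphLayer(dim=64)
out = layer(torch.randn(2, 16, 64))            # 16 semantic nodes per sample
print(out.shape)
```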

Xingyu Ai, Shaoyu Wang, Zhiyuan Jia, Ao Xu, Hongming Shan, Jianhua Ma, Qiegen Liu

arXiv preprint · Aug 25, 2025
During raw-data acquisition in CT imaging, diverse factors can degrade the collected sinograms, with undersampling and noise leading to severe artifacts and noise in reconstructed images and compromising diagnostic accuracy. Conventional correction methods rely on manually designed algorithms or fixed empirical parameters, but these approaches often lack generalizability across heterogeneous artifact types. To address these limitations, we propose UniSino, a foundation model for universal CT sinogram standardization. Unlike existing foundation models that operate in the image domain, UniSino standardizes data directly in the projection domain, which enables stronger generalization across diverse undersampling scenarios. Its training framework incorporates the physical characteristics of sinograms, enhancing generalization and enabling robust performance across multiple subtasks spanning four benchmark datasets. Experimental results demonstrate that UniSino achieves superior reconstruction quality in both single and mixed undersampling cases, exhibiting exceptional robustness and generalization in sinogram enhancement for CT imaging. The code is available at: https://github.com/yqx7150/UniSino.
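For readers unfamiliar with the projection-domain setting, the sketch below simulates the kind of sparse-view degradation UniSino targets, using scikit-image's Radon transform: a sinogram acquired at one sixth of the angles produces streak artifacts when reconstructed. This is only a didactic illustration of the problem, not the authors' pipeline.

```python
# Hedged illustration of sparse-view sinogram degradation (not the UniSino method).
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, rescale

image = rescale(shepp_logan_phantom(), 0.5)          # ground-truth test image
full_angles = np.linspace(0.0, 180.0, 180, endpoint=False)
sparse_angles = full_angles[::6]                      # 6x angular undersampling

sino_sparse = radon(image, theta=sparse_angles)       # degraded projection data (sinogram)
recon = iradon(sino_sparse, theta=sparse_angles, filter_name="ramp")
print("reconstruction RMSE:", np.sqrt(np.mean((recon - image) ** 2)))
```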

Nur Amirah Abd Hamid, Mohd Shahrizal Rusli, Muhammad Thaqif Iman Mohd Taufek, Mohd Ibrahim Shapiai, Daphne Teck Ching Lai

arXiv preprint · Aug 25, 2025
Accurate prediction of clinical scores is critical for early detection and prognosis of Alzheimer's disease (AD). While existing approaches primarily focus on forecasting the ADAS-Cog global score, they often overlook the predictive value of its sub-scores (13 items), which capture domain-specific cognitive decline. In this study, we propose a multi-task learning (MTL) framework that jointly predicts the global ADAS-Cog score and its 13 sub-scores at Month 24 using baseline MRI and longitudinal clinical scores from baseline and Month 6. The main goal is to examine how each sub-score, particularly those associated with MRI features, contributes to the prediction of the global score, an aspect largely neglected in prior MTL studies. We employ Vision Transformer (ViT) and Swin Transformer architectures to extract imaging features, which are fused with longitudinal clinical inputs to model cognitive progression. Our results show that incorporating sub-score learning improves global score prediction. Sub-score-level analysis reveals that a small subset, especially Q1 (Word Recall), Q4 (Delayed Recall), and Q8 (Word Recognition), consistently dominates the predicted global score. However, some of these influential sub-scores exhibit high prediction errors, pointing to model instability. Further analysis suggests that this is caused by clinical feature dominance, where the model prioritizes easily predictable clinical scores over more complex MRI-derived features. These findings emphasize the need for improved multimodal fusion and adaptive loss weighting to achieve more balanced learning. Our study demonstrates the value of sub-score-informed modeling and provides insights into building more interpretable and clinically robust AD prediction frameworks. (GitHub repository provided.)
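A minimal sketch of the joint objective implied by the abstract is given below: one regression head for the global ADAS-Cog score and one for the 13 sub-scores, trained with a weighted sum of MSE terms. The head sizes, fused features, and loss weighting are assumptions, not the authors' configuration.

```python
# Hedged sketch of the multi-task regression heads and joint loss.
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    def __init__(self, feat_dim: int = 256, num_subscores: int = 13):
        super().__init__()
        self.global_head = nn.Linear(feat_dim, 1)             # ADAS-Cog total at Month 24
        self.sub_head = nn.Linear(feat_dim, num_subscores)    # 13 item-level sub-scores

    def forward(self, features):
        return self.global_head(features).squeeze(-1), self.sub_head(features)

def mtl_loss(pred_global, pred_sub, true_global, true_sub, alpha: float = 0.5):
    mse = nn.functional.mse_loss
    return mse(pred_global, true_global) + alpha * mse(pred_sub, true_sub)

feats = torch.randn(8, 256)                 # placeholder for fused ViT/Swin + clinical features
head = MultiTaskHead()
g, s = head(feats)
loss = mtl_loss(g, s, torch.randn(8), torch.randn(8, 13))
loss.backward()
print(float(loss))
```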

Maham Nazir, Muhammad Aqeel, Francesco Setti

arXiv preprint · Aug 25, 2025
Medical image segmentation models struggle with rare abnormalities due to scarce annotated pathological data. We propose DiffAug, a novel framework that combines text-guided diffusion-based generation with automatic segmentation validation to address this challenge. Our approach uses latent diffusion models conditioned on medical text descriptions and spatial masks to synthesize abnormalities via inpainting on normal images. Generated samples undergo dynamic quality validation through a latent-space segmentation network that ensures accurate localization while enabling single-step inference. The text prompts, derived from medical literature, guide the generation of diverse abnormality types without requiring manual annotation. Our validation mechanism filters synthetic samples based on spatial accuracy, maintaining quality while operating efficiently through direct latent estimation. Evaluated on three medical imaging benchmarks (CVC-ClinicDB, Kvasir-SEG, REFUGE2), our framework achieves state-of-the-art performance with 8-10% Dice improvements over baselines and reduces false negative rates by up to 28% for challenging cases such as small polyps and flat lesions, which are critical for early detection in screening applications.
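The validation mechanism can be sketched as a simple filter: keep a synthetic sample only if a segmentation predicted for it overlaps the mask used to condition the generator. The sketch below uses a Dice threshold and a stand-in validator; the diffusion inpainting and the latent-space network themselves are not reproduced.

```python
# Hedged sketch of filtering synthetic samples by spatial accuracy (Dice overlap).
import numpy as np

def dice(a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> float:
    return float(2 * np.logical_and(a, b).sum() / (a.sum() + b.sum() + eps))

def filter_synthetic(samples, segment, dice_threshold: float = 0.7):
    """samples: list of (image, conditioning_mask); keep only spatially accurate ones."""
    kept = []
    for image, cond_mask in samples:
        pred_mask = segment(image)                       # validator's predicted lesion mask
        if dice(pred_mask, cond_mask) >= dice_threshold:
            kept.append((image, cond_mask))
    return kept

# Toy usage with a trivial "validator" that thresholds intensity.
rng = np.random.default_rng(0)
mask = np.zeros((64, 64), bool); mask[20:40, 20:40] = True
image = mask.astype(float) + 0.1 * rng.normal(size=(64, 64))
print(len(filter_synthetic([(image, mask)], segment=lambda im: im > 0.5)))
```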