Latest Papers on Radiology AI.

FCR: Investigating Generative AI models for Forensic Craniofacial Reconstruction

Ravi Shankar Prasad, Dinesh Singh

•preprint•Aug 25 2025

Craniofacial reconstruction in forensics is one of the processes used to identify victims of crime and natural disasters. Identifying an individual from their remains plays a crucial role when all other identification methods fail. Traditional methods for this task, such as clay-based craniofacial reconstruction, require expert domain knowledge and are time-consuming. At the same time, other probabilistic generative models like the statistical shape model or the Basel face model fail to capture the skull and face cross-domain attributes. Given these limitations, we propose a generic framework for craniofacial reconstruction from 2D X-ray images. Here, we use various generative models (e.g., CycleGANs and cGANs) and fine-tune the generator and discriminator parts to generate more realistic images in two distinct domains, the skull and the face of an individual. This is the first time 2D X-rays have been used as a representation of the skull by generative models for craniofacial reconstruction. We evaluate the quality of generated faces using FID, IS, and SSIM scores. Finally, we propose a retrieval framework where the query is the generated face image and the gallery is the database of real faces. Our experimental results indicate that this can be an effective tool for forensic science.

X-Ray Image Synthesis Methodology In Silico GenAI
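
The retrieval step described in the FCR abstract (generated face as query, real faces as gallery) can be illustrated with a minimal sketch. It assumes faces have already been mapped to fixed-length embedding vectors and that cosine similarity is the ranking score; the abstract states neither, and the function name `rank_gallery` and the toy vectors are hypothetical.

```python
import numpy as np

def rank_gallery(query, gallery):
    """Rank gallery embeddings by cosine similarity to a query embedding.

    query:   1-D array of shape (d,)
    gallery: 2-D array of shape (n, d), one row per real face
    Returns (indices sorted from most to least similar, similarity scores).
    """
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = g @ q                      # cosine similarity per gallery row
    return np.argsort(-scores), scores

# Toy embeddings (illustrative only, not from the paper).
gallery = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.7, 0.7, 0.0]])
query = np.array([0.9, 0.1, 0.0])

order, scores = rank_gallery(query, gallery)
print(order)  # gallery index 0 is the closest match for this toy query
```

In a real system the embeddings would come from a face recognition network, and retrieval quality would typically be reported as rank-k accuracy over the gallery.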

Why Relational Graphs Will Save the Next Generation of Vision Foundation Models?

Fatemeh Ziaeetabar

•preprint•Aug 25 2025

Vision foundation models (FMs) have become the predominant architecture in computer vision, providing highly transferable representations learned from large-scale, multimodal corpora. Nonetheless, they exhibit persistent limitations on tasks that require explicit reasoning over entities, roles, and spatio-temporal relations. Such relational competence is indispensable for fine-grained human activity recognition, egocentric video understanding, and multimodal medical image analysis, where spatial, temporal, and semantic dependencies are decisive for performance. We advance the position that next-generation FMs should incorporate explicit relational interfaces, instantiated as dynamic relational graphs (graphs whose topology and edge semantics are inferred from the input and task context). We illustrate this position with cross-domain evidence from recent systems in human manipulation action recognition and brain tumor segmentation, showing that augmenting FMs with lightweight, context-adaptive graph-reasoning modules improves fine-grained semantic fidelity, out-of-distribution robustness, interpretability, and computational efficiency relative to FM-only baselines. Importantly, by reasoning sparsely over semantic nodes, such hybrids also achieve favorable memory and hardware efficiency, enabling deployment under practical resource constraints. We conclude with a targeted research agenda for FM-graph hybrids, prioritizing learned dynamic graph construction, multi-level relational reasoning (e.g., part-object-scene in activity understanding, or region-organ in medical imaging), cross-modal fusion, and evaluation protocols that directly probe relational competence in structured vision tasks.

MRI Segmentation Neurological Review Concept GenAI

UniSino: Physics-Driven Foundational Model for Universal CT Sinogram Standardization

Xingyu Ai, Shaoyu Wang, Zhiyuan Jia, Ao Xu, Hongming Shan, Jianhua Ma, Qiegen Liu

•preprint•Aug 25 2025

During raw-data acquisition in CT imaging, diverse factors can degrade the collected sinograms, with undersampling and noise leading to severe artifacts and noise in reconstructed images and compromising diagnostic accuracy. Conventional correction methods rely on manually designed algorithms or fixed empirical parameters, but these approaches often lack generalizability across heterogeneous artifact types. To address these limitations, we propose UniSino, a foundation model for universal CT sinogram standardization. Unlike existing foundation models that operate in the image domain, UniSino directly standardizes data in the projection domain, which enables stronger generalization across diverse undersampling scenarios. Its training framework incorporates the physical characteristics of sinograms, enhancing generalization and enabling robust performance across multiple subtasks spanning four benchmark datasets. Experimental results demonstrate that UniSino achieves superior reconstruction quality in both single and mixed undersampling cases, demonstrating exceptional robustness and generalization in sinogram enhancement for CT imaging. The code is available at: https://github.com/yqx7150/UniSino.

CT Reconstruction Methodology In Silico Academic Lab Open Code Benchmark SOTA
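
To make "single and mixed undersampling" in the projection domain concrete, the sketch below simulates two common degradations on a sinogram stored as an array of shape (angles, detector bins). This is an illustration only: the abstract does not list which undersampling patterns UniSino handles, so sparse-view and limited-angle sampling are assumptions, and `sparse_view` and `limited_angle` are hypothetical helper names, not functions from the UniSino repository.

```python
import numpy as np

def sparse_view(sinogram, keep_every):
    """Keep every `keep_every`-th projection angle and zero the rest."""
    mask = np.zeros(sinogram.shape[0], dtype=bool)
    mask[::keep_every] = True
    return sinogram * mask[:, None], mask

def limited_angle(sinogram, n_keep):
    """Keep only the first `n_keep` projection angles and zero the rest."""
    mask = np.zeros(sinogram.shape[0], dtype=bool)
    mask[:n_keep] = True
    return sinogram * mask[:, None], mask

# Toy sinogram: 8 projection angles, 4 detector bins, all ones.
sino = np.ones((8, 4))

# Single degradation: sparse-view sampling (angles 0 and 4 survive).
sparse, sparse_mask = sparse_view(sino, keep_every=4)

# Mixed degradation: sparse-view followed by limited-angle (only angle 0 survives).
mixed, _ = limited_angle(sparse, n_keep=4)

print(sparse_mask.sum(), sparse.sum(), mixed.sum())
```

A standardization model working in the projection domain would take arrays like `sparse` or `mixed` as input and aim to restore the fully sampled sinogram before reconstruction.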

Improving Interpretability in Alzheimer's Prediction via Joint Learning of ADAS-Cog Scores

Nur Amirah Abd Hamid, Mohd Shahrizal Rusli, Muhammad Thaqif Iman Mohd Taufek, Mohd Ibrahim Shapiai, Daphne Teck Ching Lai

•preprint•Aug 25 2025

Accurate prediction of clinical scores is critical for early detection and prognosis of Alzheimer's disease (AD). While existing approaches primarily focus on forecasting the ADAS-Cog global score, they often overlook the predictive value of its sub-scores (13 items), which capture domain-specific cognitive decline. In this study, we propose a multi-task learning (MTL) framework that jointly predicts the global ADAS-Cog score and its sub-scores (13 items) at Month 24 using baseline MRI and longitudinal clinical scores from baseline and Month 6. The main goal is to examine how each sub-score, particularly those associated with MRI features, contributes to the prediction of the global score, an aspect largely neglected in prior MTL studies. We employ Vision Transformer (ViT) and Swin Transformer architectures to extract imaging features, which are fused with longitudinal clinical inputs to model cognitive progression. Our results show that incorporating sub-score learning improves global score prediction. Sub-score-level analysis reveals that a small subset, especially Q1 (Word Recall), Q4 (Delayed Recall), and Q8 (Word Recognition), consistently dominates the predicted global score. However, some of these influential sub-scores exhibit high prediction errors, pointing to model instability. Further analysis suggests that this is caused by clinical feature dominance, where the model prioritizes easily predictable clinical scores over more complex MRI-derived features. These findings emphasize the need for improved multimodal fusion and adaptive loss weighting to achieve more balanced learning. Our study demonstrates the value of sub-score-informed modeling and provides insights into building more interpretable and clinically robust AD prediction frameworks. (Github repo provided)

MRI Registration Neurological Methodology In Silico Academic Lab Open Code

Diffusion-Based Data Augmentation for Medical Image Segmentation

Maham Nazir, Muhammad Aqeel, Francesco Setti

•preprint•Aug 25 2025

Medical image segmentation models struggle with rare abnormalities due to scarce annotated pathological data. We propose DiffAug, a novel framework that combines text-guided diffusion-based generation with automatic segmentation validation to address this challenge. Our proposed approach uses latent diffusion models conditioned on medical text descriptions and spatial masks to synthesize abnormalities via inpainting on normal images. Generated samples undergo dynamic quality validation through a latent-space segmentation network that ensures accurate localization while enabling single-step inference. The text prompts, derived from medical literature, guide the generation of diverse abnormality types without requiring manual annotation. Our validation mechanism filters synthetic samples based on spatial accuracy, maintaining quality while operating efficiently through direct latent estimation. Evaluated on three medical imaging benchmarks (CVC-ClinicDB, Kvasir-SEG, REFUGE2), our framework achieves state-of-the-art performance with 8-10% Dice improvements over baselines and reduces false negative rates by up to 28% for challenging cases like small polyps and flat lesions, which are critical for early detection in screening applications.

Mixed Modality Segmentation Methodology In Silico Benchmark SOTA

ControlEchoSynth: Boosting Ejection Fraction Estimation Models via Controlled Video Diffusion

Nima Kondori, Hanwen Liang, Hooman Vaseli, Bingyu Xie, Christina Luong, Purang Abolmaesumi, Teresa Tsang, Renjie Liao

•preprint•Aug 25 2025

Synthetic data generation represents a significant advancement in boosting the performance of machine learning (ML) models, particularly in fields where data acquisition is challenging, such as echocardiography. The acquisition and labeling of echocardiograms (echo) for heart assessment, crucial in point-of-care ultrasound (POCUS) settings, often encounter limitations due to the restricted number of echo views available, typically captured by operators with varying levels of experience. This study proposes a novel approach for enhancing clinical diagnosis accuracy by synthetically generating echo views. These views are conditioned on existing, real views of the heart, focusing specifically on the estimation of ejection fraction (EF), a critical parameter traditionally measured from biplane apical views. By integrating a conditional generative model, we demonstrate an improvement in EF estimation accuracy, providing a comparative analysis with traditional methods. Preliminary results indicate that our synthetic echoes, when used to augment existing datasets, not only enhance EF estimation but also show potential in advancing the development of more robust, accurate, and clinically relevant ML models. This approach is anticipated to catalyze further research in synthetic data applications, paving the way for innovative solutions in medical imaging diagnostics.

Ultrasound Image Synthesis Cardiac Methodology In Silico GenAI
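
For readers unfamiliar with the target quantity in the ControlEchoSynth entry, ejection fraction is the percentage of the end-diastolic volume (EDV) ejected per beat: EF = 100 × (EDV − ESV) / EDV, where ESV is the end-systolic volume. The sketch below only evaluates this standard definition; it does not reproduce the paper's estimation model, and the volumes used are illustrative values, not data from the study.

```python
def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent from end-diastolic and end-systolic volumes (mL)."""
    if edv_ml <= 0:
        raise ValueError("EDV must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("ESV must lie between 0 and EDV")
    return 100.0 * (edv_ml - esv_ml) / edv_ml

# Illustrative volumes only: 120 mL at end-diastole, 50 mL at end-systole.
ef = ejection_fraction(120.0, 50.0)
print(round(ef, 1))  # 58.3
```

In clinical practice the two volumes are typically derived from traced apical views, which is why the availability and quality of those views matters for EF estimation.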

Benchmarking Class Activation Map Methods for Explainable Brain Hemorrhage Classification on Hemorica Dataset

Z. Rafati, M. Hoseyni, J. Khoramdel, A. Nikoofard

•preprint•Aug 25 2025

Explainable Artificial Intelligence (XAI) has become an essential component of medical imaging research, aiming to increase transparency and clinical trust in deep learning models. This study investigates brain hemorrhage diagnosis with a focus on explainability through Class Activation Mapping (CAM) techniques. A pipeline was developed to extract pixel-level segmentation and detection annotations from classification models using nine state-of-the-art CAM algorithms, applied across multiple network stages, and quantitatively evaluated on the Hemorica dataset, which uniquely provides both slice-level labels and high-quality segmentation masks. Metrics including Dice, IoU, and pixel-wise overlap were employed to benchmark CAM variants. Results show that the strongest localization performance occurred at stage 5 of EfficientNetV2S, with HiResCAM yielding the highest bounding-box alignment and AblationCAM achieving the best pixel-level Dice (0.57) and IoU (0.40), representing strong accuracy given that models were trained solely for classification without segmentation supervision. To the best of current knowledge, this is among the first works to quantitatively compare CAM methods for brain hemorrhage detection, establishing a reproducible benchmark and underscoring the potential of XAI-driven pipelines for clinically meaningful AI-assisted diagnosis.

CT Classification Neurological Methodology In Silico Academic Lab Benchmark SOTA Reproducibility
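
The Dice and IoU metrics used in the CAM benchmark above can be computed from a thresholded activation map and a ground-truth mask as in the following sketch. The 0.5 threshold, the function name `dice_iou`, and the toy arrays are assumptions for illustration; the abstract does not state how CAM heatmaps were binarized.

```python
import numpy as np

def dice_iou(pred, target):
    """Dice and IoU between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    union = total - inter
    # Convention: two empty masks count as a perfect match.
    dice = 2.0 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return float(dice), float(iou)

# Toy CAM heatmap in [0, 1] and a toy ground-truth mask (illustrative only).
cam = np.array([[0.9, 0.8, 0.1],
                [0.7, 0.2, 0.1],
                [0.1, 0.1, 0.0]])
mask = np.array([[1, 1, 1],
                 [0, 0, 0],
                 [0, 0, 0]])

dice, iou = dice_iou(cam >= 0.5, mask)
print(round(dice, 3), round(iou, 3))  # 0.667 0.5
```

For a single pair of masks the two metrics are tied by IoU = Dice / (2 − Dice). Plugging in the reported Dice of 0.57 gives about 0.40, which matches the reported IoU, although averages over many images need not satisfy the identity exactly.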

A Weighted Vision Transformer-Based Multi-Task Learning Framework for Predicting ADAS-Cog Scores

Nur Amirah Abd Hamid, Mohd Ibrahim Shapiai, Daphne Teck Ching Lai

•preprint•Aug 25 2025

Prognostic modeling is essential for forecasting future clinical scores and enabling early detection of Alzheimer's disease (AD). While most existing methods focus on predicting the ADAS-Cog global score, they often overlook the predictive value of its 13 sub-scores, which reflect distinct cognitive domains. Some sub-scores may exert greater influence on determining global scores. Assigning higher loss weights to these clinically meaningful sub-scores can guide the model to focus on more relevant cognitive domains, enhancing both predictive accuracy and interpretability. In this study, we propose a weighted Vision Transformer (ViT)-based multi-task learning (MTL) framework to jointly predict the ADAS-Cog global score and its 13 sub-scores at Month 24 using baseline MRI scans. Our framework integrates ViT as a feature extractor and systematically investigates the impact of sub-score-specific loss weighting on model performance. Results show that the effect of our proposed weighting strategies is group-dependent: strong weighting improves performance for MCI subjects with more heterogeneous MRI patterns, while moderate weighting is more effective for CN subjects with lower variability. Our findings suggest that uniform weighting underutilizes key sub-scores and limits generalization. The proposed framework offers a flexible, interpretable approach to AD prognosis using end-to-end MRI-based learning. (Github repo link will be provided after review)

MRI Registration Neurological Methodology In Silico GenAI Open Code
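
The idea of sub-score-specific loss weighting in the entry above can be sketched as a weighted sum of per-task mean squared errors. This is a generic multi-task formulation, not the authors' exact loss: the choice of MSE, the function name `weighted_mtl_loss`, the weight values, and which items are up-weighted are all assumptions made for illustration.

```python
import numpy as np

def weighted_mtl_loss(pred_global, true_global, pred_sub, true_sub,
                      sub_weights, global_weight=1.0):
    """Weighted multi-task loss: global-score MSE plus weighted per-item MSEs.

    pred_global, true_global: arrays of shape (batch,)
    pred_sub, true_sub:       arrays of shape (batch, 13)
    sub_weights:              array of shape (13,), one weight per sub-score
    """
    global_mse = np.mean((pred_global - true_global) ** 2)
    sub_mse = np.mean((pred_sub - true_sub) ** 2, axis=0)   # one MSE per item
    return float(global_weight * global_mse + np.sum(sub_weights * sub_mse))

# Toy batch of two subjects (illustrative values only).
pred_global = np.array([10.0, 20.0])
true_global = np.array([12.0, 20.0])
pred_sub = np.zeros((2, 13))
true_sub = np.ones((2, 13))

# Hypothetical weighting: three arbitrarily chosen items count double.
sub_weights = np.ones(13)
sub_weights[[0, 3, 7]] = 2.0

loss = weighted_mtl_loss(pred_global, true_global, pred_sub, true_sub, sub_weights)
print(loss)  # 18.0 = global MSE of 2.0 + weighted sub-score sum of 16.0
```

Setting all entries of `sub_weights` to the same value recovers the uniform weighting that the abstract argues underutilizes key sub-scores.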

Validation of automated computed tomography segmentation software to assess body composition among cancer patients.

Salehin M, Yang Chow VT, Lee H, Weltzien EK, Nguyen L, Li JM, Akella V, Caan BJ, Cespedes Feliciano EM, Ma D, Beg MF, Popuri K

•papers•Aug 25 2025

Assessing body composition using computed tomography (CT) can help predict the clinical outcomes of cancer patients, including surgical complications, chemotherapy toxicity, and survival. However, manual segmentation of CT images is labor-intensive and can lead to significant inter-observer variability. In this study, we validate the accuracy and reliability of automatic CT-based segmentation using the Data Analysis Facilitation Suite (DAFS) Express software package, which rapidly segments single CT slices. The study analyzed single-slice images at the third lumbar vertebra (L3) level (n = 5973) of patients diagnosed with non-metastatic colorectal (n = 3098) and breast cancer (n = 2875) at Kaiser Permanente Northern California. Manual segmentation used SliceOmatic with Alberta protocol HU ranges; automated segmentation used DAFS Express with identical HU limits. The accuracy of the automated segmentation was evaluated using the DICE index, the reliability was assessed by intra-class correlation coefficients (ICC) with 95% CI, and the agreement between automatic and manual segmentations was assessed by Bland-Altman analysis. DICE scores below 20% and 70% were considered failed and poor segmentations, respectively, and underwent additional review. The mortality risk associated with each tissue's area was generated using Cox proportional hazard ratios (HR) with 95% CI, adjusted for patient-specific variables including age, sex, race/ethnicity, cancer stage and grade, treatment receipt, and smoking status. A blinded review process categorized images with various characteristics for sensitivity analysis. The mean (standard deviation, SD) ages of the colorectal and breast cancer patients were 62.6 (11.4) and 56 (11.8), respectively. Automatic segmentation showed high accuracy vs. manual segmentation, with mean DICE scores above 96% for skeletal muscle (SKM), visceral adipose tissue (VAT), and subcutaneous adipose tissue (SAT), and above 77% for intermuscular adipose tissue (IMAT), with three failures, representing 0.05% of the cohort. Bland-Altman analysis of 5,973 measurements showed mean cross-sectional area differences of -5.73, -0.84, -2.82, and -1.02 cm² for SKM, VAT, SAT and IMAT, respectively, indicating good agreement, with slight underestimation in SKM and SAT. Reliability coefficients ranged from 0.88-1.00 for colorectal and 0.95-1.00 for breast cancer, with Simple Kappa values of 0.65-0.99 and 0.67-0.97, respectively. Additionally, mortality associations for automated and manual segmentations were similar, with comparable hazard ratios, confidence intervals, and p-values. Kaplan-Meier survival estimates showed mortality differences below 2.14%. DAFS Express enables rapid, accurate body composition analysis by automating segmentation, reducing expert time and computational burden. This rapid analysis of body composition is a prerequisite to large-scale research that could potentially enable use in the clinical setting. Automated CT segmentations may be utilized to assess markers of sarcopenia, muscle loss, and adiposity and predict clinical outcomes.

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA
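
The Bland-Altman agreement analysis mentioned in the entry above reduces to the mean of the paired differences (the bias) and limits of agreement at bias ± 1.96 standard deviations. The sketch below assumes differences are taken as automated minus manual, which is consistent with the abstract reading negative means as underestimation but is not stated explicitly; the function name `bland_altman` and the area values are made up for illustration.

```python
import numpy as np

def bland_altman(auto, manual):
    """Bias and 95% limits of agreement for paired measurements.

    Differences are computed as auto - manual (an assumed sign convention).
    Returns (bias, lower_limit, upper_limit).
    """
    diff = np.asarray(auto, dtype=float) - np.asarray(manual, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)               # sample standard deviation
    return float(bias), float(bias - 1.96 * sd), float(bias + 1.96 * sd)

# Toy cross-sectional areas in cm² (illustrative values, not study data).
auto_area = [100.0, 150.0, 120.0, 130.0]
manual_area = [104.0, 152.0, 126.0, 134.0]

bias, lower, upper = bland_altman(auto_area, manual_area)
print(round(bias, 2), round(lower, 2), round(upper, 2))  # -4.0 -7.2 -0.8
```

A negative bias under this convention means the automated tool reports smaller areas than the manual reference on average.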

Efficient 3D Biomedical Image Segmentation by Parallelly Multiscale Transformer-CNN Aggregation Network.

Liu W, He Y, Man T, Zhu F, Chen Q, Huang Y, Feng X, Li B, Wan Y, He J, Deng S

•papers•Aug 25 2025

Accurate and automated segmentation of 3D biomedical images is essential in clinical diagnosis, imaging-guided surgery, and prognosis judgment. Although the rise of deep learning technologies has fostered smart segmentation models, capturing global and local features both successively and simultaneously, which is essential for exact and efficient image analysis, remains challenging. To this end, a segmentation solution dubbed the mixed parallel shunted transformer (MPSTrans) is developed here, highlighting 3D-MPST blocks in a U-form framework. It enables not only comprehensive feature capture and multiscale slice synchronization but also deep supervision in the decoder to facilitate the extraction of hierarchical representations. On an unpublished colon cancer data set, this model achieved an impressive increase in Dice similarity coefficient (DSC) and a 1.718 mm decrease in Hausdorff distance at 95% (HD95), alongside a substantial 56.7% reduction in computational load measured in giga floating-point operations (GFLOPs). Meanwhile, MPSTrans outperforms other mainstream methods (Swin UNETR, UNETR, nnU-Net, PHTrans, and 3D U-Net) on three public multiorgan (aorta, gallbladder, kidney, liver, pancreas, spleen, stomach, etc.) and multimodal (CT, PET-CT, and MRI) data sets, namely Medical Segmentation Decathlon (MSD) brain tumor, Multi-Atlas Labeling Beyond the Cranial Vault (BCV), and Automated Cardiac Diagnosis Challenge (ACDC), accentuating its adaptability. These results reflect the potential of MPSTrans to advance the state of the art in biomedical imaging analysis and to offer a robust tool for enhanced diagnostic capacity.

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA
