Latest Papers on Radiology AI. Category: preprint, Order: Best Match, Limit: 10.

GRASPing Anatomy to Improve Pathology Segmentation

Keyi Li, Alexander Jaus, Jens Kleesiek, Rainer Stiefelhagen

•preprint•Aug 5 2025

Radiologists rely on anatomical understanding to accurately delineate pathologies, yet most current deep learning approaches use pure pattern recognition and ignore the anatomical context in which pathologies develop. To narrow this gap, we introduce GRASP (Guided Representation Alignment for the Segmentation of Pathologies), a modular plug-and-play framework that enhances pathology segmentation models by leveraging existing anatomy segmentation models through pseudolabel integration and feature alignment. Unlike previous approaches that obtain anatomical knowledge via auxiliary training, GRASP integrates into standard pathology optimization regimes without retraining anatomical components. We evaluate GRASP on two PET/CT datasets, conduct systematic ablation studies, and investigate the framework's inner workings. We find that GRASP consistently achieves top rankings across multiple evaluation metrics and diverse architectures. The framework's dual anatomy injection strategy, combining anatomical pseudo-labels as input channels with transformer-guided anatomical feature fusion, effectively incorporates anatomical context.

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA

A Survey of Medical Point Cloud Shape Learning: Registration, Reconstruction and Variation

Tongxu Zhang, Zhiming Liang, Bei Wang

•preprint•Aug 5 2025

Point clouds have become an increasingly important representation for 3D medical imaging, offering a compact, surface-preserving alternative to traditional voxel or mesh-based approaches. Recent advances in deep learning have enabled rapid progress in extracting, modeling, and analyzing anatomical shapes directly from point cloud data. This paper provides a comprehensive and systematic survey of learning-based shape analysis for medical point clouds, focusing on three fundamental tasks: registration, reconstruction, and variation modeling. We review recent literature from 2021 to 2025, summarize representative methods, datasets, and evaluation metrics, and highlight clinical applications and unique challenges in the medical domain. Key trends include the integration of hybrid representations, large-scale self-supervised models, and generative techniques. We also discuss current limitations, such as data scarcity, inter-patient variability, and the need for interpretable and robust solutions for clinical deployment. Finally, future directions are outlined for advancing point cloud-based shape learning in medical imaging.

Mixed Modality Registration Review Concept Academic Lab Benchmark SOTA

ClinicalFMamba: Advancing Clinical Assessment using Mamba-based Multimodal Neuroimaging Fusion

Meng Zhou, Farzad Khalvati

•preprint•Aug 5 2025

Multimodal medical image fusion integrates complementary information from different imaging modalities to enhance diagnostic accuracy and treatment planning. While deep learning methods have advanced performance, existing approaches face critical limitations: Convolutional Neural Networks (CNNs) excel at local feature extraction but struggle to model global context effectively, while Transformers achieve superior long-range modeling at the cost of quadratic computational complexity, limiting clinical deployment. Recent State Space Models (SSMs) offer a promising alternative, enabling efficient long-range dependency modeling in linear time through selective scan mechanisms. Despite these advances, the extension to 3D volumetric data and the clinical validation of fused images remains underexplored. In this work, we propose ClinicalFMamba, a novel end-to-end CNN-Mamba hybrid architecture that synergistically combines local and global feature modeling for 2D and 3D images. We further design a tri-plane scanning strategy for effectively learning volumetric dependencies in 3D images. Comprehensive evaluations on three datasets demonstrate the superior fusion performance across multiple quantitative metrics while achieving real-time fusion. We further validate the clinical utility of our approach on downstream 2D/3D brain tumor classification tasks, achieving superior performance over baseline methods. Our method establishes a new paradigm for efficient multimodal medical image fusion suitable for real-time clinical deployment.

Mixed Modality Image Synthesis Neurological Methodology In Silico Benchmark SOTA

GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images

Yifei Sun, Zhanghao Chen, Hao Zheng, Yuqing Lu, Lixin Duan, Fenglei Fan, Ahmed Elazab, Xiang Wan, Changmiao Wang, Ruiquan Ge

•preprint•Aug 5 2025

Chest X-Ray (CXR) imaging for pulmonary diagnosis raises significant challenges, primarily because bone structures can obscure critical details necessary for accurate diagnosis. Recent advances in deep learning, particularly with diffusion models, offer significant promise for effectively minimizing the visibility of bone structures in CXR images, thereby improving clarity and diagnostic accuracy. Nevertheless, existing diffusion-based methods for bone suppression in CXR imaging struggle to balance the complete suppression of bones with preserving local texture details. Additionally, their high computational demand and extended processing time hinder their practical use in clinical settings. To address these limitations, we introduce a Global-Local Latent Consistency Model (GL-LCM) architecture. This model combines lung segmentation, dual-path sampling, and global-local fusion, enabling fast high-resolution bone suppression in CXR images. To tackle potential boundary artifacts and detail blurring in local-path sampling, we further propose Local-Enhanced Guidance, which addresses these issues without additional training. Comprehensive experiments on a self-collected dataset SZCH-X-Rays, and the public dataset JSRT, reveal that our GL-LCM delivers superior bone suppression and remarkable computational efficiency, significantly outperforming several competitive methods. Our code is available at https://github.com/diaoquesang/GL-LCM.

X-Ray Image Synthesis Chest Methodology In Silico Academic Lab Open Code

Augmenting Continual Learning of Diseases with LLM-Generated Visual Concepts

Jiantao Tan, Peixian Ma, Kanghao Chen, Zhiming Dai, Ruixuan Wang

•preprint•Aug 5 2025

Continual learning is essential for medical image classification systems to adapt to dynamically evolving clinical environments. The integration of multimodal information can significantly enhance continual learning of image classes. However, while existing approaches do utilize textual modality information, they solely rely on simplistic templates with a class name, thereby neglecting richer semantic information. To address these limitations, we propose a novel framework that harnesses visual concepts generated by large language models (LLMs) as discriminative semantic guidance. Our method dynamically constructs a visual concept pool with a similarity-based filtering mechanism to prevent redundancy. Then, to integrate the concepts into the continual learning process, we employ a cross-modal image-concept attention module, coupled with an attention loss. Through attention, the module can leverage the semantic knowledge from relevant visual concepts and produce class-representative fused features for classification. Experiments on medical and natural image datasets show our method achieves state-of-the-art performance, demonstrating the effectiveness and superiority of our method. We will release the code publicly.
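
The abstract describes a similarity-based filter that keeps the visual concept pool free of redundancy, but gives no details. The sketch below is one plausible reading, not the authors' method: it assumes concepts arrive as (name, embedding) pairs and that a candidate is dropped when its cosine similarity to any pooled concept reaches a threshold. The function names, the 0.9 threshold, and the toy 2-D embeddings are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def add_to_pool(pool, candidates, threshold=0.9):
    """Greedy filter (assumed rule): keep a candidate only if it is
    less similar than `threshold` to every concept already in the pool."""
    for name, emb in candidates:
        if all(cosine(emb, kept) < threshold for _, kept in pool):
            pool.append((name, emb))
    return pool

# Toy, made-up concepts and embeddings for illustration only.
pool = add_to_pool([], [
    ("irregular border", [1.0, 0.0]),
    ("uneven border", [0.99, 0.05]),   # near-duplicate of the first concept
    ("blue-white veil", [0.0, 1.0]),
])
names = [name for name, _ in pool]
print(names)
```

In this toy run the near-duplicate concept is rejected, so only two concepts remain in the pool.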

Mixed Modality Classification Methodology In Silico Academic Lab Open Code GenAI

A Novel Multimodal Framework for Early Detection of Alzheimers Disease Using Deep Learning

Tatwadarshi P Nagarhalli, Sanket Patil, Vishal Pande, Uday Aswalekar, Prafulla Patil

•preprint•Aug 5 2025

Alzheimer's Disease (AD) is a progressive neurodegenerative disorder that poses significant challenges in its early diagnosis, often leading to delayed treatment and poorer outcomes for patients. Traditional diagnostic methods, typically reliant on single data modalities, fall short of capturing the multifaceted nature of the disease. In this paper, we propose a novel multimodal framework for the early detection of AD that integrates data from three primary sources: MRI imaging, cognitive assessments, and biomarkers. This framework employs Convolutional Neural Networks (CNNs) for analyzing MRI images and Long Short-Term Memory (LSTM) networks for processing cognitive and biomarker data. The system aggregates results from these distinct modalities using techniques such as weighted averaging, which enhances diagnostic accuracy and reliability even when data are incomplete. The multimodal approach not only improves the robustness of the detection process but also enables the identification of AD at its earliest stages, offering a significant advantage over conventional methods. The integration of biomarkers and cognitive tests is particularly crucial, as these can detect Alzheimer's long before the onset of clinical symptoms, thereby facilitating earlier intervention and potentially altering the course of the disease. This research demonstrates that the proposed framework has the potential to revolutionize the early detection of AD, paving the way for more timely and effective treatments.
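
The abstract mentions aggregating per-modality results by weighted averaging, including when some data are missing. A minimal sketch of that idea follows, assuming each modality yields a single probability and that weights are renormalised over whichever modalities are present. The function name, the weights, and the example probabilities are hypothetical and not taken from the paper.

```python
def fuse(probs, weights):
    """Weighted average over available modalities.

    `probs` maps modality name -> probability, with None marking a
    missing modality; weights are renormalised over what is present.
    """
    available = {m: p for m, p in probs.items() if p is not None}
    if not available:
        raise ValueError("no modality available")
    total = sum(weights[m] for m in available)
    return sum(weights[m] * p for m, p in available.items()) / total

# Hypothetical weights and model outputs for illustration only.
weights = {"mri": 0.5, "cognitive": 0.3, "biomarker": 0.2}
full = fuse({"mri": 0.8, "cognitive": 0.6, "biomarker": 0.4}, weights)
partial = fuse({"mri": 0.8, "cognitive": None, "biomarker": 0.4}, weights)
print(round(full, 4), round(partial, 4))
```

Renormalising keeps the fused score on the same 0 to 1 scale regardless of how many modalities contributed, which is one simple way to tolerate incomplete records.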

MRI Detection Neurological Methodology In Silico

Policy to Assist Iteratively Local Segmentation: Optimising Modality and Location Selection for Prostate Cancer Localisation

Xiangcen Wu, Shaheer U. Saeed, Yipei Wang, Ester Bonmati Coll, Yipeng Hu

•preprint•Aug 5 2025

Radiologists often mix medical image reading strategies, including inspection of individual modalities and local image regions, using information at different locations from different images independently as well as concurrently. In this paper, we propose a recommender system to assist machine learning-based segmentation models, by suggesting appropriate image portions along with the best modality, such that prostate cancer segmentation performance can be maximised. Our approach trains a policy network that assists tumour localisation, by recommending both the optimal imaging modality and the specific sections of interest for review. During training, a pre-trained segmentation network mimics radiologist inspection on individual or variable combinations of these imaging modalities and their sections, as selected by the policy network. Taking the locally segmented regions as an input for the next step, this dynamic decision-making process iterates until all cancers are best localised. We validate our method using a data set of 1325 labelled multiparametric MRI images from prostate cancer patients, demonstrating its potential to improve annotation efficiency and segmentation accuracy, especially when challenging pathology is present. Experimental results show that our approach can surpass standard segmentation networks. Perhaps more interestingly, our trained agent independently developed its own optimal strategy, which may or may not be consistent with current radiologist guidelines such as PI-RADS. This observation also suggests a promising interactive application, in which the proposed policy networks assist human radiologists.

MRI Segmentation Abdominal Methodology In Silico

Point-Based Shape Representation Generation with a Correspondence-Preserving Diffusion Model

Shen Zhu, Yinzhu Jin, Ifrah Zawar, P. Thomas Fletcher

•preprint•Aug 5 2025

We propose a diffusion model designed to generate point-based shape representations with correspondences. Traditional statistical shape models have considered point correspondences extensively, but current deep learning methods do not take them into account, focusing on unordered point clouds instead. Current deep generative models for point clouds do not address generating shapes with point correspondences between generated shapes. This work aims to formulate a diffusion model that is capable of generating realistic point-based shape representations, which preserve point correspondences that are present in the training data. Using shape representation data with correspondences derived from Open Access Series of Imaging Studies 3 (OASIS-3), we demonstrate that our correspondence-preserving model effectively generates point-based hippocampal shape representations that are highly realistic compared to existing methods. We further demonstrate the applications of our generative model by downstream tasks, such as conditional generation of healthy and AD subjects and predicting morphological changes of disease progression by counterfactual generation.

MRI Image Synthesis Neurological Methodology In Silico Academic Lab GenAI

Modeling differences in neurodevelopmental maturity of the reading network using support vector regression on functional connectivity data

Lasnick, O. H. M., Luo, J., Kinnie, B., Kamal, S., Low, S., Marrouch, N., Hoeft, F.

•preprint•Aug 5 2025

The construction of growth charts trained to predict age or developmental deviation (the brain-age index) based on structural/functional properties of the brain may be informative of children's neurodevelopmental trajectories. When applied to both typically and atypically developing populations, results may indicate that a particular condition is associated with atypical maturation of certain brain networks. Here, we focus on the relationship between reading disorder (RD) and maturation of functional connectivity (FC) patterns in the prototypical reading/language network using a cross-sectional sample of N = 742 participants aged 6-21 years. A support vector regression model is trained to predict chronological age from FC data derived from a whole-brain model as well as multiple reduced models, which are trained on FC data generated from a successively smaller number of regions in the brain's reading network. We hypothesized that the trained models would show systematic underestimation of brain network maturity for poor readers, particularly for the models trained with reading/language regions. Comparisons of the different models' predictions revealed that while the whole-brain model outperforms the others in terms of overall prediction accuracy, all models successfully predicted brain maturity, including the one trained with the smallest amount of FC data. In addition, all models showed that reading ability affected the brain-age gap, with poor readers' ages being underestimated and advanced readers' ages being overestimated. Exploratory results demonstrated that the most important regions and connections for prediction were derived from the default mode and frontoparietal control networks.
Glossary
Developmental dyslexia / reading disorder (RD): A specific learning disorder affecting reading ability in the absence of any other explanatory condition such as intellectual disability or visual impairment.
Support vector regression (SVR): A supervised machine learning technique which predicts continuous outcomes (such as chronological age) rather than classifying each observation; finds the best-fit function within a defined error margin.
Principal component analysis (PCA): A dimensionality reduction technique that transforms a high-dimensional dataset with many features per observation into a reduced set of principal components for each observation; each component is a linear combination of several original (correlated) features, and the final set of components are all orthogonal (uncorrelated) to one another.
Brain-age index: A numerical index quantifying deviation from the brain's typical developmental trajectory for a single individual; may be based on a variety of morphometric or functional properties of the brain, resulting in different estimates for the same participant depending on the imaging modality used.
Brain-age gap (BAG): The difference, given in units of time, between a participant's true chronological age and a predictive model's estimated age for that participant based on brain data (Actual - Predicted); may be used as a brain-age index.

Highlights
A machine learning model trained on functional data predicted participants' ages.
The model showed variability in age prediction accuracy based on reading skills.
The model highly weighted data from frontoparietal and default mode regions.
Neural markers of reading and language are diffusely represented in the brain.
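
To make the sign convention of the brain-age gap concrete, here is a minimal sketch using the glossary's definition (Actual - Predicted). Under that convention an underestimated age gives a positive gap. The function names and the ages below are made-up illustrations, not data from the study.

```python
def brain_age_gap(actual_age, predicted_age):
    """Brain-age gap in years, using the Actual - Predicted convention."""
    return actual_age - predicted_age

def mean_gap(pairs):
    """Mean brain-age gap over (actual, predicted) pairs."""
    gaps = [brain_age_gap(a, p) for a, p in pairs]
    return sum(gaps) / len(gaps)

# Hypothetical (actual, predicted) ages in years for two toy groups.
underestimated = [(10.0, 8.5), (12.0, 11.0)]   # predicted younger than actual
overestimated = [(10.0, 11.0), (12.0, 13.5)]   # predicted older than actual

print(mean_gap(underestimated))  # positive: ages were underestimated
print(mean_gap(overestimated))   # negative: ages were overestimated
```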

MRI Registration Neurological Retrospective Clinical In Silico Academic Lab

Delineating retinal breaks in ultra-widefield fundus images with a PraNet-based machine learning model

Takayama, T., Uto, T., Tsuge, T., Kondo, Y., Tampo, H., Chiba, M., Kaburaki, T., Yanagi, Y., Takahashi, H.

•preprint•Aug 5 2025

Background: Retinal breaks are critical lesions that can lead to retinal detachment and vision loss if not detected and treated early. Automated and precise delineation of retinal breaks using ultra-widefield fundus (UWF) images remains a significant challenge in ophthalmology.
Objective: This study aimed to develop and validate a deep learning model based on the PraNet architecture for the accurate delineation of retinal breaks in UWF images, with a particular focus on segmentation performance in retinal break-positive cases.
Methods: We developed a deep learning segmentation model based on the PraNet architecture. This study utilized a dataset consisting of 8,083 cases and a total of 34,867 UWF images. Of these, 960 images contained retinal breaks, while the remaining 33,907 images did not. The dataset was split into 34,713 images for training, 81 for validation, and 73 for testing. The model was trained and validated on this dataset. Model performance was evaluated using both image-wise segmentation metrics (accuracy, precision, recall, Intersection over Union (IoU), Dice score, centroid distance score) and lesion-wise detection metrics (sensitivity, positive predictive value).
Results: The PraNet-based model achieved an accuracy of 0.996, a precision of 0.635, a recall of 0.756, an IoU of 0.539, a Dice score of 0.652, and a centroid distance score of 0.081 for pixel-level detection of retinal breaks. The lesion-wise sensitivity was calculated as 0.885, and the positive predictive value (PPV) was 0.742.
Conclusions: To our knowledge, this is the first study to present pixel-level localization of retinal breaks using deep learning on UWF images. Our findings demonstrate that the PraNet-based model provides precise and robust pixel-level segmentation of retinal breaks in UWF images. This approach offers a clinically applicable tool for the precise delineation of retinal breaks, with the potential to improve patient outcomes. Future work should focus on external validation across multiple institutions and integration of additional annotation strategies to further enhance model performance and generalizability.
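
For readers less familiar with the pixel-level metrics reported above, the sketch below computes accuracy, precision, recall, IoU, and Dice from a pair of flattened binary masks using their standard definitions. It is a generic illustration, not the authors' evaluation code: the function name and the toy masks are hypothetical, and how the study aggregated scores across images is not stated in the abstract.

```python
def seg_metrics(pred, truth):
    """Standard binary segmentation metrics from flattened 0/1 masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero on empty masks.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
    }

# Toy 10-pixel masks: a small lesion on a mostly empty background.
pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
m = seg_metrics(pred, truth)
print(m)
```

In the toy example accuracy is higher than IoU because the correctly labelled background pixels count toward accuracy but not toward IoU, which is why accuracy alone says little about small lesions.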

OCT Segmentation Methodology In Silico Academic Lab
