Page 122 of 3543538 results

A novel lung cancer diagnosis model using hybrid convolution (2D/3D)-based adaptive DenseUnet with attention mechanism.

Deepa J, Badhu Sasikala L, Indumathy P, Jerrin Simla A

PubMed · Aug 5, 2025
Existing Lung Cancer Diagnosis (LCD) models have difficulty detecting early-stage lung cancer because the disease is asymptomatic at that stage, which leads to an increased death rate among patients. It is therefore important to diagnose lung disease at an early stage to save the lives of affected persons. Hence, this work aims to develop an efficient deep-learning-based diagnosis pipeline for the early and accurate detection of lung cancer. Initially, the proposed model collects the required CT images from standard benchmark datasets. Lung cancer segmentation is then performed with the developed Hybrid Convolution (2D/3D)-based Adaptive DenseUnet with Attention mechanism (HC-ADAM). Hybrid Sewing Training with Spider Monkey Optimization (HSTSMO) is introduced to optimize the parameters of the HC-ADAM segmentation approach. Finally, the segmented lung-nodule images are passed to the classification stage, where a Hybrid Adaptive Dilated Network with Attention mechanism (HADN-AM), a serial cascade of ResNet and Long Short-Term Memory (LSTM), is implemented to attain better categorization performance. On the LIDC-IDRI dataset, the developed model achieves an accuracy of 96.3%, a precision of 96.38%, and an F1-score of 96.36%.
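The reported accuracy, precision, and F1-score all derive from the same confusion-matrix counts. A minimal sketch of how such metrics relate (generic evaluation code, not the authors' implementation; the counts are illustrative):

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)          # of predicted positives, how many are right
    recall = tp / (tp + fn)             # of true positives, how many are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1
```

F1 is the harmonic mean of precision and recall, which is why the paper's three figures cluster so closely when the class balance is roughly even.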

Real-time 3D US-CT fusion-based semi-automatic puncture robot system: clinical evaluation.

Nakayama M, Zhang B, Kuromatsu R, Nakano M, Noda Y, Kawaguchi T, Li Q, Maekawa Y, Fujie MG, Sugano S

PubMed · Aug 5, 2025
Conventional systems supporting percutaneous radiofrequency ablation (PRFA) have had difficulty ensuring safe and accurate puncture because of issues inherent to the medical images used and organ displacement caused by patients' respiration. To address this problem, this study proposes a semi-automatic puncture robot system that integrates real-time ultrasound (US) images with computed tomography (CT) images. The purpose of this paper is to evaluate the system's usefulness through a pilot clinical experiment with participants. For the clinical experiment, an improved U-net model based on fivefold cross-validation was constructed. Following the workflow of the proposed system, the model was trained on US images acquired from patients with robotic arms. The average Dice coefficient over the entire validation dataset was 0.87, so the model was implemented in the robotic system and applied in the clinical experiment. The clinical experiment was conducted with the robotic system, equipped with the developed AI model, on five adult male and female participants. In the 3D US-CT fusion process, the centroid distances between the point clouds from each modality were evaluated, taking the blood-vessel centerline to represent the overall structural position. The centroid distances showed a minimum of 0.38 mm, a maximum of 4.81 mm, and an average of 1.97 mm. Although the five participants had different CP classifications and their US images exhibited individual variability, all centroid distances satisfied the 5.00 mm ablation margin considered in PRFA, suggesting the potential accuracy and utility of the robotic system for puncture navigation. The results also suggest the potential generalization performance of an AI model trained on data acquired according to the robotic system's workflow.
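The evaluation metric here, the distance between point-cloud centroids, is straightforward to compute. A minimal sketch (illustrative code with made-up point clouds, not the authors' implementation):

```python
import numpy as np

def centroid_distance(points_us, points_ct):
    """Euclidean distance between the centroids of two 3-D point clouds,
    e.g. vessel-centerline points extracted from US and from CT."""
    c_us = np.asarray(points_us, dtype=float).mean(axis=0)
    c_ct = np.asarray(points_ct, dtype=float).mean(axis=0)
    return float(np.linalg.norm(c_us - c_ct))
```

In the study this value is compared against the 5.00 mm ablation margin to judge whether the US-CT fusion is accurate enough for puncture navigation.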

Are Vision-xLSTM-embedded U-Nets better at segmenting medical images?

Dutta P, Bose S, Roy SK, Mitra S

PubMed · Aug 5, 2025
The development of efficient segmentation strategies for medical images has evolved from an initial dependence on Convolutional Neural Networks (CNNs) to the current investigation of hybrid models that combine CNNs with Vision Transformers (ViTs). There is an increasing focus on architectures that are both high-performing and computationally efficient, capable of being deployed on remote systems with limited resources. Although transformers can capture global dependencies in the input space, they face challenges from the high computational and storage costs involved. This research proposes that Vision Extended Long Short-Term Memory (Vision-xLSTM) forms an appropriate backbone for medical image segmentation, offering excellent performance with reduced computational cost. The study investigates the integration of CNNs with Vision-xLSTM by introducing the novel U-VixLSTM. The Vision-xLSTM blocks capture the temporal and global relationships within the patches extracted from the CNN feature maps. A convolutional feature-reconstruction path upsamples the output volume from the Vision-xLSTM blocks to produce the segmentation output. U-VixLSTM outperforms state-of-the-art networks on the publicly available Synapse, ISIC, and ACDC datasets. The findings suggest that U-VixLSTM is a promising alternative to ViTs for medical image segmentation, delivering effective performance without substantial computational burden, which makes it feasible to deploy in healthcare environments with limited resources for faster diagnosis. Code is available at: https://github.com/duttapallabi2907/U-VixLSTM.
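The "patches extracted from the CNN feature maps" step amounts to flattening a feature map into a sequence of tokens that a sequence model such as Vision-xLSTM can consume. A minimal sketch of that tokenization (illustrative NumPy code under assumed shapes, not the authors' implementation):

```python
import numpy as np

def feature_map_to_patches(fmap, patch=2):
    """Flatten a (C, H, W) CNN feature map into a sequence of
    non-overlapping patch tokens of size C * patch * patch each."""
    c, h, w = fmap.shape
    tokens = (fmap.reshape(c, h // patch, patch, w // patch, patch)
                  .transpose(1, 3, 0, 2, 4)          # (hb, wb, c, ph, pw)
                  .reshape((h // patch) * (w // patch), c * patch * patch))
    return tokens
```

Each row of the result is one spatial patch with all channels concatenated; the sequence model then mixes information across patches, and a decoder path upsamples back to the image grid.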

ERDES: A Benchmark Video Dataset for Retinal Detachment and Macular Status Classification in Ocular Ultrasound

Pouyan Navard, Yasemin Ozkut, Srikar Adhikari, Elaine Situ-LaCasse, Josie Acuña, Adrienne Yarnish, Alper Yilmaz

arXiv preprint · Aug 5, 2025
Retinal detachment (RD) is a vision-threatening condition that requires timely intervention to preserve vision. Macular involvement -- whether the macula is still intact (macula-intact) or detached (macula-detached) -- is the key determinant of visual outcomes and treatment urgency. Point-of-care ultrasound (POCUS) offers a fast, non-invasive, cost-effective, and accessible imaging modality widely used in diverse clinical settings to detect RD. However, ultrasound interpretation is limited by a lack of expertise among healthcare providers, especially in resource-limited settings. Deep learning offers the potential to automate ultrasound-based assessment of RD. However, no ML ultrasound algorithms are currently available for clinical use to detect RD, and no prior research has assessed macular status using ultrasound in RD cases -- an essential distinction for surgical prioritization. Moreover, no public dataset currently supports macular-based RD classification using ultrasound video clips. We introduce ERDES (Eye Retinal DEtachment ultraSound), the first open-access dataset of ocular ultrasound clips labeled for (i) presence of retinal detachment and (ii) macula-intact versus macula-detached status. The dataset is intended to facilitate the development and evaluation of machine learning models for detecting retinal detachment. We also provide baseline benchmarks using multiple spatiotemporal convolutional neural network (CNN) architectures. All clips, labels, and training code are publicly available at https://osupcvlab.github.io/ERDES/.
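Spatiotemporal CNN baselines like those benchmarked here typically consume fixed-length clips, so variable-length ultrasound videos must first be subsampled to a fixed number of frames. A minimal sketch of uniform temporal sampling (a common preprocessing convention and an assumption on my part, not a step the abstract specifies):

```python
import numpy as np

def sample_clip_indices(n_frames, clip_len=16):
    """Uniformly sample `clip_len` frame indices from a video of
    `n_frames` frames; short videos are looped to fill the clip."""
    if n_frames >= clip_len:
        return np.linspace(0, n_frames - 1, clip_len).round().astype(int)
    return np.arange(clip_len) % n_frames
```

The selected frames are then stacked into a (T, H, W, C) tensor for the 3D/spatiotemporal network.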

Multi-Center 3D CNN for Parkinson's disease diagnosis and prognosis using clinical and T1-weighted MRI data.

Basaia S, Sarasso E, Sciancalepore F, Balestrino R, Musicco S, Pisano S, Stankovic I, Tomic A, Micco R, Tessitore A, Salvi M, Meiburger KM, Kostic VS, Molinari F, Agosta F, Filippi M

PubMed · Aug 5, 2025
Parkinson's disease (PD) presents challenges in early diagnosis and progression prediction. Recent advances in machine learning, particularly convolutional neural networks (CNNs), show promise in enhancing diagnostic accuracy and prognostic capability using neuroimaging data. The aims of this study were (i) to develop an MRI-based 3D-CNN that distinguishes controls from PD patients and (ii) to employ the CNN to predict PD progression. Three cohorts were selected: 86 mild and 62 moderate-to-severe PD patients plus 60 controls; 14 mild-PD patients and 14 controls from the Parkinson's Progression Markers Initiative database; and 38 de novo mild-PD patients with 38 controls. All participants underwent MRI scans and clinical evaluation at baseline and over 2 years. PD subjects were classified into two clusters of different progression using k-means clustering on baseline and follow-up UPDRS-III scores. A 3D-CNN was built and tested on PD patients and controls with three binary classifications: controls vs moderate-to-severe PD, controls vs mild PD, and the two PD-progression clusters. The effect of transfer learning was also tested. The CNN effectively differentiated moderate-to-severe PD from controls (74% accuracy) using MRI data alone. Transfer learning significantly improved performance in distinguishing mild PD from controls (64% accuracy). For predicting disease progression, the model achieved over 70% accuracy by combining MRI and clinical data. The brain regions most influential in the CNN's decisions were visualized. The CNN, integrating multimodal data and transfer learning, provides encouraging results toward early-stage classification and progression monitoring in PD. Its explainability through activation maps offers potential for clinical application in early diagnosis and personalized monitoring.
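The progression labels come from k-means over baseline and follow-up UPDRS-III scores. A minimal self-contained 2-means sketch over (baseline, change) pairs (variable names and feature choice are illustrative assumptions, not the authors' code):

```python
import numpy as np

def two_means_progression(baseline, followup, n_iter=20, seed=0):
    """Cluster patients into two progression groups with 2-means on
    (baseline score, change over follow-up)."""
    x = np.column_stack([baseline, np.subtract(followup, baseline)]).astype(float)
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=2, replace=False)]   # init from two samples
    for _ in range(n_iter):
        # assign each patient to the nearest center, then re-estimate centers
        labels = np.linalg.norm(x[:, None] - centers[None], axis=2).argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean(axis=0)
    return labels, centers
```

The two resulting clusters (e.g. slower vs faster UPDRS-III worsening) then serve as the binary target for the progression classifier.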

A Novel Multimodal Framework for Early Detection of Alzheimer's Disease Using Deep Learning

Tatwadarshi P Nagarhalli, Sanket Patil, Vishal Pande, Uday Aswalekar, Prafulla Patil

arXiv preprint · Aug 5, 2025
Alzheimer's Disease (AD) is a progressive neurodegenerative disorder that poses significant challenges for early diagnosis, often leading to delayed treatment and poorer outcomes for patients. Traditional diagnostic methods, typically reliant on a single data modality, fall short of capturing the multifaceted nature of the disease. In this paper, we propose a novel multimodal framework for the early detection of AD that integrates data from three primary sources: MRI imaging, cognitive assessments, and biomarkers. The framework employs Convolutional Neural Networks (CNNs) for analyzing MRI images and Long Short-Term Memory (LSTM) networks for processing cognitive and biomarker data. The system enhances diagnostic accuracy and reliability by aggregating results from these distinct modalities using techniques such as weighted averaging, even when data are incomplete. The multimodal approach not only improves the robustness of the detection process but also enables the identification of AD at its earliest stages, offering a significant advantage over conventional methods. The integration of biomarkers and cognitive tests is particularly crucial, as these can detect Alzheimer's long before the onset of clinical symptoms, thereby facilitating earlier intervention and potentially altering the course of the disease. This research demonstrates that the proposed framework has the potential to improve the early detection of AD, paving the way for more timely and effective treatments.
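The fusion step described, weighted averaging that still works when a modality is missing, can be sketched in a few lines (an illustrative reading of the abstract; the weights and missing-value convention are my assumptions, not the authors' specification):

```python
def fuse_predictions(probs, weights):
    """Weighted average of per-modality AD probabilities; modalities with
    a missing prediction (None) are dropped and weights renormalised."""
    pairs = [(p, w) for p, w in zip(probs, weights) if p is not None]
    total_w = sum(w for _, w in pairs)
    return sum(p * w for p, w in pairs) / total_w
```

Renormalising the weights over the available modalities is what lets the fused score remain a valid probability when, say, biomarker data are absent for a patient.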

Towards a zero-shot low-latency navigation for open surgery augmented reality applications.

Schwimmbeck M, Khajarian S, Auer C, Wittenberg T, Remmele S

PubMed · Aug 5, 2025
Augmented reality (AR) enhances surgical navigation by superimposing visible anatomical structures with three-dimensional virtual models using head-mounted displays (HMDs). In particular, interventions such as open liver surgery can benefit from AR navigation, as it aids in identifying and distinguishing tumors and risk structures. However, there is a lack of automatic and markerless methods that are robust against real-world challenges such as partial occlusion and organ motion. We introduce a novel multi-device approach for automatic live navigation in open liver surgery that enhances the visualization and interaction capabilities of a HoloLens 2 HMD through precise and reliable registration using an Intel RealSense RGB-D camera. The intraoperative RGB-D segmentation and the preoperative CT data are used to register a virtual liver model to the target anatomy. An AR-prompted Segment Anything Model (SAM) enables robust segmentation of the liver in situ without additional training data. To mitigate algorithmic latency, Double Exponential Smoothing (DES) is applied to forecast registration results. We conducted a phantom study for open liver surgery, investigating various scenarios of liver motion, viewpoint, and occlusion. The mean registration errors (8.31 mm-18.78 mm TRE) are comparable to those reported in prior work, while our approach demonstrates high success rates even for high occlusion factors and strong motion. Using forecasting, we bypassed the algorithmic latency of 79.8 ms per frame, with median forecasting errors below 2 mm and below 1.5 degrees between the quaternions. To our knowledge, this is the first work to approach markerless in situ visualization by combining a multi-device method with forecasting and a foundation model for segmentation and tracking. This enables a more reliable and precise AR registration of surgical targets with low latency. Our approach can be applied to other surgical applications and AR hardware with minimal effort.
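The latency-hiding idea, forecasting the next registration result with Double Exponential Smoothing, is simple to sketch for a scalar component (Holt's linear method; the smoothing constants are illustrative, and the paper additionally forecasts rotations as quaternions, which this scalar sketch does not cover):

```python
def des_forecast(series, alpha=0.5, beta=0.5):
    """One-step-ahead forecast with double exponential smoothing
    (Holt's linear method): track a level and a trend, then extrapolate."""
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend
```

Because the forecast for frame t+1 is available before the registration of frame t+1 finishes computing, the displayed overlay does not lag by the per-frame algorithmic latency.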

GRASPing Anatomy to Improve Pathology Segmentation

Keyi Li, Alexander Jaus, Jens Kleesiek, Rainer Stiefelhagen

arXiv preprint · Aug 5, 2025
Radiologists rely on anatomical understanding to accurately delineate pathologies, yet most current deep learning approaches use pure pattern recognition and ignore the anatomical context in which pathologies develop. To narrow this gap, we introduce GRASP (Guided Representation Alignment for the Segmentation of Pathologies), a modular plug-and-play framework that enhances pathology segmentation models by leveraging existing anatomy segmentation models through pseudo-label integration and feature alignment. Unlike previous approaches that obtain anatomical knowledge via auxiliary training, GRASP integrates into standard pathology optimization regimes without retraining anatomical components. We evaluate GRASP on two PET/CT datasets, conduct systematic ablation studies, and investigate the framework's inner workings. We find that GRASP consistently achieves top rankings across multiple evaluation metrics and diverse architectures. The framework's dual anatomy injection strategy, combining anatomical pseudo-labels as input channels with transformer-guided anatomical feature fusion, effectively incorporates anatomical context.
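One half of the dual injection strategy, feeding anatomical pseudo-labels as extra input channels, is a one-line tensor operation. A minimal sketch (illustrative channel-first NumPy code, not the GRASP implementation; the transformer-guided feature-fusion half is not shown):

```python
import numpy as np

def inject_anatomy_channel(image, anatomy_pseudolabel):
    """Stack an anatomy pseudo-label map onto a (C, H, W) image as an
    extra input channel, so the segmentation model sees anatomy context."""
    return np.concatenate([image, anatomy_pseudolabel[None]], axis=0)
```

Because the anatomy model only produces pseudo-labels at inference time, the pathology model can be trained with this richer input without retraining any anatomical component.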

MedCAL-Bench: A Comprehensive Benchmark on Cold-Start Active Learning with Foundation Models for Medical Image Analysis

Ning Zhu, Xiaochuan Ma, Shaoting Zhang, Guotai Wang

arXiv preprint · Aug 5, 2025
Cold-Start Active Learning (CSAL) aims to select informative samples for annotation without prior knowledge, which is important for improving annotation efficiency and model performance under a limited annotation budget in medical image analysis. Most existing CSAL methods rely on Self-Supervised Learning (SSL) on the target dataset for feature extraction, which is inefficient and limited by insufficient feature representation. Recently, pre-trained Foundation Models (FMs) have shown powerful feature extraction ability with the potential for better CSAL. However, this paradigm has rarely been investigated, and benchmarks for comparing FMs on CSAL tasks are lacking. To this end, we propose MedCAL-Bench, the first systematic FM-based CSAL benchmark for medical image analysis. We evaluate 14 FMs and 7 CSAL strategies across 7 datasets under different annotation budgets, covering classification and segmentation tasks from diverse medical modalities. It is also the first CSAL benchmark that evaluates both the feature extraction and the sample selection stages. Our experimental results reveal that: 1) most FMs are effective feature extractors for CSAL, with the DINO family performing best in segmentation; 2) the performance differences between FMs are large for segmentation tasks but small for classification; 3) different sample selection strategies should be considered for different datasets, with Active Learning by Processing Surprisal (ALPS) performing best for segmentation and RepDiv leading for classification. The code is available at https://github.com/HiLab-git/MedCAL-Bench.
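The sample-selection stage operates on FM feature vectors. As a flavor of what such strategies do, here is a simple diversity heuristic, farthest-point selection, which is *not* ALPS or RepDiv but illustrates budget-constrained selection in feature space (the data are made up):

```python
import numpy as np

def farthest_point_selection(features, budget, start=0):
    """Pick `budget` mutually distant samples in FM feature space:
    greedily add the sample farthest from everything chosen so far."""
    feats = np.asarray(features, dtype=float)
    chosen = [start]
    dists = np.linalg.norm(feats - feats[start], axis=1)
    for _ in range(budget - 1):
        nxt = int(dists.argmax())                 # farthest from the chosen set
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(feats - feats[nxt], axis=1))
    return chosen
```

The benchmark's finding that the best strategy differs between segmentation and classification suggests that which notion of "informative" wins (diversity, uncertainty, representativeness) is task-dependent.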

Unsupervised learning-based perfusion maps for temporally truncated CT perfusion imaging.

Tung CH, Li ZY, Huang HM

PubMed · Aug 5, 2025
Computed tomography perfusion (CTP) imaging is a rapid diagnostic tool for acute stroke but is less robust when tissue time-attenuation curves are truncated. This study proposes an unsupervised learning method for generating perfusion maps from truncated CTP images. Real brain CTP images were artificially truncated to 15% and 30% of the original scan time. Perfusion maps of complete and truncated CTP images were calculated with the proposed method and compared with standard singular value decomposition (SVD), tensor total variation (TTV), nonlinear regression (NLR), and a spatio-temporal perfusion physics-informed neural network (SPPINN). The NLR method yielded many perfusion values outside physiological ranges, indicating a lack of robustness. The proposed method did not improve the estimation of cerebral blood flow over the SVD and TTV methods, but it reduced the effect of truncation on the estimation of cerebral blood volume, with a relative difference of 15.4% in the infarcted region at 30% truncation (versus 20.7% for SVD and 19.4% for TTV). It also resisted 30% truncation better for mean transit time, with a relative difference of 16.6% in the infarcted region (versus 25.9% for SVD and 26.2% for TTV). Compared with the SPPINN method, the proposed method responded similarly to truncation in gray and white matter but was less sensitive to truncation in the infarcted region. These results demonstrate the feasibility of using unsupervised learning to generate perfusion maps from CTP images and to improve robustness under truncation.
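The standard SVD baseline the study compares against deconvolves each tissue curve with the arterial input function (AIF), truncating small singular values to stabilize the inversion; CBF is read off as the peak of the recovered residue function. A minimal sketch of that classical method (generic truncated-SVD deconvolution with an illustrative threshold, not this paper's proposed approach):

```python
import numpy as np

def svd_cbf(c_tissue, aif, dt, lam=0.2):
    """Truncated-SVD deconvolution for CT perfusion: build the AIF
    convolution matrix, zero singular values below lam * s_max, and take
    CBF as the peak of the recovered residue function k(t)."""
    n = len(aif)
    A = np.zeros((n, n))
    for i in range(n):                 # lower-triangular Toeplitz convolution matrix
        for j in range(i + 1):
            A[i, j] = aif[i - j] * dt
    U, s, Vt = np.linalg.svd(A)
    s_inv = np.where(s > lam * s.max(), 1.0 / s, 0.0)   # regularised pseudo-inverse
    k = Vt.T @ (s_inv * (U.T @ c_tissue))
    return k.max()
```

Truncation of the scan shortens `c_tissue` and `aif`, which is precisely what destabilizes this inversion and motivates learning-based alternatives.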