Sort by:
Page 302 of 6626611 results

Keyi Li, Alexander Jaus, Jens Kleesiek, Rainer Stiefelhagen

arxiv logopreprintAug 5 2025
Radiologists rely on anatomical understanding to accurately delineate pathologies, yet most current deep learning approaches use pure pattern recognition and ignore the anatomical context in which pathologies develop. To narrow this gap, we introduce GRASP (Guided Representation Alignment for the Segmentation of Pathologies), a modular plug-and-play framework that enhances pathology segmentation models by leveraging existing anatomy segmentation models through pseudolabel integration and feature alignment. Unlike previous approaches that obtain anatomical knowledge via auxiliary training, GRASP integrates into standard pathology optimization regimes without retraining anatomical components. We evaluate GRASP on two PET/CT datasets, conduct systematic ablation studies, and investigate the framework's inner workings. We find that GRASP consistently achieves top rankings across multiple evaluation metrics and diverse architectures. The framework's dual anatomy injection strategy, combining anatomical pseudo-labels as input channels with transformer-guided anatomical feature fusion, effectively incorporates anatomical context.

Tongxu Zhang, Zhiming Liang, Bei Wang

arxiv logopreprintAug 5 2025
Point clouds have become an increasingly important representation for 3D medical imaging, offering a compact, surface-preserving alternative to traditional voxel or mesh-based approaches. Recent advances in deep learning have enabled rapid progress in extracting, modeling, and analyzing anatomical shapes directly from point cloud data. This paper provides a comprehensive and systematic survey of learning-based shape analysis for medical point clouds, focusing on three fundamental tasks: registration, reconstruction, and variation modeling. We review recent literature from 2021 to 2025, summarize representative methods, datasets, and evaluation metrics, and highlight clinical applications and unique challenges in the medical domain. Key trends include the integration of hybrid representations, large-scale self-supervised models, and generative techniques. We also discuss current limitations, such as data scarcity, inter-patient variability, and the need for interpretable and robust solutions for clinical deployment. Finally, future directions are outlined for advancing point cloud-based shape learning in medical imaging.

Meng Zhou, Farzad Khalvati

arxiv logopreprintAug 5 2025
Multimodal medical image fusion integrates complementary information from different imaging modalities to enhance diagnostic accuracy and treatment planning. While deep learning methods have advanced performance, existing approaches face critical limitations: Convolutional Neural Networks (CNNs) excel at local feature extraction but struggle to model global context effectively, while Transformers achieve superior long-range modeling at the cost of quadratic computational complexity, limiting clinical deployment. Recent State Space Models (SSMs) offer a promising alternative, enabling efficient long-range dependency modeling in linear time through selective scan mechanisms. Despite these advances, the extension to 3D volumetric data and the clinical validation of fused images remains underexplored. In this work, we propose ClinicalFMamba, a novel end-to-end CNN-Mamba hybrid architecture that synergistically combines local and global feature modeling for 2D and 3D images. We further design a tri-plane scanning strategy for effectively learning volumetric dependencies in 3D images. Comprehensive evaluations on three datasets demonstrate the superior fusion performance across multiple quantitative metrics while achieving real-time fusion. We further validate the clinical utility of our approach on downstream 2D/3D brain tumor classification tasks, achieving superior performance over baseline methods. Our method establishes a new paradigm for efficient multimodal medical image fusion suitable for real-time clinical deployment.

Yifei Sun, Zhanghao Chen, Hao Zheng, Yuqing Lu, Lixin Duan, Fenglei Fan, Ahmed Elazab, Xiang Wan, Changmiao Wang, Ruiquan Ge

arxiv logopreprintAug 5 2025
Chest X-Ray (CXR) imaging for pulmonary diagnosis raises significant challenges, primarily because bone structures can obscure critical details necessary for accurate diagnosis. Recent advances in deep learning, particularly with diffusion models, offer significant promise for effectively minimizing the visibility of bone structures in CXR images, thereby improving clarity and diagnostic accuracy. Nevertheless, existing diffusion-based methods for bone suppression in CXR imaging struggle to balance the complete suppression of bones with preserving local texture details. Additionally, their high computational demand and extended processing time hinder their practical use in clinical settings. To address these limitations, we introduce a Global-Local Latent Consistency Model (GL-LCM) architecture. This model combines lung segmentation, dual-path sampling, and global-local fusion, enabling fast high-resolution bone suppression in CXR images. To tackle potential boundary artifacts and detail blurring in local-path sampling, we further propose Local-Enhanced Guidance, which addresses these issues without additional training. Comprehensive experiments on a self-collected dataset SZCH-X-Rays, and the public dataset JSRT, reveal that our GL-LCM delivers superior bone suppression and remarkable computational efficiency, significantly outperforming several competitive methods. Our code is available at https://github.com/diaoquesang/GL-LCM.

Jiantao Tan, Peixian Ma, Kanghao Chen, Zhiming Dai, Ruixuan Wang

arxiv logopreprintAug 5 2025
Continual learning is essential for medical image classification systems to adapt to dynamically evolving clinical environments. The integration of multimodal information can significantly enhance continual learning of image classes. However, while existing approaches do utilize textual modality information, they solely rely on simplistic templates with a class name, thereby neglecting richer semantic information. To address these limitations, we propose a novel framework that harnesses visual concepts generated by large language models (LLMs) as discriminative semantic guidance. Our method dynamically constructs a visual concept pool with a similarity-based filtering mechanism to prevent redundancy. Then, to integrate the concepts into the continual learning process, we employ a cross-modal image-concept attention module, coupled with an attention loss. Through attention, the module can leverage the semantic knowledge from relevant visual concepts and produce class-representative fused features for classification. Experiments on medical and natural image datasets show our method achieves state-of-the-art performance, demonstrating the effectiveness and superiority of our method. We will release the code publicly.

Tatwadarshi P Nagarhalli, Sanket Patil, Vishal Pande, Uday Aswalekar, Prafulla Patil

arxiv logopreprintAug 5 2025
Alzheimers Disease (AD) is a progressive neurodegenerative disorder that poses significant challenges in its early diagnosis, often leading to delayed treatment and poorer outcomes for patients. Traditional diagnostic methods, typically reliant on single data modalities, fall short of capturing the multifaceted nature of the disease. In this paper, we propose a novel multimodal framework for the early detection of AD that integrates data from three primary sources: MRI imaging, cognitive assessments, and biomarkers. This framework employs Convolutional Neural Networks (CNN) for analyzing MRI images and Long Short-Term Memory (LSTM) networks for processing cognitive and biomarker data. The system enhances diagnostic accuracy and reliability by aggregating results from these distinct modalities using advanced techniques like weighted averaging, even in incomplete data. The multimodal approach not only improves the robustness of the detection process but also enables the identification of AD at its earliest stages, offering a significant advantage over conventional methods. The integration of biomarkers and cognitive tests is particularly crucial, as these can detect Alzheimer's long before the onset of clinical symptoms, thereby facilitating earlier intervention and potentially altering the course of the disease. This research demonstrates that the proposed framework has the potential to revolutionize the early detection of AD, paving the way for more timely and effective treatments

Schwimmbeck M, Khajarian S, Auer C, Wittenberg T, Remmele S

pubmed logopapersAug 5 2025
Augmented reality (AR) enhances surgical navigation by superimposing visible anatomical structures with three-dimensional virtual models using head-mounted displays (HMDs). In particular, interventions such as open liver surgery can benefit from AR navigation, as it aids in identifying and distinguishing tumors and risk structures. However, there is a lack of automatic and markerless methods that are robust against real-world challenges, such as partial occlusion and organ motion. We introduce a novel multi-device approach for automatic live navigation in open liver surgery that enhances the visualization and interaction capabilities of a HoloLens 2 HMD through precise and reliable registration using an Intel RealSense RGB-D camera. The intraoperative RGB-D segmentation and the preoperative CT data are utilized to register a virtual liver model to the target anatomy. An AR-prompted Segment Anything Model (SAM) enables robust segmentation of the liver in situ without the need for additional training data. To mitigate algorithmic latency, Double Exponential Smoothing (DES) is applied to forecast registration results. We conducted a phantom study for open liver surgery, investigating various scenarios of liver motion, viewpoints, and occlusion. The mean registration errors (8.31 mm-18.78 mm TRE) are comparable to those reported in prior work, while our approach demonstrates high success rates even for high occlusion factors and strong motion. Using forecasting, we bypassed the algorithmic latency of 79.8 ms per frame, with median forecasting errors below 2 mms and 1.5 degrees between the quaternions. To our knowledge, this is the first work to approach markerless in situ visualization by combining a multi-device method with forecasting and a foundation model for segmentation and tracking. This enables a more reliable and precise AR registration of surgical targets with low latency. Our approach can be applied to other surgical applications and AR hardware with minimal effort.

Tung CH, Li ZY, Huang HM

pubmed logopapersAug 5 2025

Computed tomography perfusion (CTP) imaging is a rapid diagnostic tool for acute stroke but is less robust when tissue time-attenuation curves are truncated. This study proposes an unsupervised learning method for generating perfusion maps from truncated CTP images. Real brain CTP images were artificially truncated to 15% and 30% of the original scan time. Perfusion maps of complete and truncated CTP images were calculated using the proposed method and compared with standard singular value decomposition (SVD), tensor total variation (TTV), nonlinear regression (NLR), and spatio-temporal perfusion physics-informed neural network (SPPINN).
Main results.
The NLR method yielded many perfusion values outside physiological ranges, indicating a lack of robustness. The proposed method did not improve the estimation of cerebral blood flow compared to both the SVD and TTV methods, but reduced the effect of truncation on the estimation of cerebral blood volume, with a relative difference of 15.4% in the infarcted region for 30% truncation (20.7% for SVD and 19.4% for TTV). The proposed method also showed better resistance to 30% truncation for mean transit time, with a relative difference of 16.6% in the infarcted region (25.9% for SVD and 26.2% for TTV). Compared to the SPPINN method, the proposed method had similar responses to truncation in gray and white matter, but was less sensitive to truncation in the infarcted region. These results demonstrate the feasibility of using unsupervised learning to generate perfusion maps from CTP images and improve robustness under truncation scenarios.&#xD.

Frederiksen BA, Hammer HB, Terslev L, Ammitzbøll-Danielsen M, Savarimuthu TR, Weber ABH, Just SA

pubmed logopapersAug 5 2025
To evaluate the agreement and repeatability of an automated robotic ultrasound system (ARTHUR V.2.0) combined with an AI model (DIANA V.2.0) in assessing synovial hypertrophy (SH) and Doppler activity in rheumatoid arthritis (RA) patients, using an expert rheumatologist's assessment as the reference standard. 30 RA patients underwent two consecutive ARTHUR V.2.0 scans and rheumatologist assessment of 22 hand joints, with the rheumatologist blinded to the automated system's results. Images were scored for SH and Doppler by DIANA V.2.0 using the EULAR-OMERACT scale (0-3). The agreement was evaluated by weighted Cohen's kappa, percent exact agreement (PEA), percent close agreement (PCA) and binary outcomes using Global OMERACT-EULAR Synovitis Scoring (healthy ≤1 vs diseased ≥2). Comparisons included intra-robot repeatability and agreement with the expert rheumatologist and a blinded independent assessor. ARTHUR successfully scanned 564 out of 660 joints, corresponding to an overall success rate of 85.5%. Intra-robot agreement for SH: PEA 63.0%, PCA 93.0%, binary 90.5% and for Doppler, PEA 74.8%, PCA 93.7%, binary 88.1% and kappa values of 0.54 and 0.49. Agreement between ARTHUR+DIANA and the rheumatologist: SH (PEA 57.9%, PCA 92.9%, binary 87.3%, kappa 0.38); Doppler (PEA 77.3%, PCA 94.2%, binary 91.2%, kappa 0.44) and with the independent assessor: SH (PEA 49.0%, PCA 91.2%, binary 80.0%, kappa 0.39); Doppler (PEA 62.6%, PCA 94.4%, binary 88.1%, kappa 0.48). ARTHUR V.2.0 and DIANA V.2.0 demonstrated repeatability on par with intra-expert agreement reported in the literature and showed encouraging agreement with human assessors, though further refinement is needed to optimise performance across specific joints.

Xiangcen Wu, Shaheer U. Saeed, Yipei Wang, Ester Bonmati Coll, Yipeng Hu

arxiv logopreprintAug 5 2025
Radiologists often mix medical image reading strategies, including inspection of individual modalities and local image regions, using information at different locations from different images independently as well as concurrently. In this paper, we propose a recommend system to assist machine learning-based segmentation models, by suggesting appropriate image portions along with the best modality, such that prostate cancer segmentation performance can be maximised. Our approach trains a policy network that assists tumor localisation, by recommending both the optimal imaging modality and the specific sections of interest for review. During training, a pre-trained segmentation network mimics radiologist inspection on individual or variable combinations of these imaging modalities and their sections - selected by the policy network. Taking the locally segmented regions as an input for the next step, this dynamic decision making process iterates until all cancers are best localised. We validate our method using a data set of 1325 labelled multiparametric MRI images from prostate cancer patients, demonstrating its potential to improve annotation efficiency and segmentation accuracy, especially when challenging pathology is present. Experimental results show that our approach can surpass standard segmentation networks. Perhaps more interestingly, our trained agent independently developed its own optimal strategy, which may or may not be consistent with current radiologist guidelines such as PI-RADS. This observation also suggests a promising interactive application, in which the proposed policy networks assist human radiologists.
Page 302 of 6626611 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.