Sort by:
Page 3 of 19185 results

Unsupervised learning based perfusion maps for temporally truncated CT perfusion imaging.

Tung CH, Li ZY, Huang HM

pubmed logopapersAug 5 2025

Computed tomography perfusion (CTP) imaging is a rapid diagnostic tool for acute stroke but is less robust when tissue time-attenuation curves are truncated. This study proposes an unsupervised learning method for generating perfusion maps from truncated CTP images. Real brain CTP images were artificially truncated to 15% and 30% of the original scan time. Perfusion maps of complete and truncated CTP images were calculated using the proposed method and compared with standard singular value decomposition (SVD), tensor total variation (TTV), nonlinear regression (NLR), and spatio-temporal perfusion physics-informed neural network (SPPINN).
Main results.
The NLR method yielded many perfusion values outside physiological ranges, indicating a lack of robustness. The proposed method did not improve the estimation of cerebral blood flow compared to both the SVD and TTV methods, but reduced the effect of truncation on the estimation of cerebral blood volume, with a relative difference of 15.4% in the infarcted region for 30% truncation (20.7% for SVD and 19.4% for TTV). The proposed method also showed better resistance to 30% truncation for mean transit time, with a relative difference of 16.6% in the infarcted region (25.9% for SVD and 26.2% for TTV). Compared to the SPPINN method, the proposed method had similar responses to truncation in gray and white matter, but was less sensitive to truncation in the infarcted region. These results demonstrate the feasibility of using unsupervised learning to generate perfusion maps from CTP images and improve robustness under truncation scenarios.&#xD.

ClinicalFMamba: Advancing Clinical Assessment using Mamba-based Multimodal Neuroimaging Fusion

Meng Zhou, Farzad Khalvati

arxiv logopreprintAug 5 2025
Multimodal medical image fusion integrates complementary information from different imaging modalities to enhance diagnostic accuracy and treatment planning. While deep learning methods have advanced performance, existing approaches face critical limitations: Convolutional Neural Networks (CNNs) excel at local feature extraction but struggle to model global context effectively, while Transformers achieve superior long-range modeling at the cost of quadratic computational complexity, limiting clinical deployment. Recent State Space Models (SSMs) offer a promising alternative, enabling efficient long-range dependency modeling in linear time through selective scan mechanisms. Despite these advances, the extension to 3D volumetric data and the clinical validation of fused images remains underexplored. In this work, we propose ClinicalFMamba, a novel end-to-end CNN-Mamba hybrid architecture that synergistically combines local and global feature modeling for 2D and 3D images. We further design a tri-plane scanning strategy for effectively learning volumetric dependencies in 3D images. Comprehensive evaluations on three datasets demonstrate the superior fusion performance across multiple quantitative metrics while achieving real-time fusion. We further validate the clinical utility of our approach on downstream 2D/3D brain tumor classification tasks, achieving superior performance over baseline methods. Our method establishes a new paradigm for efficient multimodal medical image fusion suitable for real-time clinical deployment.

GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images

Yifei Sun, Zhanghao Chen, Hao Zheng, Yuqing Lu, Lixin Duan, Fenglei Fan, Ahmed Elazab, Xiang Wan, Changmiao Wang, Ruiquan Ge

arxiv logopreprintAug 5 2025
Chest X-Ray (CXR) imaging for pulmonary diagnosis raises significant challenges, primarily because bone structures can obscure critical details necessary for accurate diagnosis. Recent advances in deep learning, particularly with diffusion models, offer significant promise for effectively minimizing the visibility of bone structures in CXR images, thereby improving clarity and diagnostic accuracy. Nevertheless, existing diffusion-based methods for bone suppression in CXR imaging struggle to balance the complete suppression of bones with preserving local texture details. Additionally, their high computational demand and extended processing time hinder their practical use in clinical settings. To address these limitations, we introduce a Global-Local Latent Consistency Model (GL-LCM) architecture. This model combines lung segmentation, dual-path sampling, and global-local fusion, enabling fast high-resolution bone suppression in CXR images. To tackle potential boundary artifacts and detail blurring in local-path sampling, we further propose Local-Enhanced Guidance, which addresses these issues without additional training. Comprehensive experiments on a self-collected dataset SZCH-X-Rays, and the public dataset JSRT, reveal that our GL-LCM delivers superior bone suppression and remarkable computational efficiency, significantly outperforming several competitive methods. Our code is available at https://github.com/diaoquesang/GL-LCM.

Controllable Mask Diffusion Model for medical annotation synthesis with semantic information extraction.

Heo C, Jung J

pubmed logopapersAug 5 2025
Medical segmentation, a prominent task in medical image analysis utilizing artificial intelligence, plays a crucial role in computer-aided diagnosis and depends heavily on the quality of the training data. However, the availability of sufficient data is constrained by strict privacy regulations associated with medical data. To mitigate this issue, research on data augmentation has gained significant attention. Medical segmentation tasks require paired datasets consisting of medical images and annotation images, also known as mask images, which represent lesion areas or radiological information within the medical images. Consequently, it is essential to apply data augmentation to both image types. This study proposes a Controllable Mask Diffusion Model, a novel approach capable of controlling and generating new masks. This model leverages the binary structure of the mask to extract semantic information, namely, the mask's size, location, and count, which is then applied as multi-conditional input to a diffusion model via a regressor. Through the regressor, newly generated masks conform to the input semantic information, thereby enabling input-driven controllable generation. Additionally, a technique that analyzes correlation within semantic information was devised for large-scale data synthesis. The generative capacity of the proposed model was evaluated against real datasets, and the model's ability to control and generate new masks based on previously unseen semantic information was confirmed. Furthermore, the practical applicability of the model was demonstrated by augmenting the data with the generated data, applying it to segmentation tasks, and comparing the performance with and without augmentation. Additionally, experiments were conducted on single-label and multi-label masks, yielding superior results for both types. This demonstrates the potential applicability of this study to various areas within the medical field.

Deep Learning-Based Signal Amplification of T1-Weighted Single-Dose Images Improves Metastasis Detection in Brain MRI.

Haase R, Pinetz T, Kobler E, Bendella Z, Zülow S, Schievelkamp AH, Schmeel FC, Panahabadi S, Stylianou AM, Paech D, Foltyn-Dumitru M, Wagner V, Schlamp K, Heussel G, Holtkamp M, Heussel CP, Vahlensieck M, Luetkens JA, Schlemmer HP, Haubold J, Radbruch A, Effland A, Deuschl C, Deike K

pubmed logopapersAug 1 2025
Double-dose contrast-enhanced brain imaging improves tumor delineation and detection of occult metastases but is limited by concerns about gadolinium-based contrast agents' effects on patients and the environment. The purpose of this study was to test the benefit of a deep learning-based contrast signal amplification in true single-dose T1-weighted (T-SD) images creating artificial double-dose (A-DD) images for metastasis detection in brain magnetic resonance imaging. In this prospective, multicenter study, a deep learning-based method originally trained on noncontrast, low-dose, and T-SD brain images was applied to T-SD images of 30 participants (mean age ± SD, 58.5 ± 11.8 years; 23 women) acquired externally between November 2022 and June 2023. Four readers with different levels of experience independently reviewed T-SD and A-DD images for metastases with 4 weeks between readings. A reference reader reviewed additionally acquired true double-dose images to determine any metastases present. Performances were compared using Mid-p McNemar tests for sensitivity and Wilcoxon signed rank tests for false-positive findings. All readers found more metastases using A-DD images. The 2 experienced neuroradiologists achieved the same level of sensitivity using T-SD images (62 of 91 metastases, 68.1%). While the increase in sensitivity using A-DD images was only descriptive for 1 of them (A-DD: 65 of 91 metastases, +3.3%, P = 0.424), the second neuroradiologist benefited significantly with a sensitivity increase of 12.1% (73 of 91 metastases, P = 0.008). The 2 less experienced readers (1 resident and 1 fellow) both found significantly more metastases on A-DD images (resident, T-SD: 61.5%, A-DD: 68.1%, P = 0.039; fellow, T-SD: 58.2%, A-DD: 70.3%, P = 0.008). They were therefore able to use A-DD images to increase their sensitivity to the neuroradiologists' initial level on regular T-SD images. False-positive findings did not differ significantly between sequences. However, readers showed descriptively more false-positive findings on A-DD images. The benefit in sensitivity particularly applied to metastases ≤5 mm (5.7%-17.3% increase in sensitivity). A-DD images can improve the detectability of brain metastases without a significant loss of precision and could therefore represent a potentially valuable addition to regular single-dose brain imaging.

Adaptively Distilled ControlNet: Accelerated Training and Superior Sampling for Medical Image Synthesis

Kunpeng Qiu, Zhiying Zhou, Yongxin Guo

arxiv logopreprintJul 31 2025
Medical image annotation is constrained by privacy concerns and labor-intensive labeling, significantly limiting the performance and generalization of segmentation models. While mask-controllable diffusion models excel in synthesis, they struggle with precise lesion-mask alignment. We propose \textbf{Adaptively Distilled ControlNet}, a task-agnostic framework that accelerates training and optimization through dual-model distillation. Specifically, during training, a teacher model, conditioned on mask-image pairs, regularizes a mask-only student model via predicted noise alignment in parameter space, further enhanced by adaptive regularization based on lesion-background ratios. During sampling, only the student model is used, enabling privacy-preserving medical image generation. Comprehensive evaluations on two distinct medical datasets demonstrate state-of-the-art performance: TransUNet improves mDice/mIoU by 2.4%/4.2% on KiTS19, while SANet achieves 2.6%/3.5% gains on Polyps, highlighting its effectiveness and superiority. Code is available at GitHub.

Reference-Guided Diffusion Inpainting For Multimodal Counterfactual Generation

Alexandru Buburuzan

arxiv logopreprintJul 30 2025
Safety-critical applications, such as autonomous driving and medical image analysis, require extensive multimodal data for rigorous testing. Synthetic data methods are gaining prominence due to the cost and complexity of gathering real-world data, but they demand a high degree of realism and controllability to be useful. This work introduces two novel methods for synthetic data generation in autonomous driving and medical image analysis, namely MObI and AnydoorMed, respectively. MObI is a first-of-its-kind framework for Multimodal Object Inpainting that leverages a diffusion model to produce realistic and controllable object inpaintings across perceptual modalities, demonstrated simultaneously for camera and lidar. Given a single reference RGB image, MObI enables seamless object insertion into existing multimodal scenes at a specified 3D location, guided by a bounding box, while maintaining semantic consistency and multimodal coherence. Unlike traditional inpainting methods that rely solely on edit masks, this approach uses 3D bounding box conditioning to ensure accurate spatial positioning and realistic scaling. AnydoorMed extends this paradigm to the medical imaging domain, focusing on reference-guided inpainting for mammography scans. It leverages a diffusion-based model to inpaint anomalies with impressive detail preservation, maintaining the reference anomaly's structural integrity while semantically blending it with the surrounding tissue. Together, these methods demonstrate that foundation models for reference-guided inpainting in natural images can be readily adapted to diverse perceptual modalities, paving the way for the next generation of systems capable of constructing highly realistic, controllable and multimodal counterfactual scenarios.

Neural Autoregressive Modeling of Brain Aging

Ridvan Yesiloglu, Wei Peng, Md Tauhidul Islam, Ehsan Adeli

arxiv logopreprintJul 29 2025
Brain aging synthesis is a critical task with broad applications in clinical and computational neuroscience. The ability to predict the future structural evolution of a subject's brain from an earlier MRI scan provides valuable insights into aging trajectories. Yet, the high-dimensionality of data, subtle changes of structure across ages, and subject-specific patterns constitute challenges in the synthesis of the aging brain. To overcome these challenges, we propose NeuroAR, a novel brain aging simulation model based on generative autoregressive transformers. NeuroAR synthesizes the aging brain by autoregressively estimating the discrete token maps of a future scan from a convenient space of concatenated token embeddings of a previous and future scan. To guide the generation, it concatenates into each scale the subject's previous scan, and uses its acquisition age and the target age at each block via cross-attention. We evaluate our approach on both the elderly population and adolescent subjects, demonstrating superior performance over state-of-the-art generative models, including latent diffusion models (LDM) and generative adversarial networks, in terms of image fidelity. Furthermore, we employ a pre-trained age predictor to further validate the consistency and realism of the synthesized images with respect to expected aging patterns. NeuroAR significantly outperforms key models, including LDM, demonstrating its ability to model subject-specific brain aging trajectories with high fidelity.

Time-series X-ray image prediction of dental skeleton treatment progress via neural networks.

Kwon SW, Moon JK, Song SC, Cha JY, Kim YW, Choi YJ, Lee JS

pubmed logopapersJul 29 2025
Accurate prediction of skeletal changes during orthodontic treatment in growing patients remains challenging due to significant individual variability in craniofacial growth and treatment responses. Conventional methods, such as support vector regression and multilayer perceptrons, require multiple sequential radiographs to achieve acceptable accuracy. However, they are limited by increased radiation exposure, susceptibility to landmark identification errors, and the lack of visually interpretable predictions. To overcome these limitations, this study explored advanced generative approaches, including denoising diffusion probabilistic models (DDPMs), latent diffusion models (LDMs), and ControlNet, to predict future cephalometric radiographs using minimal input data. We evaluated three diffusion-based models-a DDPM utilizing three sequential cephalometric images (3-input DDPM), a single-image DDPM (1-input DDPM), and a single-image LDM-and a vision-based generative model, ControlNet, conditioned on patient-specific attributes such as age, sex, and orthodontic treatment type. Quantitative evaluations demonstrated that the 3-input DDPM achieved the highest numerical accuracy, whereas the single-image LDM delivered comparable predictive performance with significantly reduced clinical requirements. ControlNet also exhibited competitive accuracy, highlighting its potential effectiveness in clinical scenarios. These findings indicate that the single-image LDM and ControlNet offer practical solutions for personalized orthodontic treatment planning, reducing patient visits and radiation exposure while maintaining robust predictive accuracy.

Enhancing Synthetic Pelvic CT Generation from CBCT using Vision Transformer with Adaptive Fourier Neural Operators.

Bhaskara R, Oderinde OM

pubmed logopapersJul 28 2025
This study introduces a novel approach to improve Cone Beam CT (CBCT) image quality by developing a synthetic CT (sCT) generation method using CycleGAN with a Vision Transformer (ViT) and an Adaptive Fourier Neural Operator (AFNO). 

Approach: A dataset of 20 prostate cancer patients who received stereotactic body radiation therapy (SBRT) was used, consisting of paired CBCT and planning CT (pCT) images. The dataset was preprocessed by registering pCTs to CBCTs using deformation registration techniques, such as B-spline, followed by resampling to uniform voxel sizes and normalization. The model architecture integrates a CycleGAN with bidirectional generators, where the UNet generator is enhanced with a ViT at the bottleneck. AFNO functions as the attention mechanism for the ViT, operating on the input data in the Fourier domain. AFNO's innovations handle varying resolutions, mesh invariance, and efficient long-range dependency capture.

Main Results: Our model improved significantly in preserving anatomical details and capturing complex image dependencies. The AFNO mechanism processed global image information effectively, adapting to interpatient variations for accurate sCT generation. Evaluation metrics like Mean Absolute Error (MAE), Peak Signal to Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Normalized Cross Correlation (NCC), demonstrated the superiority of our method. Specifically, the model achieved an MAE of 9.71, PSNR of 37.08 dB, SSIM of 0.97, and NCC of 0.99, confirming its efficacy. 

Significance: The integration of AFNO within the CycleGAN UNet framework addresses Cone Beam CT image quality limitations. The model generates synthetic CTs that allow adaptive treatment planning during SBRT, enabling adjustments to the dose based on tumor response, thus reducing radiotoxicity from increased doses. This method's ability to preserve both global and local anatomical features shows potential for improving tumor targeting, adaptive radiotherapy planning, and clinical decision-making.
Page 3 of 19185 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.