DM-FNet: Unified multimodal medical image fusion via diffusion process-trained encoder-decoder

Dan He, Weisheng Li, Guofen Wang, Yuping Huang, Shiqiang Liu

arXiv preprint · Jun 18 2025
Multimodal medical image fusion (MMIF) extracts the most meaningful information from multiple source images, enabling a more comprehensive and accurate diagnosis. Achieving high-quality fusion results requires a careful balance of brightness, color, contrast, and detail; this ensures that the fused images effectively display relevant anatomical structures and reflect the functional status of the tissues. However, existing MMIF methods have limited capacity to capture detailed features during conventional training and suffer from insufficient cross-modal feature interaction, leading to suboptimal fused image quality. To address these issues, this study proposes a two-stage diffusion model-based fusion network (DM-FNet) to achieve unified MMIF. In Stage I, a diffusion process trains a UNet for image reconstruction. The UNet captures detailed information through progressive denoising and represents data at multiple levels, providing a rich set of feature representations for the subsequent fusion network. In Stage II, noisy images at various steps are input into the fusion network to enhance the model's feature recognition capability. Three key fusion modules are also integrated to process medical images from different modalities adaptively. Ultimately, the robust network structure and a hybrid loss function are integrated to harmonize the fused image's brightness, color, contrast, and detail, enhancing its quality and information density. The experimental results across various medical image types demonstrate that the proposed method performs exceptionally well on objective evaluation metrics. The fused image preserves appropriate brightness, a comprehensive distribution of radioactive tracers, rich textures, and clear edges. The code is available at https://github.com/HeDan-11/DM-FNet.
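
The Stage I objective described here is the standard denoising-diffusion reconstruction loss. A minimal PyTorch sketch of that training step is shown below, assuming a DDPM-style linear noise schedule; `unet` is a hypothetical stand-in for the paper's encoder-decoder, not the released DM-FNet code.

```python
import torch

def make_noise_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    # Cumulative product of (1 - beta_t) for a linear schedule.
    betas = torch.linspace(beta_start, beta_end, T)
    return torch.cumprod(1.0 - betas, dim=0)

def stage1_loss(unet, x0, alphas_cumprod):
    """One denoising-objective step on a batch of clean source images x0."""
    b = x0.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,), device=x0.device)
    a_bar = alphas_cumprod.to(x0.device)[t].view(b, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise  # forward diffusion q(x_t | x_0)
    pred = unet(x_t, t)                                      # UNet predicts the added noise
    return torch.nn.functional.mse_loss(pred, noise)
```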

EchoFM: Foundation Model for Generalizable Echocardiogram Analysis.

Kim S, Jin P, Song S, Chen C, Li Y, Ren H, Li X, Liu T, Li Q

PubMed · Jun 18 2025
Echocardiography is the first-line noninvasive cardiac imaging modality, providing rich spatio-temporal information on cardiac anatomy and physiology. Recently, foundation models trained on extensive and diverse datasets have shown strong performance in various downstream tasks. However, translating foundation models into the medical imaging domain remains challenging due to domain differences between medical and natural images and the lack of diverse patient and disease datasets. In this paper, we introduce EchoFM, a general-purpose vision foundation model for echocardiography trained on a large-scale dataset of over 20 million echocardiographic images from 6,500 patients. To enable effective learning of rich spatio-temporal representations from periodic videos, we propose a novel self-supervised learning framework based on a masked autoencoder with a spatio-temporal consistent masking strategy and periodic-driven contrastive learning. The learned cardiac representations can be readily adapted and fine-tuned for a wide range of downstream tasks, serving as a strong and flexible backbone model. We validate EchoFM through experiments across key downstream tasks in the clinical echocardiography workflow, leveraging public and multi-center internal datasets. EchoFM consistently outperforms SOTA methods, demonstrating superior generalization capabilities and flexibility. The code and checkpoints are available at: https://github.com/SekeunKim/EchoFM.git.
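
A spatio-temporally consistent mask hides the same spatial patches in every frame of a clip, so the autoencoder must use temporal context rather than copying visible pixels from neighboring frames. The small sketch below illustrates the idea; the patch grid and mask ratio are assumptions, not EchoFM's exact configuration.

```python
import torch

def tube_mask(batch, frames, grid_h, grid_w, mask_ratio=0.75):
    """Boolean mask of shape (batch, frames, grid_h * grid_w); True = masked patch,
    with the same spatial pattern repeated across every frame of the clip."""
    num_patches = grid_h * grid_w
    num_masked = int(num_patches * mask_ratio)
    idx = torch.rand(batch, num_patches).argsort(dim=1)[:, :num_masked]
    spatial = torch.zeros(batch, num_patches, dtype=torch.bool)
    rows = torch.arange(batch).unsqueeze(1)
    spatial[rows, idx] = True                      # mark randomly chosen patches as masked
    return spatial.unsqueeze(1).expand(batch, frames, num_patches)
```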

Diffusion-based Counterfactual Augmentation: Towards Robust and Interpretable Knee Osteoarthritis Grading

Zhe Wang, Yuhua Ru, Aladine Chetouani, Tina Shiang, Fang Chen, Fabian Bauer, Liping Zhang, Didier Hans, Rachid Jennane, William Ewing Palmer, Mohamed Jarraya, Yung Hsin Chen

arXiv preprint · Jun 18 2025
Automated grading of Knee Osteoarthritis (KOA) from radiographs is challenged by significant inter-observer variability and the limited robustness of deep learning models, particularly near critical decision boundaries. To address these limitations, this paper proposes a novel framework, Diffusion-based Counterfactual Augmentation (DCA), which enhances model robustness and interpretability by generating targeted counterfactual examples. The method navigates the latent space of a diffusion model using a Stochastic Differential Equation (SDE), governed by balancing a classifier-informed boundary drive with a manifold constraint. The resulting counterfactuals are then used within a self-corrective learning strategy to improve the classifier by focusing on its specific areas of uncertainty. Extensive experiments on the public Osteoarthritis Initiative (OAI) and Multicenter Osteoarthritis Study (MOST) datasets demonstrate that this approach significantly improves classification accuracy across multiple model architectures. Furthermore, the method provides interpretability by visualizing minimal pathological changes and revealing that the learned latent space topology aligns with clinical knowledge of KOA progression. The DCA framework effectively converts model uncertainty into a robust training signal, offering a promising pathway to developing more accurate and trustworthy automated diagnostic systems. Our code is available at https://github.com/ZWang78/DCA.
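
The core mechanism is a guided walk through the diffusion model's latent space. One way to picture an SDE update that balances a classifier-informed boundary drive against a manifold (score) constraint is the Euler-Maruyama step sketched below; the helper functions and weights are illustrative assumptions, not the DCA implementation.

```python
import torch

def guided_latent_step(z, t, score_fn, classifier_logp_fn, target,
                       lam_boundary=1.0, lam_manifold=1.0, dt=1e-2):
    """One Euler-Maruyama update of the latent z toward a target KOA grade."""
    z = z.detach().requires_grad_(True)
    log_p = classifier_logp_fn(z, t)[:, target].sum()   # log p(target grade | z_t)
    boundary_drive = torch.autograd.grad(log_p, z)[0]    # pushes z across the decision boundary
    manifold_term = score_fn(z.detach(), t)              # approximates grad_z log p(z_t)
    drift = lam_boundary * boundary_drive + lam_manifold * manifold_term
    noise = torch.randn_like(z)
    return z.detach() + drift * dt + noise * (2.0 * dt) ** 0.5
```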

Pediatric Pancreas Segmentation from MRI Scans with Deep Learning

Elif Keles, Merve Yazol, Gorkem Durak, Ziliang Hong, Halil Ertugrul Aktas, Zheyuan Zhang, Linkai Peng, Onkar Susladkar, Necati Guzelyel, Oznur Leman Boyunaga, Cemal Yazici, Mark Lowe, Aliye Uc, Ulas Bagci

arXiv preprint · Jun 18 2025
Objective: Our study aimed to evaluate and validate PanSegNet, a deep learning (DL) algorithm for pediatric pancreas segmentation on MRI in children with acute pancreatitis (AP), chronic pancreatitis (CP), and healthy controls. Methods: With IRB approval, we retrospectively collected 84 MRI scans (1.5T/3T Siemens Aera/Verio) from children aged 2-19 years at Gazi University (2015-2024). The dataset includes healthy children as well as patients diagnosed with AP or CP based on clinical criteria. Pediatric and general radiologists manually segmented the pancreas, with segmentations then confirmed by a senior pediatric radiologist. PanSegNet-generated segmentations were assessed using the Dice Similarity Coefficient (DSC) and the 95th percentile Hausdorff distance (HD95). Cohen's kappa measured observer agreement. Results: Pancreas MRI T2W scans were obtained from 42 children with AP/CP (mean age: 11.73 ± 3.9 years) and 42 healthy children (mean age: 11.19 ± 4.88 years). PanSegNet achieved DSC scores of 88% (controls), 81% (AP), and 80% (CP), with HD95 values of 3.98 mm (controls), 9.85 mm (AP), and 15.67 mm (CP). Inter-observer kappa was 0.86 (controls) and 0.82 (pancreatitis), and intra-observer agreement reached 0.88 and 0.81. Strong agreement was observed between automated and manual volumes (R² = 0.85 in controls, 0.77 in diseased), demonstrating clinical reliability. Conclusion: PanSegNet represents the first validated deep learning solution for pancreatic MRI segmentation, achieving expert-level performance across healthy and diseased states. This tool and algorithm, along with our annotated dataset, are freely available on GitHub and OSF, advancing accessible, radiation-free pediatric pancreatic imaging and fostering collaborative research in this underserved domain.
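
The two reported metrics are standard. Below is a minimal NumPy/SciPy sketch of how DSC and HD95 are typically computed on binary masks with unit voxel spacing; production evaluations (e.g., with MONAI or MedPy) additionally account for physical voxel spacing.

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(pred, gt):
    """Dice Similarity Coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())

def surface(mask):
    """Surface voxels of a binary mask (mask minus its erosion)."""
    mask = mask.astype(bool)
    return mask & ~binary_erosion(mask)

def hd95(pred, gt):
    """95th-percentile symmetric surface distance, assuming unit spacing."""
    sp, sg = surface(pred), surface(gt)
    d_pred_to_gt = distance_transform_edt(~sg)[sp]   # each pred surface voxel -> nearest gt surface
    d_gt_to_pred = distance_transform_edt(~sp)[sg]   # each gt surface voxel -> nearest pred surface
    return np.percentile(np.concatenate([d_pred_to_gt, d_gt_to_pred]), 95)
```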

DiffM⁴RI: A Latent Diffusion Model with Modality Inpainting for Synthesizing Missing Modalities in MRI Analysis.

Ye W, Guo Z, Ren Y, Tian Y, Shen Y, Chen Z, He J, Ke J, Shen Y

PubMed · Jun 17 2025
Foundation Models (FMs) have shown great promise for multimodal medical image analysis such as Magnetic Resonance Imaging (MRI). However, certain MRI sequences may be unavailable due to various constraints, such as limited scanning time, patient discomfort, or scanner limitations. The absence of certain modalities can hinder the performance of FMs in clinical applications, making effective missing modality imputation crucial for ensuring their applicability. Previous approaches, including generative adversarial networks (GANs), have been employed to synthesize missing modalities in either a one-to-one or many-to-one manner. However, these methods have limitations, as they require training a new model for each missing-modality scenario and are prone to mode collapse, generating limited diversity in the synthesized images. To address these challenges, we propose DiffM⁴RI, a diffusion model for many-to-many missing modality imputation in MRI. DiffM⁴RI innovatively formulates missing modality imputation as a modality-level inpainting task, enabling it to handle arbitrary missing modality situations without training multiple networks. Experiments on the BraTS datasets demonstrate that DiffM⁴RI achieves an average SSIM improvement of 0.15 over MustGAN, 0.1 over SynDiff, and 0.02 over VQ-VAE-2. These results highlight the potential of DiffM⁴RI in enhancing the reliability of FMs in clinical applications. The code is available at https://github.com/27yw/DiffM4RI.
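
As a rough conceptual sketch of modality-level inpainting with a trained diffusion model (a RePaint-style scheme, not the paper's exact latent-diffusion pipeline): at each reverse step, channels for the available sequences are overwritten with a forward-noised copy of the observation, so only the missing channels are synthesized. `reverse_step` and `q_sample` are hypothetical helpers.

```python
import torch

def impute_missing_modalities(x_obs, known_mask, model, T, q_sample, reverse_step):
    """x_obs: (B, M, H, W) stacked MRI sequences; known_mask: (M,) bool, True = available."""
    x = torch.randn_like(x_obs)                        # start from pure noise
    keep = known_mask.view(1, -1, 1, 1).to(x_obs.device)
    for t in reversed(range(T)):
        x = reverse_step(model, x, t)                  # one denoising step for all channels
        x_known = q_sample(x_obs, t)                   # re-noise the observation to level t
        x = torch.where(keep, x_known, x)              # keep observed channels, synthesize the rest
    return x
```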

TCFNet: Bidirectional face-bone transformation via a Transformer-based coarse-to-fine point movement network.

Zhang R, Jie B, He Y, Wang J

PubMed · Jun 16 2025
Computer-aided surgical simulation is a critical component of orthognathic surgical planning, where accurately simulating face-bone shape transformations is essential. Traditional biomechanical simulation methods are limited by long computation times, labor-intensive data processing, and low accuracy. Recently, deep learning-based simulation methods have been proposed that view this problem as a point-to-point transformation between skeletal and facial point clouds. However, these approaches cannot process large-scale point sets, have limited receptive fields that lead to noisy points, and employ complex registration-based preprocessing and postprocessing operations. These shortcomings limit the performance and widespread applicability of such methods. Therefore, we propose a Transformer-based coarse-to-fine point movement network (TCFNet) to learn unique, complicated correspondences at the patch and point levels for dense face-bone point cloud transformations. This end-to-end framework adopts a Transformer-based network and a local information aggregation network (LIA-Net) in the first and second stages, respectively, which reinforce each other to generate precise point movement paths. LIA-Net can effectively compensate for the neighborhood precision loss of the Transformer-based network by modeling local geometric structures (edges, orientations and relative position features). The global features from the first stage are employed to guide the local displacement using a gated recurrent unit. Inspired by deformable medical image registration, we propose an auxiliary loss that can utilize expert knowledge for reconstructing critical organs. Our framework is an unsupervised algorithm, and this loss is optional. Compared with existing state-of-the-art (SOTA) methods on the gathered datasets, TCFNet achieves outstanding evaluation metrics and visualization results. The code is available at https://github.com/Runshi-Zhang/TCFNet.
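
One simple way to realize "global features guide the local displacement through a gated recurrent unit" is a GRU cell whose hidden state carries the first-stage global feature and whose input is the per-point local feature; a small displacement head then predicts the 3D movement. The sketch below is illustrative only; the dimensions and wiring are assumptions, not the TCFNet architecture.

```python
import torch
import torch.nn as nn

class GuidedDisplacement(nn.Module):
    """GRU-gated fusion of global and local features into a per-point 3D movement."""
    def __init__(self, local_dim=64, global_dim=64):
        super().__init__()
        self.gru = nn.GRUCell(input_size=local_dim, hidden_size=global_dim)
        self.head = nn.Linear(global_dim, 3)   # per-point displacement vector

    def forward(self, local_feat, global_feat):
        # local_feat, global_feat: (num_points, dim) tensors for one shape
        h = self.gru(local_feat, global_feat)  # global features act as the hidden state
        return self.head(h)
```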

Kernelized weighted local information based picture fuzzy clustering with multivariate coefficient of variation and modified total Bregman divergence measure for brain MRI image segmentation.

Lohit H, Kumar D

PubMed · Jun 16 2025
This paper proposes a novel clustering method for noisy image segmentation using a kernelized weighted local information approach under the Picture Fuzzy Set (PFS) framework. Existing kernel-based fuzzy clustering methods struggle with noisy environments and non-linear structures, while intuitionistic fuzzy clustering methods face limitations in handling uncertainty in real-world medical images. To address these challenges, we introduce a local picture fuzzy information measure, developed for the first time using Multivariate Coefficient of Variation (MCV) theory, enhancing robustness in segmentation. Additionally, we integrate non-Euclidean distance measures, including kernel distance for local information computation and modified total Bregman divergence (MTBD) measure for improving clustering accuracy. This combination enhances both local spatial consistency and global membership estimation, leading to precise segmentation. The proposed method is extensively evaluated on synthetic images with Gaussian, Salt and Pepper, and mixed noise, along with Brainweb, IBSR, and MRBrainS18 MRI datasets under varying Rician noise levels, and a CT image template. Furthermore, we benchmark our proposed method against two deep learning-based segmentation models, ResNet34-LinkNet and patch-based U-Net. Experimental results demonstrate significant improvements in segmentation accuracy, as validated by metrics such as Dice Score, Fuzzy Performance Index, Modified Partition Entropy, Average Volume Difference (AVD), and the XB index. Additionally, Friedman's statistical test confirms the superior performance of our approach compared to state-of-the-art clustering methods for noisy image segmentation. To facilitate reproducibility, the implementation of our proposed method is made publicly available at: Google Drive Repository.
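
The kernel trick behind kernelized clustering of this kind is that, with a Gaussian kernel K(x, v) = exp(-||x - v||² / σ²), the squared distance between φ(x) and φ(v) in feature space reduces to K(x, x) + K(v, v) - 2K(x, v) = 2(1 - K(x, v)), since K(x, x) = 1. A minimal sketch is below; the bandwidth σ is an illustrative choice, not the paper's setting.

```python
import numpy as np

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel between feature vectors x and v."""
    return np.exp(-np.sum((np.asarray(x) - np.asarray(v)) ** 2, axis=-1) / sigma ** 2)

def kernel_distance_sq(x, v, sigma=1.0):
    """Feature-space squared distance ||phi(x) - phi(v)||^2 induced by the Gaussian kernel."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))
```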

PRO: Projection Domain Synthesis for CT Imaging

Kang Chen, Bin Huang, Xuebin Yang, Junyan Zhang, Qiegen Liu

arXiv preprint · Jun 16 2025
Synthesizing high quality CT images remains a significant challenge due to the limited availability of annotated data and the complex nature of CT imaging. In this work, we present PRO, a novel framework that, to the best of our knowledge, is the first to perform CT image synthesis in the projection domain using latent diffusion models. Unlike previous approaches that operate in the image domain, PRO learns rich structural representations from raw projection data and leverages anatomical text prompts for controllable synthesis. This projection domain strategy enables more faithful modeling of underlying imaging physics and anatomical structures. Moreover, PRO functions as a foundation model, capable of generalizing across diverse downstream tasks by adjusting its generative behavior via prompt inputs. Experimental results demonstrated that incorporating our synthesized data significantly improves performance across multiple downstream tasks, including low-dose and sparse-view reconstruction, even with limited training data. These findings underscore the versatility and scalability of PRO in data generation for various CT applications. These results highlight the potential of projection domain synthesis as a powerful tool for data augmentation and robust CT imaging. Our source code is publicly available at: https://github.com/yqx7150/PRO.
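
"Projection domain" here means the model works on sinograms (raw line-integral measurements) rather than reconstructed slices. The scikit-image snippet below is only meant to illustrate the relationship between the two domains; it is not part of PRO, which trains a text-conditioned latent diffusion model directly on projection data.

```python
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon

image = shepp_logan_phantom()                                 # synthetic CT-like slice (image domain)
angles = np.linspace(0.0, 180.0, 180, endpoint=False)
sinogram = radon(image, theta=angles)                         # image domain -> projection domain
recon = iradon(sinogram, theta=angles, filter_name="ramp")    # filtered back-projection to image domain
```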

Beyond the First Read: AI-Assisted Perceptual Error Detection in Chest Radiography Accounting for Interobserver Variability

Adhrith Vutukuri, Akash Awasthi, David Yang, Carol C. Wu, Hien Van Nguyen

arXiv preprint · Jun 16 2025
Chest radiography is widely used in diagnostic imaging. However, perceptual errors, especially overlooked but visible abnormalities, remain common and clinically significant. Current workflows and AI systems provide limited support for detecting such errors after interpretation and often lack meaningful human-AI collaboration. We introduce RADAR (Radiologist-AI Diagnostic Assistance and Review), a post-interpretation companion system. RADAR ingests finalized radiologist annotations and CXR images, then performs regional-level analysis to detect and refer potentially missed abnormal regions. The system supports a "second-look" workflow and offers suggested regions of interest (ROIs) rather than fixed labels to accommodate inter-observer variation. We evaluated RADAR on a simulated perceptual-error dataset derived from de-identified CXR cases, using F1 score and Intersection over Union (IoU) as primary metrics. RADAR achieved a recall of 0.78, precision of 0.44, and an F1 score of 0.56 in detecting missed abnormalities in the simulated perceptual-error dataset. Although precision is moderate, this reduces over-reliance on AI by encouraging radiologist oversight in human-AI collaboration. The median IoU was 0.78, with more than 90% of referrals exceeding 0.5 IoU, indicating accurate regional localization. RADAR effectively complements radiologist judgment, providing valuable post-read support for perceptual-error detection in CXR interpretation. Its flexible ROI suggestions and non-intrusive integration position it as a promising tool in real-world radiology workflows. To facilitate reproducibility and further evaluation, we release a fully open-source web implementation alongside a simulated error dataset. All code, data, demonstration videos, and the application are publicly available at https://github.com/avutukuri01/RADAR.
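
The reported numbers follow the usual detection-evaluation recipe: each referred ROI is matched against ground truth, a referral counts as a hit when its IoU exceeds a threshold (the abstract uses 0.5), and precision, recall, and F1 follow from the hit counts. A minimal sketch of those two quantities is below, assuming axis-aligned boxes in (x1, y1, x2, y2) format.

```python
def box_iou(a, b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative referral counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```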
