Patient Preferences for Artificial Intelligence in Medical Imaging: A Single-Center Cross-Sectional Survey.

McGhee KN, Barrett DJ, Safarini O, Elkassem AA, Eddins JT, Smith AD, Rothenberg SA

PubMed · Aug 7, 2025
Artificial intelligence (AI) is rapidly being integrated into clinical practice to improve diagnostic accuracy and reduce provider burnout. However, patients' self-perceived knowledge of AI and their perceptions of its role in their care remain unclear. This study explores the preferences of patients undergoing cross-sectional imaging examinations regarding the use and communication of AI in their care. In this single-center cross-sectional study, a structured questionnaire was administered to patients undergoing outpatient CT or MRI examinations between June and July 2024 to assess baseline self-perceived knowledge of AI, perspectives on AI in clinical care, preferences regarding AI-generated results, and economic considerations related to AI, using Likert scales and categorical questions. A total of 226 participants (143 females; mean age 53 years) were surveyed, with 67.4% (151/224) reporting minimal to no knowledge of AI in medicine; lower knowledge levels were associated with lower socioeconomic status (p < .001). Overall, 90.3% (204/226) believed they should be informed about the use of AI in their care, and 91.1% (204/224) supported the right to opt out. Additionally, 91.1% (204/224) expressed a strong preference for being informed when AI was involved in interpreting their medical images, and 65.6% (143/218) indicated they would not accept a screening imaging examination interpreted exclusively by an AI algorithm. Finally, 91.1% (204/224) wanted disclosure when AI was used, and 89.1% (196/220) felt such disclosure and clarification of discrepancies should be considered standard care. To align AI adoption with patient preferences and expectations, radiology practices must prioritize disclosure, patient engagement, and standardized documentation of AI use without overly burdening the diagnostic workflow. Patients prefer transparency about AI utilization in their care, and our study highlights the discrepancy between patient preferences and current clinical practice. Patients are not expected to understand the technical aspects of an imaging examination, such as acquisition parameters or reconstruction kernel, and must trust their providers to act in their best interest. Clear communication of how AI is used in their care should be provided in ways that do not overly burden the radiologist.

Role of AI in Clinical Decision-Making: An Analysis of FDA Medical Device Approvals.

Fernando P, Lyell D, Wang Y, Magrabi F

PubMed · Aug 7, 2025
The U.S. Food and Drug Administration (FDA) plays an important role in ensuring the safety and effectiveness of AI/ML-enabled devices through its regulatory processes. In recent years, the number of these devices cleared by the FDA has increased. This study analyzes 104 FDA-approved ML-enabled medical devices from May 2021 to April 2023, extending previous research to provide a contemporary perspective on this evolving landscape. We examined clinical task, device task, device input and output, ML method, and level of autonomy. Most approvals (n = 103) were via the 510(k) premarket notification pathway, indicating substantial equivalence to existing devices. Devices predominantly supported diagnostic tasks (n = 81), and the majority used imaging data (n = 99), with CT and MRI the most common modalities. Device autonomy levels were distributed as follows: 52% assistive (requiring users to confirm or approve AI-provided information or decisions), 27% autonomous information, and 21% autonomous decision. The prevalence of assistive devices indicates a cautious approach to integrating ML into clinical decision-making, favoring support rather than replacement of human judgment.

Memory-enhanced and multi-domain learning-based deep unrolling network for medical image reconstruction.

Jiang H, Zhang Q, Hu Y, Jin Y, Liu H, Chen Z, Yumo Z, Fan W, Zheng HR, Liang D, Hu Z

PubMed · Aug 7, 2025
Reconstructing high-quality images from corrupted measurements remains a fundamental challenge in medical imaging. Recently, deep unrolling (DUN) methods have emerged as a promising solution, combining the interpretability of traditional iterative algorithms with the powerful representation capabilities of deep learning. However, their performance is often limited by weak information flow between iterative stages and a constrained ability to capture global features across stages, limitations that tend to worsen as the number of iterations increases. Approach: In this work, we propose a memory-enhanced and multi-domain learning-based deep unrolling network for interpretable, high-fidelity medical image reconstruction. First, a memory-enhanced module is designed to adaptively integrate historical outputs across stages, reducing information loss. Second, we introduce a cross-stage spatial-domain learning transformer (CS-SLFormer) to extract both local and non-local features within and across stages, improving reconstruction performance. Finally, a frequency-domain consistency learning (FDCL) module ensures alignment between reconstructed and ground truth images in the frequency domain, recovering fine image details. Main Results: Comprehensive experiments on three representative medical imaging modalities (PET, MRI, and CT) show that the proposed method consistently outperforms state-of-the-art (SOTA) approaches in both quantitative metrics and visual quality. Specifically, our method achieved a PSNR of 37.835 dB and an SSIM of 0.970 in 1% dose PET reconstruction. Significance: This study expands the use of model-driven deep learning in medical imaging, demonstrating the potential of memory-enhanced deep unrolling frameworks for high-quality reconstruction.
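
For readers unfamiliar with deep unrolling, the minimal sketch below shows the general pattern the abstract describes: one unrolled stage that alternates a data-fidelity gradient step with a learned prior conditioned on a memory of earlier outputs, plus a simple frequency-domain consistency loss. The module names, operators, and loss here are generic illustrations, not the paper's CS-SLFormer or FDCL implementations.

```python
# Illustrative sketch only: a generic memory-augmented unrolled stage and a
# frequency-domain consistency loss, not the authors' CS-SLFormer/FDCL code.
import torch
import torch.nn as nn
import torch.fft

class UnrolledStage(nn.Module):
    """One data-consistency + learned-prior step that also sees a memory
    tensor built from the outputs of earlier stages."""
    def __init__(self, channels=1, memory_depth=3, step_size=0.5):
        super().__init__()
        self.step = nn.Parameter(torch.tensor(step_size))
        self.prior = nn.Sequential(                      # small CNN prior (stand-in)
            nn.Conv2d(channels + memory_depth, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x, y, forward_op, adjoint_op, memory):
        # Gradient step on the data-fidelity term ||A x - y||^2
        grad = adjoint_op(forward_op(x) - y)
        x = x - self.step * grad
        # Learned prior refinement conditioned on stacked historical outputs
        return x + self.prior(torch.cat([x, memory], dim=1))

def frequency_consistency_loss(pred, target):
    """L1 distance between FFT magnitudes: a simple frequency-domain term."""
    return (torch.fft.fft2(pred).abs() - torch.fft.fft2(target).abs()).abs().mean()

if __name__ == "__main__":
    identity = lambda z: z                      # toy forward/adjoint operators
    stage = UnrolledStage()
    x = torch.randn(2, 1, 64, 64)               # current estimate
    y = torch.randn(2, 1, 64, 64)               # measurements (toy)
    memory = torch.randn(2, 3, 64, 64)          # stacked outputs of 3 earlier stages
    x_next = stage(x, y, identity, identity, memory)
    loss = frequency_consistency_loss(x_next, torch.randn_like(x_next))
    print(x_next.shape, loss.item())
```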

MM2CT: MR-to-CT translation for multi-modal image fusion with mamba

Chaohui Gong, Zhiying Wu, Zisheng Huang, Gaofeng Meng, Zhen Lei, Hongbin Liu

arXiv preprint · Aug 7, 2025
Magnetic resonance (MR)-to-computed tomography (CT) translation offers significant advantages, including the elimination of the radiation exposure associated with CT scans and the mitigation of imaging artifacts caused by patient motion. Existing approaches are based on single-modality MR-to-CT translation, with limited research exploring multimodal fusion. To address this limitation, we introduce MM2CT, a multi-modal MR-to-CT translation method that leverages T1- and T2-weighted MRI data within an innovative Mamba-based framework for multi-modal medical image synthesis. Mamba effectively overcomes the limited local receptive field of CNNs and the high computational complexity of Transformers. MM2CT leverages this advantage to maintain long-range dependency modeling while integrating multi-modal MR features. Additionally, we incorporate a dynamic local convolution module and a dynamic enhancement module to improve MR-to-CT synthesis. Experiments on a public pelvis dataset demonstrate that MM2CT achieves state-of-the-art performance in terms of Structural Similarity Index Measure (SSIM) and Peak Signal-to-Noise Ratio (PSNR). Our code is publicly available at https://github.com/Gots-ch/MM2CT.
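
As a rough illustration of the multi-modal idea (not the MM2CT architecture itself), the sketch below encodes T1- and T2-weighted inputs separately, fuses their features, and decodes a synthetic CT. The sequence-mixing stage is a plain convolutional placeholder standing in for the paper's Mamba, dynamic local convolution, and dynamic enhancement modules.

```python
# Minimal sketch of the multi-modal fusion idea only; the mixer below is a
# convolutional stand-in, not the paper's Mamba-based blocks.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.InstanceNorm2d(out_ch), nn.ReLU(inplace=True))

class MultiModalMRtoCT(nn.Module):
    def __init__(self, feat=32):
        super().__init__()
        self.enc_t1 = conv_block(1, feat)        # T1-weighted branch
        self.enc_t2 = conv_block(1, feat)        # T2-weighted branch
        self.fuse = conv_block(2 * feat, feat)   # multi-modal feature fusion
        self.mixer = conv_block(feat, feat)      # placeholder for Mamba-style mixing
        self.dec = nn.Conv2d(feat, 1, 3, padding=1)  # synthetic CT head

    def forward(self, t1, t2):
        f = torch.cat([self.enc_t1(t1), self.enc_t2(t2)], dim=1)
        return self.dec(self.mixer(self.fuse(f)))

if __name__ == "__main__":
    model = MultiModalMRtoCT()
    t1, t2 = torch.randn(1, 1, 128, 128), torch.randn(1, 1, 128, 128)
    fake_ct = model(t1, t2)
    print(fake_ct.shape)  # torch.Size([1, 1, 128, 128])
```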

MoMA: A Mixture-of-Multimodal-Agents Architecture for Enhancing Clinical Prediction Modelling

Jifan Gao, Mahmudur Rahman, John Caskey, Madeline Oguss, Ann O'Rourke, Randy Brown, Anne Stey, Anoop Mayampurath, Matthew M. Churpek, Guanhua Chen, Majid Afshar

arXiv preprint · Aug 7, 2025
Multimodal electronic health record (EHR) data provide richer, complementary insights into patient health compared to single-modality data. However, effectively integrating diverse data modalities for clinical prediction modeling remains challenging due to the substantial data requirements. We introduce a novel architecture, Mixture-of-Multimodal-Agents (MoMA), designed to leverage multiple large language model (LLM) agents for clinical prediction tasks using multimodal EHR data. MoMA employs specialized LLM agents ("specialist agents") to convert non-textual modalities, such as medical images and laboratory results, into structured textual summaries. These summaries, together with clinical notes, are combined by another LLM ("aggregator agent") to generate a unified multimodal summary, which is then used by a third LLM ("predictor agent") to produce clinical predictions. Evaluated on three prediction tasks using real-world datasets with different modality combinations and prediction settings, MoMA outperforms current state-of-the-art methods, highlighting its enhanced accuracy and flexibility across various tasks.
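
The specialist/aggregator/predictor pipeline lends itself to a short conceptual sketch. Everything below is hypothetical scaffolding: `call_llm` is a placeholder for whatever LLM API MoMA uses, and the prompts and data are invented for illustration.

```python
# Conceptual sketch of the agent roles described above. `call_llm` is a
# hypothetical stand-in for a real LLM client; it is not part of MoMA.
from typing import Dict

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real client or local model."""
    return f"[LLM response to: {prompt[:40]}...]"

def specialist_agent(modality: str, raw_data: str) -> str:
    # Convert a non-textual modality (imaging findings, lab panel) into a textual summary.
    return call_llm(f"Summarize this {modality} data for a clinician:\n{raw_data}")

def aggregator_agent(summaries: Dict[str, str], notes: str) -> str:
    joined = "\n".join(f"{k}: {v}" for k, v in summaries.items())
    return call_llm(f"Combine into one multimodal summary.\nNotes:\n{notes}\n{joined}")

def predictor_agent(unified_summary: str, task: str) -> str:
    return call_llm(f"Task: {task}\nBased on this summary, output a prediction:\n{unified_summary}")

if __name__ == "__main__":
    summaries = {
        "chest X-ray": specialist_agent("chest X-ray", "<image-derived findings>"),
        "labs": specialist_agent("laboratory results", "WBC 14.2, lactate 3.1"),
    }
    unified = aggregator_agent(summaries, "ED note: fever, hypotension ...")
    print(predictor_agent(unified, "predict in-hospital mortality (yes/no)"))
```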

MedMambaLite: Hardware-Aware Mamba for Medical Image Classification

Romina Aalishah, Mozhgan Navardi, Tinoosh Mohsenin

arXiv preprint · Aug 7, 2025
AI-powered medical devices have driven the need for real-time, on-device inference for tasks such as biomedical image classification. Deep learning models deployed at the edge now support applications such as anomaly detection and classification in medical images. However, achieving this level of performance on edge devices remains challenging due to limits on model size and computational capacity. To address this, we present MedMambaLite, a hardware-aware Mamba-based model optimized through knowledge distillation for medical image classification. We start with a powerful MedMamba model, integrating a Mamba structure for efficient feature extraction in medical imaging. We make the model lighter and faster in training and inference by modifying the architecture and reducing its redundancies. We then distill its knowledge into a smaller student model by reducing the embedding dimensions. The optimized model achieves 94.5% overall accuracy on 10 MedMNIST datasets and uses 22.8x fewer parameters than MedMamba. Deployment on an NVIDIA Jetson Orin Nano achieves 35.6 GOPS/J, a 63% improvement over MedMamba in energy per inference.
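
For context, a standard knowledge-distillation objective looks like the sketch below: the student is trained on a blend of temperature-softened teacher outputs and hard labels. The exact loss weighting, temperature, and architectures used by MedMambaLite may differ; the linear "teacher" and "student" here are placeholders.

```python
# Generic knowledge-distillation objective as a sketch of the idea; the exact
# teacher/student models and loss weighting in MedMambaLite may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend soft-target KL divergence (teacher -> student) with hard-label CE."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                      # standard temperature-scaling factor
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

if __name__ == "__main__":
    teacher = nn.Linear(64, 9)       # stand-ins for the large teacher / smaller student
    student = nn.Linear(64, 9)
    x = torch.randn(8, 64)
    labels = torch.randint(0, 9, (8,))
    with torch.no_grad():
        t_logits = teacher(x)
    loss = distillation_loss(student(x), t_logits, labels)
    loss.backward()
    print(float(loss))
```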

MedCLIP-SAMv2: Towards universal text-driven medical image segmentation.

Koleilat T, Asgariandehkordi H, Rivaz H, Xiao Y

PubMed · Aug 7, 2025
Segmentation of anatomical structures and pathologies in medical images is essential for modern disease diagnosis, clinical research, and treatment planning. While significant advancements have been made in deep learning-based segmentation techniques, many of these methods still suffer from limitations in data efficiency, generalizability, and interactivity. As a result, developing robust segmentation methods that require fewer labeled datasets remains a critical challenge in medical image analysis. Recently, the introduction of foundation models like CLIP and Segment-Anything-Model (SAM), with robust cross-domain representations, has paved the way for interactive and universal image segmentation. However, further exploration of these models for data-efficient segmentation in medical imaging is an active field of research. In this paper, we introduce MedCLIP-SAMv2, a novel framework that integrates the CLIP and SAM models to perform segmentation on clinical scans using text prompts, in both zero-shot and weakly supervised settings. Our approach includes fine-tuning the BiomedCLIP model with a new Decoupled Hard Negative Noise Contrastive Estimation (DHN-NCE) loss, and leveraging the Multi-modal Information Bottleneck (M2IB) to create visual prompts for generating segmentation masks with SAM in the zero-shot setting. We also investigate using zero-shot segmentation labels in a weakly supervised paradigm to enhance segmentation quality further. Extensive validation across four diverse segmentation tasks and medical imaging modalities (breast tumor ultrasound, brain tumor MRI, lung X-ray, and lung CT) demonstrates the high accuracy of our proposed framework. Our code is available at https://github.com/HealthX-Lab/MedCLIP-SAMv2.
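
To illustrate the kind of hard-negative-aware contrastive objective the fine-tuning stage relies on, here is a simplified image-text loss that up-weights the most similar (hardest) negatives inside the softmax denominator. This is a generic sketch of the idea, not the paper's DHN-NCE formulation, and the weighting scheme is an assumption chosen for clarity.

```python
# Simplified illustration of a contrastive image-text objective with extra
# weight on hard negatives; not the exact DHN-NCE loss from the paper.
import torch
import torch.nn.functional as F

def hard_negative_contrastive_loss(img_emb, txt_emb, temperature=0.07, beta=1.0):
    img_emb = F.normalize(img_emb, dim=1)
    txt_emb = F.normalize(txt_emb, dim=1)
    logits = img_emb @ txt_emb.t() / temperature          # pairwise similarities
    n = logits.size(0)
    targets = torch.arange(n, device=logits.device)

    # Up-weight off-diagonal (negative) pairs by how similar they are, so that
    # harder negatives contribute more to the softmax denominator.
    with torch.no_grad():
        neg_mask = ~torch.eye(n, dtype=torch.bool, device=logits.device)
        weights = torch.ones_like(logits)
        weights[neg_mask] = 1.0 + beta * torch.softmax(
            logits[neg_mask].view(n, n - 1), dim=1).view(-1)

    weighted_logits = logits + weights.log()              # reweighting inside log-sum-exp
    loss_i2t = F.cross_entropy(weighted_logits, targets)
    loss_t2i = F.cross_entropy(weighted_logits.t(), targets)
    return 0.5 * (loss_i2t + loss_t2i)

if __name__ == "__main__":
    img, txt = torch.randn(16, 512), torch.randn(16, 512)
    print(float(hard_negative_contrastive_loss(img, txt)))
```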

RegionMed-CLIP: A Region-Aware Multimodal Contrastive Learning Pre-trained Model for Medical Image Understanding

Tianchen Fang, Guiru Liu

arXiv preprint · Aug 7, 2025
Medical image understanding plays a crucial role in enabling automated diagnosis and data-driven clinical decision support. However, its progress is impeded by two primary challenges: the limited availability of high-quality annotated medical data and an overreliance on global image features, which often miss subtle but clinically significant pathological regions. To address these issues, we introduce RegionMed-CLIP, a region-aware multimodal contrastive learning framework that explicitly incorporates localized pathological signals along with holistic semantic representations. The core of our method is an innovative region-of-interest (ROI) processor that adaptively integrates fine-grained regional features with the global context, supported by a progressive training strategy that enhances hierarchical multimodal alignment. To enable large-scale region-level representation learning, we construct MedRegion-500k, a comprehensive medical image-text corpus that features extensive regional annotations and multilevel clinical descriptions. Extensive experiments on image-text retrieval, zero-shot classification, and visual question answering tasks demonstrate that RegionMed-CLIP consistently exceeds state-of-the-art vision language models by a wide margin. Our results highlight the critical importance of region-aware contrastive pre-training and position RegionMed-CLIP as a robust foundation for advancing multimodal medical image understanding.
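
A minimal version of the "regional plus global features" idea can be expressed with standard ROI pooling, as below: region boxes are pooled from the feature map and fused with a global embedding before image-text alignment. The ROI processor and progressive training strategy in RegionMed-CLIP are more elaborate; this toy encoder only shows the shape of the computation.

```python
# Sketch of combining region-level and global image features before
# image-text alignment; not the RegionMed-CLIP ROI processor itself.
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class RegionAwareEncoder(nn.Module):
    def __init__(self, feat_ch=64, embed_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(               # toy CNN feature extractor
            nn.Conv2d(3, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.global_head = nn.Linear(feat_ch, embed_dim)
        self.region_head = nn.Linear(feat_ch, embed_dim)
        self.fuse = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, images, boxes):
        # images: (B, 3, H, W); boxes: list of (K_i, 4) tensors in image coordinates
        fmap = self.backbone(images)                              # (B, C, H/4, W/4)
        g = self.global_head(fmap.mean(dim=(2, 3)))               # global embedding
        rois = roi_align(fmap, boxes, output_size=(4, 4),
                         spatial_scale=0.25, aligned=True)        # (sum K_i, C, 4, 4)
        # Average each image's ROI features into one regional vector per image
        counts = [b.size(0) for b in boxes]
        r = torch.stack([chunk.mean(dim=(0, 2, 3))
                         for chunk in rois.split(counts)])        # (B, C)
        r = self.region_head(r)
        return self.fuse(torch.cat([g, r], dim=1))                # fused embedding

if __name__ == "__main__":
    enc = RegionAwareEncoder()
    imgs = torch.randn(2, 3, 224, 224)
    boxes = [torch.tensor([[30., 40., 120., 160.]]),
             torch.tensor([[10., 10., 60., 90.], [100., 100., 200., 200.]])]
    print(enc(imgs, boxes).shape)  # torch.Size([2, 256])
```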

Coarse-to-Fine Joint Registration of MR and Ultrasound Images via Imaging Style Transfer

Junyi Wang, Xi Zhu, Yikun Guo, Zixi Wang, Haichuan Gao, Le Zhang, Fan Zhang

arXiv preprint · Aug 7, 2025
We developed a pipeline for registering pre-surgery Magnetic Resonance (MR) images and post-resection Ultrasound (US) images. Our approach leverages unpaired style transfer using 3D CycleGAN to generate synthetic T1 images, thereby enhancing registration performance. Additionally, our registration process employs both affine and local deformable transformations for a coarse-to-fine registration. The results demonstrate that our approach improves the consistency between MR and US image pairs in most cases.
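
As a rough sketch of the coarse-to-fine registration stage (affine alignment followed by B-spline deformable refinement), the SimpleITK snippet below uses a mutual-information metric on synthetic toy volumes. It omits the 3D CycleGAN style-transfer step described in the abstract, and the metric, optimizer, and mesh-size choices are illustrative assumptions rather than the authors' settings.

```python
# Sketch of a coarse-to-fine (affine, then B-spline deformable) registration
# with SimpleITK; the paper's CycleGAN-based style transfer is omitted here.
import numpy as np
import SimpleITK as sitk

def register(fixed, moving, transform):
    reg = sitk.ImageRegistrationMethod()
    reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=32)
    reg.SetMetricSamplingStrategy(reg.RANDOM)
    reg.SetMetricSamplingPercentage(0.2)
    reg.SetInterpolator(sitk.sitkLinear)
    reg.SetOptimizerAsGradientDescent(learningRate=1.0, numberOfIterations=50)
    reg.SetOptimizerScalesFromPhysicalShift()
    reg.SetInitialTransform(transform, inPlace=False)
    return reg.Execute(fixed, moving)

def coarse_to_fine(fixed, moving):
    # Coarse stage: global affine alignment
    init = sitk.CenteredTransformInitializer(
        fixed, moving, sitk.AffineTransform(3),
        sitk.CenteredTransformInitializerFilter.GEOMETRY)
    affine = register(fixed, moving, init)
    moved = sitk.Resample(moving, fixed, affine, sitk.sitkLinear, 0.0)
    # Fine stage: local B-spline deformable refinement of the affinely aligned image
    bspline_init = sitk.BSplineTransformInitializer(fixed, transformDomainMeshSize=[6, 6, 6])
    bspline = register(fixed, moved, bspline_init)
    return sitk.Resample(moved, fixed, bspline, sitk.sitkLinear, 0.0)

if __name__ == "__main__":
    # Synthetic toy volumes: a Gaussian blob and a slightly shifted copy
    grid = np.mgrid[0:48, 0:48, 0:48].astype(np.float32)
    blob = np.exp(-((grid - 24.0) ** 2).sum(0) / 80.0)
    fixed = sitk.GetImageFromArray(blob)
    moving = sitk.GetImageFromArray(np.roll(blob, shift=3, axis=2))
    print(coarse_to_fine(fixed, moving).GetSize())
```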

MLAgg-UNet: Advancing Medical Image Segmentation with Efficient Transformer and Mamba-Inspired Multi-Scale Sequence.

Jiang J, Lei S, Li H, Sun Y

PubMed · Aug 7, 2025
Transformers and state space sequence models (SSMs) have attracted interest in biomedical image segmentation for their ability to capture long-range dependencies. However, traditional visual state space (VSS) methods suffer from the incompatibility of image tokens with the autoregressive assumption. Although Transformer attention does not require this assumption, its high computational cost limits effective channel-wise information utilization. To overcome these limitations, we propose the Mamba-Like Aggregated UNet (MLAgg-UNet), which introduces a Mamba-inspired mechanism to enrich Transformer channel representation and exploit the implicit autoregressive characteristic within a U-shaped architecture. To establish dependencies among image tokens at a single scale, the Mamba-Like Aggregated Attention (MLAgg) block is designed to balance representational ability and computational efficiency. Inspired by the human foveal vision system, the Mamba macro-structure, and differential attention, the MLAgg block can slide its focus over each image token, suppress irrelevant tokens, and simultaneously strengthen channel-wise information utilization. Moreover, leveraging causal relationships between consecutive low-level and high-level features in the U-shaped architecture, we propose the Multi-Scale Mamba Module with Implicit Causality (MSMM) to optimize complementary information across scales. Embedded within skip connections, this module enhances semantic consistency between encoder and decoder features. Extensive experiments on four benchmark datasets, including AbdomenMRI, ACDC, BTCV, and EndoVis17, which cover MRI, CT, and endoscopy modalities, demonstrate that the proposed MLAgg-UNet consistently outperforms state-of-the-art CNN-based, Transformer-based, and Mamba-based methods. Specifically, it achieves improvements of at least 1.24%, 0.20%, 0.33%, and 0.39% in DSC scores on these datasets, respectively. These results highlight the model's ability to effectively capture feature correlations and integrate complementary multi-scale information, providing a robust solution for medical image segmentation. The implementation is publicly available at https://github.com/aticejiang/MLAgg-UNet.
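
To give a flavor of token-wise gating with channel-wise emphasis, the toy block below gates channels per token and mixes neighboring tokens with a depthwise convolution. It is only a loose illustration of the "suppress irrelevant tokens, strengthen channel-wise information" idea, not the MLAgg block or MSMM module from the paper.

```python
# Toy gated channel-attention block meant only to illustrate the flavor of
# Mamba-inspired token/channel gating; it is not the paper's MLAgg block.
import torch
import torch.nn as nn

class GatedTokenChannelBlock(nn.Module):
    """Per-token channel gating plus lightweight local token mixing."""
    def __init__(self, dim, hidden_mult=2):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.gate = nn.Sequential(nn.Linear(dim, hidden_mult * dim), nn.SiLU(),
                                  nn.Linear(hidden_mult * dim, dim), nn.Sigmoid())
        self.value = nn.Linear(dim, dim)
        self.mix = nn.Conv1d(dim, dim, kernel_size=3, padding=1, groups=dim)  # depthwise token mixing

    def forward(self, x):               # x: (B, N tokens, C)
        h = self.norm(x)
        gated = self.gate(h) * self.value(h)                      # suppress irrelevant channels per token
        mixed = self.mix(gated.transpose(1, 2)).transpose(1, 2)   # local token interaction
        return x + mixed                                          # residual connection

if __name__ == "__main__":
    block = GatedTokenChannelBlock(dim=96)
    tokens = torch.randn(2, 196, 96)    # e.g., a 14x14 feature map flattened to tokens
    print(block(tokens).shape)          # torch.Size([2, 196, 96])
```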