
CLT-MambaSeg: An integrated model of Convolution, Linear Transformer and Multiscale Mamba for medical image segmentation.

Uppal D, Prakash S

PubMed · Jul 26, 2025
Recent advances in deep learning have significantly enhanced the performance of medical image segmentation. However, maintaining a balanced integration of feature localization, global context modeling, and computational efficiency remains a critical research challenge. Convolutional Neural Networks (CNNs) effectively capture fine-grained local features through hierarchical convolutions; however, they often struggle to model long-range dependencies due to their limited receptive field. Transformers address this limitation by leveraging self-attention mechanisms to capture global context, but they are computationally intensive and require large-scale data for effective training. The Mamba architecture has emerged as a promising approach, effectively capturing long-range dependencies while maintaining low computational overhead and high segmentation accuracy. Based on this, we propose a method named CLT-MambaSeg that integrates Convolution, Linear Transformer, and Multiscale Mamba architectures to capture local features, model global context, and improve computational efficiency for medical image segmentation. It utilizes a convolution-based Spatial Representation Extraction (SREx) module to capture intricate spatial relationships and dependencies. Further, it comprises a Mamba Vision Linear Transformer (MVLTrans) module that captures multiscale context, spatial and sequential dependencies, and enhanced global context. In addition, to address the problem of limited data, we propose a novel Memory-Guided Augmentation Generative Adversarial Network (MeGA-GAN) that generates realistic synthetic images to further enhance the segmentation performance. We conduct extensive experiments and ablation studies on five benchmark datasets, namely CVC-ClinicDB, Breast UltraSound Images (BUSI), PH2, and two datasets from the International Skin Imaging Collaboration (ISIC), namely ISIC-2016 and ISIC-2017. Experimental results demonstrate the efficacy of the proposed CLT-MambaSeg compared to other state-of-the-art methods.
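
A minimal sketch of the kind of hybrid block the abstract describes, pairing a convolutional branch for local features with a linear-attention branch for global context. This is an illustrative assumption, not the CLT-MambaSeg implementation; module names and sizes are invented for the example.

```python
import torch
import torch.nn as nn

class HybridConvLinearAttnBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Local branch: depthwise then pointwise convolution.
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
            nn.Conv2d(channels, channels, 1),
            nn.BatchNorm2d(channels),
            nn.GELU(),
        )
        # Global branch: efficient/linear attention (softmax applied to Q and K separately).
        self.to_qkv = nn.Conv2d(channels, channels * 3, 1)
        self.proj = nn.Conv2d(channels, channels, 1)

    def linear_attention(self, x):
        b, c, h, w = x.shape
        q, k, v = self.to_qkv(x).chunk(3, dim=1)
        q = q.flatten(2).softmax(dim=1)            # (b, c, n), normalized over channels
        k = k.flatten(2).softmax(dim=-1)           # (b, c, n), normalized over positions
        v = v.flatten(2)
        context = k @ v.transpose(1, 2)            # (b, c, c): cost is linear in n
        out = (context.transpose(1, 2) @ q).reshape(b, c, h, w)
        return self.proj(out)

    def forward(self, x):
        return x + self.local(x) + self.linear_attention(x)

block = HybridConvLinearAttnBlock(32)
print(block(torch.randn(1, 32, 64, 64)).shape)  # torch.Size([1, 32, 64, 64])
```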

Deep learning-based image classification for integrating pathology and radiology in AI-assisted medical imaging.

Lu C, Zhang J, Liu R

PubMed · Jul 25, 2025
The integration of pathology and radiology in medical imaging has emerged as a critical need for advancing diagnostic accuracy and improving clinical workflows. Current AI-driven approaches for medical image analysis, despite significant progress, face several challenges, including handling multi-modal imaging, imbalanced datasets, and the lack of robust interpretability and uncertainty quantification. These limitations often hinder the deployment of AI systems in real-world clinical settings, where reliability and adaptability are essential. To address these issues, this study introduces a novel framework, the Domain-Informed Adaptive Network (DIANet), combined with an Adaptive Clinical Workflow Integration (ACWI) strategy. DIANet leverages multi-scale feature extraction, domain-specific priors, and Bayesian uncertainty modeling to enhance interpretability and robustness. The proposed model is tailored for multi-modal medical imaging tasks, integrating adaptive learning mechanisms to mitigate domain shift and dataset imbalance. Complementing the model, the ACWI strategy ensures seamless deployment through explainable AI (XAI) techniques, uncertainty-aware decision support, and modular workflow integration compatible with clinical systems like PACS. Experimental results demonstrate significant improvements in diagnostic accuracy, segmentation precision, and reconstruction fidelity across diverse imaging modalities, validating the potential of this framework to bridge the gap between AI innovation and clinical utility.
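
One common way to realize the Bayesian uncertainty modeling mentioned above is Monte Carlo dropout; the sketch below shows that generic technique, not DIANet's actual module. The toy classifier and sample count are assumptions for illustration.

```python
import torch
import torch.nn as nn

class SmallClassifier(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Dropout2d(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_predict(model, x, n_samples: int = 20):
    """Keep dropout active at inference and average the softmax outputs."""
    model.train()  # enables dropout; batch-norm layers would need separate handling
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=-1) for _ in range(n_samples)])
    mean = probs.mean(dim=0)
    # Predictive entropy as a simple per-case uncertainty score.
    entropy = -(mean * mean.clamp_min(1e-8).log()).sum(dim=-1)
    return mean, entropy

model = SmallClassifier()
mean_probs, uncertainty = mc_dropout_predict(model, torch.randn(4, 1, 64, 64))
print(mean_probs.shape, uncertainty.shape)  # torch.Size([4, 2]) torch.Size([4])
```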

Exploring AI-Based System Design for Pixel-Level Protected Health Information Detection in Medical Images.

Truong T, Baltruschat IM, Klemens M, Werner G, Lenga M

PubMed · Jul 25, 2025
De-identification of medical images is a critical step to ensure privacy during data sharing in research and clinical settings. The initial step in this process involves detecting Protected Health Information (PHI), which can be found in image metadata or imprinted within image pixels. Despite the importance of such systems, there has been limited evaluation of existing AI-based solutions, creating barriers to the development of reliable and robust tools. In this study, we present an AI-based pipeline for PHI detection, comprising three key modules: text detection, text extraction, and text analysis. We benchmark three models (YOLOv11, EasyOCR, and GPT-4o) across different setups corresponding to these modules, evaluating their performance on two different datasets encompassing multiple imaging modalities and PHI categories. Our findings indicate that the optimal setup involves utilizing dedicated vision and language models for each module, which achieves a commendable balance in performance, latency, and cost associated with the usage of large language models (LLMs). Additionally, we show that LLMs not only identify PHI content but also enhance OCR and enable an end-to-end PHI detection pipeline, with promising outcomes in our analysis.
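
A hedged sketch of the three-stage design described above (text detection, text extraction, text analysis). The stage implementations below are toy placeholders meant to show the interfaces; they are not the benchmarked models (YOLOv11, EasyOCR, GPT-4o) themselves.

```python
import re
from dataclasses import dataclass
import numpy as np

@dataclass
class TextRegion:
    bbox: tuple            # (x1, y1, x2, y2) in pixel coordinates
    text: str = ""
    is_phi: bool = False

def detect_text_regions(image):
    """Stage 1: locate burned-in text; a detector such as YOLOv11 would go here."""
    h, w = image.shape[:2]
    return [TextRegion(bbox=(0, 0, w, h))]      # toy: treat the whole image as one region

def extract_text(image, regions):
    """Stage 2: OCR each region; an engine such as EasyOCR would go here."""
    for r in regions:
        r.text = "John Doe 1985-03-12"          # toy stand-in for OCR output
    return regions

def analyze_text(regions):
    """Stage 3: flag PHI; an LLM or rule-based classifier would go here."""
    phi_pattern = re.compile(r"\d{4}-\d{2}-\d{2}|[A-Z][a-z]+ [A-Z][a-z]+")
    for r in regions:
        r.is_phi = bool(phi_pattern.search(r.text))
    return regions

def detect_phi(image):
    return [r for r in analyze_text(extract_text(image, detect_text_regions(image))) if r.is_phi]

print(detect_phi(np.zeros((128, 128), dtype=np.uint8)))
```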

DeepJIVE: Learning Joint and Individual Variation Explained from Multimodal Data Using Deep Learning

Matthew Drexler, Benjamin Risk, James J Lah, Suprateek Kundu, Deqiang Qiu

arXiv preprint · Jul 25, 2025
Conventional multimodal data integration methods provide a comprehensive assessment of the shared or unique structure within each individual data type but suffer from several limitations such as the inability to handle high-dimensional data and identify nonlinear structures. In this paper, we introduce DeepJIVE, a deep-learning approach to performing Joint and Individual Variation Explained (JIVE). We perform mathematical derivation and experimental validations using both synthetic and real-world 1D, 2D, and 3D datasets. Different strategies for achieving the identity and orthogonality constraints for DeepJIVE were explored, resulting in three viable loss functions. We found that DeepJIVE can successfully uncover joint and individual variations of multimodal datasets. Our application of DeepJIVE to the Alzheimer's Disease Neuroimaging Initiative (ADNI) also identified biologically plausible covariation patterns between the amyloid positron emission tomography (PET) and magnetic resonance (MR) images. In conclusion, the proposed DeepJIVE can be a useful tool for multimodal data analysis.
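
An illustrative sketch of a JIVE-style objective with deep encoders and decoders: each modality is reconstructed from a shared (joint) code plus a modality-specific (individual) code, with an orthogonality penalty between the two. This is one plausible reading of the constraints described above, not the paper's exact loss; the toy linear maps stand in for the deep networks.

```python
import torch
import torch.nn.functional as F

def jive_style_loss(x1, x2, enc_joint, enc_ind1, enc_ind2, dec1, dec2, lam=1.0):
    j1, j2 = enc_joint(x1), enc_joint(x2)       # joint codes (shared encoder, toy setup)
    a1, a2 = enc_ind1(x1), enc_ind2(x2)         # individual (modality-specific) codes
    recon = F.mse_loss(dec1(j1 + a1), x1) + F.mse_loss(dec2(j2 + a2), x2)
    # Orthogonality surrogate: joint and individual codes should carry disjoint variation.
    ortho = (j1 * a1).sum(dim=1).pow(2).mean() + (j2 * a2).sum(dim=1).pow(2).mean()
    return recon + lam * ortho

# Toy usage with linear maps standing in for the deep encoders/decoders.
d, k = 32, 8
enc_j = torch.nn.Linear(d, k)
enc_a1, enc_a2 = torch.nn.Linear(d, k), torch.nn.Linear(d, k)
dec1, dec2 = torch.nn.Linear(k, d), torch.nn.Linear(k, d)
loss = jive_style_loss(torch.randn(16, d), torch.randn(16, d),
                       enc_j, enc_a1, enc_a2, dec1, dec2)
print(loss.item())
```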

Is Exchangeability better than I.I.D to handle Data Distribution Shifts while Pooling Data for Data-scarce Medical image segmentation?

Ayush Roy, Samin Enam, Jun Xia, Vishnu Suresh Lokhande, Won Hwa Kim

arXiv preprint · Jul 25, 2025
Data scarcity is a major challenge in medical imaging, particularly for deep learning models. While data pooling (combining datasets from multiple sources) and data addition (adding more data from a new dataset) have been shown to enhance model performance, they are not without complications. Specifically, increasing the size of the training dataset through pooling or addition can induce distributional shifts, negatively affecting downstream model performance, a phenomenon known as the "Data Addition Dilemma". While the traditional i.i.d. assumption may not hold in multi-source contexts, assuming exchangeability across datasets provides a more practical framework for data pooling. In this work, we investigate medical image segmentation under these conditions, drawing insights from causal frameworks to propose a method for controlling foreground-background feature discrepancies across all layers of deep networks. This approach improves feature representations, which are crucial in data-addition scenarios. Our method achieves state-of-the-art segmentation performance on histopathology and ultrasound images across five datasets, including a novel ultrasound dataset that we have curated and contributed. Qualitative results demonstrate more refined and accurate segmentation maps compared to prominent baselines across three model architectures. The code will be available on GitHub.
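
A hedged sketch of one way to "control foreground-background feature discrepancies across layers": at each layer, pool features inside and outside the (downsampled) ground-truth mask and penalize their similarity so the two groups stay separable. The details here are assumptions for illustration, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def fg_bg_discrepancy_loss(features, mask):
    """features: list of per-layer maps (B, C_l, H_l, W_l); mask: (B, 1, H, W) in {0, 1}."""
    loss = 0.0
    for feat in features:
        m = F.interpolate(mask, size=feat.shape[-2:], mode="nearest")
        fg = (feat * m).sum(dim=(2, 3)) / m.sum(dim=(2, 3)).clamp_min(1.0)
        bg = (feat * (1 - m)).sum(dim=(2, 3)) / (1 - m).sum(dim=(2, 3)).clamp_min(1.0)
        # Encourage low cosine similarity between foreground and background prototypes.
        loss = loss + F.cosine_similarity(fg, bg, dim=1).mean()
    return loss / len(features)

feats = [torch.randn(2, 16, 64, 64), torch.randn(2, 32, 32, 32)]
mask = (torch.rand(2, 1, 128, 128) > 0.5).float()
print(fg_bg_discrepancy_loss(feats, mask).item())
```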

3D-WDA-PMorph: Efficient 3D MRI/TRUS Prostate Registration using Transformer-CNN Network and Wavelet-3D-Depthwise-Attention.

Mahmoudi H, Ramadan H, Riffi J, Tairi H

PubMed · Jul 25, 2025
Multimodal image registration is crucial in medical imaging, particularly for aligning Magnetic Resonance Imaging (MRI) and Transrectal Ultrasound (TRUS) data, which are widely used in prostate cancer diagnosis and treatment planning. However, this task presents significant challenges due to the inherent differences between these imaging modalities, including variations in resolution, contrast, and noise. Conventional Convolutional Neural Network (CNN)-based registration methods, while effective at extracting local features, often struggle to capture global contextual information and fail to adapt to complex deformations in multimodal data. Conversely, Transformer-based methods excel at capturing long-range dependencies and hierarchical features but face difficulties in integrating fine-grained local details, which are essential for accurate spatial alignment. To address these limitations, we propose a novel 3D image registration framework that combines the strengths of both paradigms. Our method employs a Swin Transformer (ST)-CNN encoder-decoder architecture, with a key innovation focused on enhancing the skip-connection stages. Specifically, we introduce a module named Wavelet-3D-Depthwise-Attention (WDA). The WDA module leverages an attention mechanism that integrates wavelet transforms for multi-scale spatial-frequency representation and 3D-Depthwise convolution to improve computational efficiency and modality fusion. Experimental evaluations on clinical MRI/TRUS datasets confirm that the proposed method achieves a median Dice score of 0.94 and a target registration error of 0.85, indicating an improvement in registration accuracy and robustness over existing state-of-the-art (SOTA) methods. The WDA-enhanced skip connections significantly empower the registration network to preserve critical anatomical details, making our method a promising advancement in prostate multimodal registration. Furthermore, the proposed framework shows strong potential for generalization to other image registration tasks.
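
A rough sketch, under stated assumptions rather than the released code, of a skip-connection attention module in the spirit of WDA: a single-level 3D Haar-like split provides a multi-scale spatial-frequency view, and a depthwise 3D convolution turns it into an attention map that reweights the skip features. The channel sizes and gating are invented for the example.

```python
import torch
import torch.nn as nn

class WaveletDepthwiseAttention3D(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Depthwise 3D conv on the 8 Haar subbands (stacked along channels).
        self.dw = nn.Conv3d(channels * 8, channels * 8, 3, padding=1, groups=channels * 8)
        self.pw = nn.Conv3d(channels * 8, channels, 1)

    @staticmethod
    def haar_split(x):
        """Single-level 3D Haar-like transform via even/odd sums and differences."""
        a, b = x[..., ::2, :, :], x[..., 1::2, :, :]
        subbands = []
        for d in (a + b, a - b):                      # depth axis
            ah, bh = d[..., ::2, :], d[..., 1::2, :]
            for h in (ah + bh, ah - bh):              # height axis
                aw, bw = h[..., ::2], h[..., 1::2]
                subbands += [aw + bw, aw - bw]        # width axis
        return torch.cat(subbands, dim=1)             # (B, 8C, D/2, H/2, W/2)

    def forward(self, skip):
        att = self.pw(self.dw(self.haar_split(skip)))
        att = nn.functional.interpolate(att, size=skip.shape[2:], mode="trilinear")
        return skip * torch.sigmoid(att)

x = torch.randn(1, 8, 16, 32, 32)
print(WaveletDepthwiseAttention3D(8)(x).shape)  # torch.Size([1, 8, 16, 32, 32])
```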

Privacy-Preserving Generation of Structured Lymphoma Progression Reports from Cross-sectional Imaging: A Comparative Analysis of Llama 3.3 and Llama 4.

Prucker P, Bressem KK, Kim SH, Weller D, Kader A, Dorfner FJ, Ziegelmayer S, Graf MM, Lemke T, Gassert F, Can E, Meddeb A, Truhn D, Hadamitzky M, Makowski MR, Adams LC, Busch F

PubMed · Jul 25, 2025
Efficient processing of radiology reports for monitoring disease progression is crucial in oncology. Although large language models (LLMs) show promise in extracting structured information from medical reports, privacy concerns limit their clinical implementation. This study evaluates the feasibility and accuracy of two of the most recent Llama models for generating structured lymphoma progression reports from cross-sectional imaging data in a privacy-preserving, real-world clinical setting. This single-center, retrospective study included adult lymphoma patients who underwent cross-sectional imaging and treatment between July 2023 and July 2024. We established a chain-of-thought prompting strategy to leverage the locally deployed Llama-3.3-70B-Instruct and Llama-4-Scout-17B-16E-Instruct models to generate lymphoma disease progression reports across three iterations. Two radiologists independently scored nodal and extranodal involvement, as well as Lugano staging and treatment response classifications. For each LLM and task, we calculated the F1 score, accuracy, recall, precision, and specificity per label, as well as the case-weighted average with 95% confidence intervals (CIs). Both LLMs correctly implemented the template structure for all 65 patients included in this study. Llama-4-Scout-17B-16E-Instruct demonstrated significantly greater accuracy in extracting nodal and extranodal involvement information (nodal: 0.99 [95% CI = 0.98-0.99] vs. 0.97 [95% CI = 0.95-0.96], p < 0.001; extranodal: 0.99 [95% CI = 0.99-1.00] vs. 0.99 [95% CI = 0.98-0.99], p = 0.013). This difference was more pronounced when predicting Lugano stage and treatment response (stage: 0.85 [95% CI = 0.79-0.89] vs. 0.60 [95% CI = 0.53-0.67], p < 0.001; treatment response: 0.88 [95% CI = 0.83-0.92] vs. 0.65 [95% CI = 0.58-0.71], p < 0.001). Neither model produced hallucinations of newly involved nodal or extranodal sites. The highest relative error rates were found when interpreting the level of disease after treatment. In conclusion, privacy-preserving LLMs can effectively extract clinical information from lymphoma imaging reports. While they excel at data extraction, they are limited in their ability to generate new clinical inferences from the extracted information. Our findings suggest their potential utility in streamlining documentation and highlight areas requiring optimization before clinical implementation.
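
A small sketch of the evaluation described above: per-label precision, recall, F1, accuracy, and specificity, plus a case-weighted (support-weighted) average. The toy labels are illustrative, not study data, and the exact weighting scheme is an assumption.

```python
import numpy as np

def per_label_metrics(y_true, y_pred, label):
    t, p = (y_true == label), (y_pred == label)
    tp, fp = np.sum(t & p), np.sum(~t & p)
    fn, tn = np.sum(t & ~p), np.sum(~t & ~p)
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / max(precision + recall, 1e-9),
        "accuracy": (tp + tn) / len(y_true),
        "specificity": tn / max(tn + fp, 1),
        "support": int(np.sum(t)),
    }

def case_weighted_average(metrics_per_label, key):
    weights = np.array([m["support"] for m in metrics_per_label], dtype=float)
    values = np.array([m[key] for m in metrics_per_label])
    return float(np.average(values, weights=weights))

y_true = np.array(["CR", "PR", "PD", "PR", "CR"])   # toy treatment-response labels
y_pred = np.array(["CR", "PR", "PR", "PR", "CR"])
per_label = [per_label_metrics(y_true, y_pred, lab) for lab in np.unique(y_true)]
print(case_weighted_average(per_label, "f1"))
```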

MedIQA: A Scalable Foundation Model for Prompt-Driven Medical Image Quality Assessment

Siyi Xun, Yue Sun, Jingkun Chen, Zitong Yu, Tong Tong, Xiaohong Liu, Mingxiang Wu, Tao Tan

arXiv preprint · Jul 25, 2025
Rapid advances in medical imaging technology underscore the critical need for precise and automated image quality assessment (IQA) to ensure diagnostic accuracy. Existing medical IQA methods, however, struggle to generalize across diverse modalities and clinical scenarios. In response, we introduce MedIQA, the first comprehensive foundation model for medical IQA, designed to handle variability in image dimensions, modalities, anatomical regions, and types. To support this, we developed a large-scale multi-modality dataset with extensive manually annotated quality scores. Our model integrates a salient slice assessment module that focuses feature retrieval on diagnostically relevant regions and employs an automatic prompt strategy that aligns upstream physical parameter pre-training with downstream expert annotation fine-tuning. Extensive experiments demonstrate that MedIQA significantly outperforms baselines in multiple downstream tasks, establishing a scalable framework for medical IQA and advancing diagnostic workflows and clinical decision-making.
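
An illustrative sketch, not the MedIQA implementation, of a salient-slice assessment step for volumetric inputs: score each slice with a small network and aggregate quality only over the top-k most salient slices. The backbone, heads, and k are assumptions.

```python
import torch
import torch.nn as nn

class SalientSliceIQA(nn.Module):
    def __init__(self, k: int = 8):
        super().__init__()
        self.k = k
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.saliency_head = nn.Linear(16, 1)   # how informative is this slice?
        self.quality_head = nn.Linear(16, 1)    # per-slice quality score

    def forward(self, volume):                  # volume: (B, D, H, W)
        b, d, h, w = volume.shape
        feats = self.backbone(volume.reshape(b * d, 1, h, w)).reshape(b, d, -1)
        saliency = self.saliency_head(feats).squeeze(-1)   # (B, D)
        quality = self.quality_head(feats).squeeze(-1)     # (B, D)
        topk = saliency.topk(min(self.k, d), dim=1).indices
        return quality.gather(1, topk).mean(dim=1)         # (B,) volume-level score

print(SalientSliceIQA()(torch.randn(2, 24, 64, 64)).shape)  # torch.Size([2])
```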

DGEAHorNet: high-order spatial interaction network with dual cross global efficient attention for medical image segmentation.

Peng H, An X, Chen X, Chen Z

PubMed · Jul 24, 2025
Medical image segmentation is a complex and challenging task that aims to accurately segment various structures or abnormal regions in medical images. However, obtaining accurate segmentation results is difficult because of the great uncertainty in the shape, location, and scale of the target region. To address these challenges, we propose a high-order spatial interaction framework with dual cross global efficient attention (DGEAHorNet), which employs a neural network architecture based on recursive gated convolution to adequately extract multi-scale contextual information from images. Specifically, a Dual Cross-Attention (DCA) module is added to the skip connections to effectively blend multi-stage encoder features and narrow the semantic gap. In the bottleneck stage, a Global Channel Spatial Attention Module (GCSAM) is used to extract global image information. To obtain better feature representations, we feed the output of the GCSAM into a multi-branch dense layer (SENetV2) for excitation. Furthermore, we adopt Depthwise Over-parameterized Convolutional layers (DO-Conv) to replace the standard convolutional layers in the input and output parts of our network, and add Efficient Attention (EA) to reduce computational complexity and enhance our model's performance. To evaluate the effectiveness of DGEAHorNet, we conduct comprehensive experiments on four publicly available datasets, achieving Dice similarity coefficients of 0.9320, 0.9337, 0.9312, and 0.7799 on ISIC2018, ISIC2017, CVC-ClinicDB, and HRF, respectively. Our results show that DGEAHorNet outperforms advanced methods. The code is publicly available at https://github.com/penghaixin/mymodel.
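
A hedged sketch of a global channel-spatial attention block of the general kind the bottleneck GCSAM describes; the exact design here (a squeeze-excite channel branch followed by a spatial gate) is an assumption for illustration, not the paper's module.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention from globally pooled descriptors.
        w_c = self.channel_mlp(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        x = x * w_c
        # Spatial attention from channel-wise mean and max maps.
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * self.spatial_conv(pooled)

print(ChannelSpatialAttention(64)(torch.randn(2, 64, 32, 32)).shape)  # (2, 64, 32, 32)
```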

Artificial intelligence in radiology: 173 commercially available products and their scientific evidence.

Antonissen N, Tryfonos O, Houben IB, Jacobs C, de Rooij M, van Leeuwen KG

PubMed · Jul 24, 2025
This study assesses changes in peer-reviewed evidence on commercially available radiological artificial intelligence (AI) products from 2020 to 2023, as a follow-up to a 2020 review of 100 products. A literature review was conducted, covering January 2015 to March 2023, focusing on CE-certified radiological AI products listed on www.healthairegister.com. Papers were categorised using the hierarchical model of efficacy: technical/diagnostic accuracy (levels 1-2), clinical decision-making and patient outcomes (levels 3-5), or socio-economic impact (level 6). Study features such as design, vendor independence, and multicentre/multinational data usage were also examined. By 2023, 173 CE-certified AI products from 90 vendors were identified, compared to 100 products in 2020. Products with peer-reviewed evidence increased from 36% to 66%, supported by 639 papers (up from 237). Diagnostic accuracy studies (level 2) remained predominant, though their share decreased from 65% to 57%. Studies addressing higher-efficacy levels (3-6) remained roughly constant (22% vs. 24%), with the number of products supported by such evidence increasing from 18% to 31%. Multicentre studies rose from 30% to 41% (p < 0.01). However, vendor-independent studies decreased (49% to 45%), as did multinational studies (15% to 11%) and prospective designs (19% to 16%), all with p > 0.05. The increase in peer-reviewed evidence and higher levels of evidence per product indicate maturation in the radiological AI market. However, the continued focus on lower-efficacy studies and reductions in vendor independence, multinational data, and prospective designs highlight persistent challenges in establishing unbiased, real-world evidence. Question: Evaluating advancements in peer-reviewed evidence for CE-certified radiological AI products is crucial to understand their clinical adoption and impact. Findings: CE-certified AI products with peer-reviewed evidence increased from 36% in 2020 to 66% in 2023, but the proportion of higher-level evidence papers (~24%) remained unchanged. Clinical relevance: The study highlights increased validation of radiological AI products but underscores a continued lack of evidence on their clinical and socio-economic impact, which may limit these tools' safe and effective implementation into clinical workflows.
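
A quick sanity-check sketch of the kind of two-proportion comparison behind the reported multicentre-study increase (30% of the 237 papers in 2020 vs. 41% of the 639 papers in 2023). The counts are reconstructed from the percentages and paper totals above and are approximate; the original analysis may have used a different test.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(p1, n1, p2, n2):
    x1, x2 = round(p1 * n1), round(p2 * n2)
    pool = (x1 + x2) / (n1 + n2)
    z = (p2 - p1) / sqrt(pool * (1 - pool) * (1 / n1 + 1 / n2))
    return z, 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value

z, p = two_proportion_z(0.30, 237, 0.41, 639)
print(f"z = {z:.2f}, p = {p:.4f}")   # with these approximate counts, p < 0.01
```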