NERO: Explainable Out-of-Distribution Detection with Neuron-level Relevance

Anju Chhetri, Jari Korhonen, Prashnna Gyawali, Binod Bhattarai

arXiv preprint · Jun 18, 2025
Ensuring reliability is paramount in deep learning, particularly in medical imaging, where diagnostic decisions often hinge on model outputs. The capacity to separate out-of-distribution (OOD) samples has proven to be a valuable indicator of a model's reliability. In medical imaging this is especially critical, as identifying OOD inputs can help flag potential anomalies that might otherwise go undetected. While many OOD detection methods rely on feature- or logit-space representations, recent works suggest these approaches may not fully capture OOD diversity. To address this, we propose a novel OOD scoring mechanism, called NERO, that leverages neuron-level relevance at the feature layer. Specifically, we cluster neuron-level relevance for each in-distribution (ID) class to form representative centroids and introduce a relevance distance metric to quantify a new sample's deviation from these centroids, enhancing OOD separability. Additionally, we refine performance by incorporating scaled relevance in the bias term and combining it with feature norms. Our framework also enables explainable OOD detection. We validate its effectiveness across multiple deep learning architectures on the gastrointestinal imaging benchmarks Kvasir and GastroVision, achieving improvements over state-of-the-art OOD detection methods.
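To make the scoring idea concrete, here is a minimal sketch of centroid-based relevance-distance OOD scoring, assuming per-sample neuron-level relevance vectors (e.g., from layer-wise relevance propagation) are already computed. The function names and the KMeans clustering choice are illustrative, and the paper's bias-term scaling and feature-norm refinements are omitted.

```python
# Minimal sketch, not the authors' implementation: cluster ID relevance
# vectors per class, then score new samples by distance to the nearest centroid.
import numpy as np
from sklearn.cluster import KMeans

def fit_relevance_centroids(relevances, labels, clusters_per_class=3):
    """Cluster neuron-level relevance vectors of each in-distribution class."""
    centroids = []
    for c in np.unique(labels):
        km = KMeans(n_clusters=clusters_per_class, n_init=10, random_state=0)
        km.fit(relevances[labels == c])
        centroids.append(km.cluster_centers_)
    return np.vstack(centroids)  # (num_classes * clusters_per_class, num_neurons)

def relevance_distance_score(relevance, centroids):
    """Larger distance to the nearest ID centroid suggests an OOD sample."""
    return np.linalg.norm(centroids - relevance[None, :], axis=1).min()
```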

Deep learning based colorectal cancer detection in medical images: A comprehensive analysis of datasets, methods, and future directions.

Gülmez B

PubMed · Jun 17, 2025
This comprehensive review examines the current state and evolution of artificial intelligence applications in colorectal cancer detection through medical imaging from 2019 to 2025. The study presents a quantitative analysis of 110 high-quality publications and 9 publicly accessible medical image datasets used for training and validation. Various convolutional neural network architectures, including ResNet (40 implementations), VGG (18 implementations), and emerging transformer-based models (12 implementations), are systematically categorized and evaluated for classification, object detection, and segmentation tasks. The investigation encompasses hyperparameter optimization techniques utilized to enhance model performance, with particular focus on genetic algorithms and particle swarm optimization. The role of explainable AI methods in medical diagnosis interpretation is analyzed through visualization techniques such as Grad-CAM and SHAP. Technical limitations, including dataset scarcity, computational constraints, and standardization challenges, are identified through trend analysis. Research gaps in current methodologies are highlighted through comparative assessment of performance metrics across different architectural implementations. Potential future research directions, including multimodal learning and federated learning approaches, are proposed based on publication trend analysis. This review serves as a comprehensive reference for researchers in medical image analysis and clinical practitioners implementing AI-based colorectal cancer detection systems.

Frequency-Calibrated Membership Inference Attacks on Medical Image Diffusion Models

Xinkai Zhao, Yuta Tokuoka, Junichiro Iwasawa, Keita Oda

arXiv preprint · Jun 17, 2025
The increasing use of diffusion models for image generation, especially in sensitive areas like medical imaging, has raised significant privacy concerns. Membership Inference Attack (MIA) has emerged as a potential approach to determine if a specific image was used to train a diffusion model, thus quantifying privacy risks. Existing MIA methods often rely on diffusion reconstruction errors, where member images are expected to have lower reconstruction errors than non-member images. However, applying these methods directly to medical images faces challenges. Reconstruction error is influenced by inherent image difficulty, and diffusion models struggle with high-frequency detail reconstruction. To address these issues, we propose a Frequency-Calibrated Reconstruction Error (FCRE) method for MIAs on medical image diffusion models. By focusing on reconstruction errors within a specific mid-frequency range and excluding both high-frequency (difficult to reconstruct) and low-frequency (less informative) regions, our frequency-selective approach mitigates the confounding factor of inherent image difficulty. Specifically, we analyze the reverse diffusion process, obtain the mid-frequency reconstruction error, and compute the structural similarity index score between the reconstructed and original images. Membership is determined by comparing this score to a threshold. Experiments on several medical image datasets demonstrate that our FCRE method outperforms existing MIA methods.
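As a hedged sketch of the frequency-selective idea: band-pass both the original and reconstructed image with an annular FFT mask that discards low and high frequencies, then compare the filtered images with SSIM and threshold the score. The cutoff fractions and threshold below are illustrative placeholders, not values from the paper.

```python
# Illustrative sketch of frequency-calibrated membership scoring; `original`
# and `reconstructed` are 2D float arrays, and all constants are assumptions.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def midband_filter(img, low_frac=0.1, high_frac=0.5):
    """Keep only mid-frequency content via an annular mask in FFT space."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    mask = (r >= low_frac) & (r <= high_frac)
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))

def is_member(original, reconstructed, threshold=0.8):
    """Predict membership when the mid-band SSIM score exceeds a threshold."""
    a, b = midband_filter(original), midband_filter(reconstructed)
    rng = max(a.max(), b.max()) - min(a.min(), b.min())
    return ssim(a, b, data_range=rng) > threshold
```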

MultiViT2: A Data-augmented Multimodal Neuroimaging Prediction Framework via Latent Diffusion Model

Bi Yuda, Jia Sihan, Gao Yutong, Abrol Anees, Fu Zening, Calhoun Vince

arXiv preprint · Jun 16, 2025
Multimodal medical imaging integrates diverse data types, such as structural and functional neuroimaging, to provide complementary insights that enhance deep learning predictions and improve outcomes. This study focuses on a neuroimaging prediction framework based on both structural and functional neuroimaging data. We propose a next-generation prediction model, MultiViT2, which combines a pretrained representation-learning base model with a vision transformer backbone for prediction output. Additionally, we developed a data augmentation module based on a latent diffusion model that enriches the input data by generating augmented neuroimaging samples, thereby enhancing predictive performance through reduced overfitting and improved generalizability. We show that MultiViT2 significantly outperforms the first-generation model in schizophrenia classification accuracy and demonstrates strong scalability and portability.
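The abstract leaves the fusion details unspecified, so the following is only an illustrative two-stream sketch of how structural and functional token streams might be projected into a shared transformer encoder for classification; every module name and dimension here is an assumption, not the MultiViT2 architecture.

```python
# Hypothetical two-stream fusion sketch (PyTorch), not the authors' model.
import torch
import torch.nn as nn

class TwoStreamFusion(nn.Module):
    def __init__(self, dim=256, heads=8, layers=4, num_classes=2):
        super().__init__()
        self.struct_proj = nn.LazyLinear(dim)  # project structural-MRI tokens
        self.func_proj = nn.LazyLinear(dim)    # project functional-MRI tokens
        enc = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, layers)
        self.cls = nn.Linear(dim, num_classes)

    def forward(self, struct_tokens, func_tokens):
        # Concatenate the two token streams and classify from the pooled output.
        x = torch.cat([self.struct_proj(struct_tokens),
                       self.func_proj(func_tokens)], dim=1)
        return self.cls(self.encoder(x).mean(dim=1))
```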

Brain Imaging Foundation Models, Are We There Yet? A Systematic Review of Foundation Models for Brain Imaging and Biomedical Research

Salah Ghamizi, Georgia Kanli, Yu Deng, Magali Perquin, Olivier Keunen

arXiv preprint · Jun 16, 2025
Foundation models (FMs), large neural networks pretrained on extensive and diverse datasets, have revolutionized artificial intelligence and shown significant promise in medical imaging by enabling robust performance with limited labeled data. Although numerous surveys have reviewed the application of FMs in healthcare, brain imaging remains underrepresented, despite its critical role in the diagnosis and treatment of neurological diseases using modalities such as MRI, CT, and PET. Existing reviews either marginalize brain imaging or lack depth on the unique challenges and requirements of FMs in this domain, such as multimodal data integration, support for diverse clinical tasks, and handling of heterogeneous, fragmented datasets. To address this gap, we present the first comprehensive and curated review of FMs for brain imaging. We systematically analyze 161 brain imaging datasets and 86 FM architectures, providing information on key design choices, training paradigms, and optimizations driving recent advances. Our review highlights the leading models for various brain imaging tasks, summarizes their innovations, and critically examines current limitations and blind spots in the literature. We conclude by outlining future research directions to advance FM applications in brain imaging, with the aim of fostering progress in both clinical and research settings.

Imaging-Based AI for Predicting Lymphovascular Space Invasion in Cervical Cancer: Systematic Review and Meta-Analysis.

She L, Li Y, Wang H, Zhang J, Zhao Y, Cui J, Qiu L

PubMed · Jun 16, 2025
The role of artificial intelligence (AI) in enhancing the accuracy of lymphovascular space invasion (LVSI) detection in cervical cancer remains debated. This meta-analysis aimed to evaluate the diagnostic accuracy of imaging-based AI for predicting LVSI in cervical cancer. We conducted a comprehensive literature search across multiple databases, including PubMed, Embase, and Web of Science, identifying studies published up to November 9, 2024. Studies were included if they evaluated the diagnostic performance of imaging-based AI models in detecting LVSI in cervical cancer. We used a bivariate random-effects model to calculate pooled sensitivity and specificity with corresponding 95% confidence intervals. Study heterogeneity was assessed using the I² statistic. Of 403 studies identified, 16 studies (2514 patients) were included. For the internal validation set, the pooled sensitivity, specificity, and area under the curve (AUC) for detecting LVSI were 0.84 (95% CI 0.79-0.87), 0.78 (95% CI 0.75-0.81), and 0.87 (95% CI 0.84-0.90), respectively. For the external validation set, the pooled sensitivity, specificity, and AUC were 0.79 (95% CI 0.70-0.86), 0.76 (95% CI 0.67-0.83), and 0.84 (95% CI 0.81-0.87), respectively. In subgroup analysis using the likelihood ratio test, deep learning demonstrated significantly higher sensitivity than machine learning (P=.01), and AI models based on positron emission tomography/computed tomography exhibited superior sensitivity relative to those based on magnetic resonance imaging (P=.01). Imaging-based AI, particularly deep learning algorithms, demonstrates promising diagnostic performance in predicting LVSI in cervical cancer. However, the limited external validation datasets and the retrospective nature of the research may introduce potential biases. These findings underscore AI's potential as an auxiliary diagnostic tool, necessitating further large-scale prospective validation.
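For readers unfamiliar with the pooling mechanics, here is a simplified sketch of univariate random-effects pooling of sensitivity on the logit scale (DerSimonian-Laird), including the I² heterogeneity statistic. The review itself used a bivariate random-effects model that pools sensitivity and specificity jointly, so this conveys only the general flavor of the computation.

```python
# Simplified univariate DerSimonian-Laird pooling of per-study sensitivity;
# tp and fn are arrays of true-positive and false-negative counts.
import numpy as np

def pool_sensitivity_dl(tp, fn):
    tp = np.asarray(tp, float) + 0.5  # continuity correction
    fn = np.asarray(fn, float) + 0.5
    y = np.log(tp / fn)               # logit of sensitivity tp / (tp + fn)
    v = 1 / tp + 1 / fn               # within-study variance of the logit
    w = 1 / v
    mean_fixed = np.sum(w * y) / w.sum()
    q = np.sum(w * (y - mean_fixed) ** 2)            # Cochran's Q
    df = len(y) - 1
    tau2 = max(0.0, (q - df) / (w.sum() - (w ** 2).sum() / w.sum()))
    w_star = 1 / (v + tau2)                          # random-effects weights
    pooled_logit = np.sum(w_star * y) / w_star.sum()
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return 1 / (1 + np.exp(-pooled_logit)), i2       # pooled sensitivity, I^2 %
```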

TCFNet: Bidirectional face-bone transformation via a Transformer-based coarse-to-fine point movement network.

Zhang R, Jie B, He Y, Wang J

PubMed · Jun 16, 2025
Computer-aided surgical simulation is a critical component of orthognathic surgical planning, where accurately simulating face-bone shape transformations is essential. Traditional biomechanical simulation methods are limited by long computation times, labor-intensive data processing, and low accuracy. Recently, deep learning-based simulation methods have been proposed that view this problem as a point-to-point transformation between skeletal and facial point clouds. However, these approaches cannot process large-scale point sets, have limited receptive fields that lead to noisy points, and employ complex registration-based preprocessing and postprocessing operations. These shortcomings limit the performance and widespread applicability of such methods. Therefore, we propose a Transformer-based coarse-to-fine point movement network (TCFNet) that learns unique, complicated correspondences at the patch and point levels for dense face-bone point cloud transformations. This end-to-end framework adopts a Transformer-based network in the first stage and a local information aggregation network (LIA-Net) in the second stage, which reinforce each other to generate precise point movement paths. LIA-Net effectively compensates for the neighborhood precision loss of the Transformer-based network by modeling local geometric structures (edges, orientations, and relative position features), while the global features from the first stage guide the local displacement through a gated recurrent unit. Inspired by deformable medical image registration, we propose an auxiliary loss that can utilize expert knowledge to reconstruct critical organs; our framework is unsupervised, and this loss is optional. Compared with existing state-of-the-art (SOTA) methods on gathered datasets, TCFNet achieves outstanding evaluation metrics and visualization results. The code is available at https://github.com/Runshi-Zhang/TCFNet.
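As a loose illustration of the GRU-guided displacement idea described above (not the published TCFNet code), the sketch below uses a GRU cell whose hidden state carries a global shape descriptor that gates per-point local feature updates before regressing movement vectors; all dimensions and names are assumptions.

```python
# Hypothetical sketch: global features guide local displacement via a GRU cell.
import torch
import torch.nn as nn

class GatedDisplacement(nn.Module):
    def __init__(self, local_dim=64, global_dim=64):
        super().__init__()
        self.cell = nn.GRUCell(local_dim, global_dim)
        self.head = nn.Linear(global_dim, 3)  # per-point xyz displacement

    def forward(self, local_feats, global_feat):
        # local_feats: (N, local_dim) per-point features from the local stage;
        # global_feat: (global_dim,) descriptor from the Transformer stage.
        h = global_feat.expand(local_feats.size(0), -1)
        h = self.cell(local_feats, h)  # global context gates each local update
        return self.head(h)            # (N, 3) point movement vectors
```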

FairICP: identifying biases and increasing transparency at the point of care in post-implementation clinical decision support using inductive conformal prediction.

Sun X, Nakashima M, Nguyen C, Chen PH, Tang WHW, Kwon D, Chen D

PubMed · Jun 15, 2025
Fairness concerns stemming from known and unknown biases in healthcare practices have raised questions about the trustworthiness of Artificial Intelligence (AI)-driven Clinical Decision Support Systems (CDSS). Studies have shown that such systems exhibit unforeseen performance disparities across subpopulations when applied to clinical settings that differ from those used in training. Existing unfairness mitigation strategies often struggle with scalability and accessibility, and their pursuit of group-level prediction performance parity does not effectively translate into fairness at the point of care. This study introduces FairICP, a flexible and cost-effective post-implementation framework based on Inductive Conformal Prediction (ICP), to provide users with actionable knowledge of model uncertainty due to subpopulation-level biases at the point of care. FairICP applies ICP to identify the model's scope of competence through group-specific calibration, ensuring equitable prediction reliability by retaining only predictions that fall within the trusted competence boundaries. We evaluated FairICP against four benchmarks on three medical imaging modalities: (1) cardiac magnetic resonance imaging (MRI), (2) chest X-ray, and (3) dermatology imaging, acquired from both private and large public datasets. Frameworks were assessed on prediction performance enhancement and unfairness mitigation capabilities. Compared to the baseline, FairICP improved prediction accuracy by 7.2% and reduced the accuracy gap between the privileged and unprivileged subpopulations by 2.2% on average across all three datasets. Our work provides a robust solution to promote trust and transparency in AI-CDSS, fostering equality and equity in healthcare for diverse patient populations. Such post-process methods are critical to enabling a robust framework for AI-CDSS implementation and monitoring in healthcare settings.
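A minimal sketch of group-wise inductive conformal filtering in the spirit described above, assuming a fitted classifier that outputs class probabilities: calibrate a nonconformity threshold per subpopulation, then retain only test predictions inside each group's competence boundary. The interface names and the choice of nonconformity score are illustrative, not FairICP's actual implementation.

```python
# Illustrative group-wise inductive conformal filtering sketch.
import numpy as np

def group_thresholds(cal_probs, cal_labels, cal_groups, alpha=0.1):
    """Per-group (1 - alpha) quantile of nonconformity on calibration data."""
    nonconf = 1 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    return {g: np.quantile(nonconf[cal_groups == g], 1 - alpha)
            for g in np.unique(cal_groups)}

def trusted_mask(test_probs, test_groups, thresholds):
    """True where a prediction falls inside its group's competence boundary."""
    nonconf = 1 - test_probs.max(axis=1)  # 1 minus predicted-class confidence
    return np.array([nonconf[i] <= thresholds[g]
                     for i, g in enumerate(test_groups)])
```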

A multimodal deep learning model for detecting endoscopic images of near-infrared fluorescence capsules.

Wang J, Zhou C, Wang W, Zhang H, Zhang A, Cui D

PubMed · Jun 15, 2025
Early screening for gastrointestinal (GI) diseases is critical for preventing cancer development. With the rapid advancement of deep learning technology, artificial intelligence (AI) has become increasingly prominent in the early detection of GI diseases. Capsule endoscopy is a non-invasive medical imaging technique used to examine the gastrointestinal tract. In our previous work, we developed a near-infrared fluorescence capsule endoscope (NIRF-CE) capable of exciting and capturing near-infrared (NIR) fluorescence images to identify subtle mucosal microlesions and submucosal abnormalities while simultaneously capturing conventional white-light images to detect lesions with significant morphological changes. However, limitations such as low camera resolution and poor lighting within the gastrointestinal tract may lead to misdiagnosis and other medical errors, and manually reviewing and interpreting large volumes of capsule endoscopy images is time-consuming and error-prone. Deep learning models have shown potential in automatically detecting abnormalities in NIRF-CE images. This study focuses on an improved deep learning model called Retinex-Attention-YOLO (RAY), which is based on single-modality image data and built on the YOLO series of object detection models. RAY enhances the accuracy and efficiency of anomaly detection, especially under low-light conditions. To further improve detection performance, we also propose a multimodal deep learning model, Multimodal-Retinex-Attention-YOLO (MRAY), which combines both white-light and fluorescence image data. The dataset used in this study consists of images of pig stomachs captured by our NIRF-CE system, simulating the human GI tract. A targeted fluorescent probe accumulates at lesion sites and releases fluorescent signals when abnormalities are present, so a bright spot in the fluorescence image indicates a lesion. The MRAY model achieved an impressive precision of 96.3%, outperforming similar object detection models. To further validate the model's performance, ablation experiments were conducted and comparisons were made with publicly available datasets. MRAY shows great promise for the automated detection of GI cancers, ulcers, inflammation, and other conditions in clinical practice.
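The abstract does not describe MRAY's fusion mechanism, so the snippet below is only one plausible early-fusion front end: concatenating co-registered white-light (RGB) and fluorescence (single-channel) frames channel-wise ahead of a detection backbone. Everything here is an assumption for illustration.

```python
# Hypothetical early-fusion stem for paired white-light and NIR frames.
import torch
import torch.nn as nn

class FusionStem(nn.Module):
    def __init__(self, out_channels=32):
        super().__init__()
        self.conv = nn.Conv2d(4, out_channels, kernel_size=3, padding=1)

    def forward(self, white_light, fluorescence):
        # white_light: (B, 3, H, W); fluorescence: (B, 1, H, W), co-registered.
        x = torch.cat([white_light, fluorescence], dim=1)  # (B, 4, H, W)
        return torch.relu(self.conv(x))  # fused features for a detector head
```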

A review: Lightweight architecture model in deep learning approach for lung disease identification.

Maharani DA, Utaminingrum F, Husnina DNN, Sukmaningrum B, Rahmania FN, Handani F, Chasanah HN, Arrahman A, Febrianto F

PubMed · Jun 14, 2025
Lung disease is one of the leading causes of death worldwide, so early detection is a very important step toward improving the effectiveness of treatment. Lung diseases can be classified from medical image data such as X-ray or CT scans. Deep learning methods have been widely used to recognize complex patterns in medical images, but this approach requires large, varied datasets and substantial computing resources. Lightweight deep learning architectures can overcome these constraints by offering a more efficient solution in terms of parameter count and computation time, making them applicable to portable devices with low processor specifications, such as mobile phones. This article presents a comprehensive review of 23 research studies published between 2020 and 2025, focusing on various lightweight architectures and optimization techniques aimed at improving the accuracy of lung disease detection. The results show that these models are able to significantly reduce parameter sizes, resulting in faster computation times while maintaining competitive accuracy compared to traditional deep learning architectures. From the research reviewed, SqueezeNet applied to public COVID-19 datasets is the best basic architecture, achieving high accuracy with only 570 thousand parameters. By contrast, UNet requires 31.07 million parameters and SegNet 29.45 million when trained on CT scan images from the Italian Society of Medical and Interventional Radiology and Radiopedia, making them far less efficient. Among combination methods, EfficientNetV2 with an Extreme Learning Machine (ELM) achieves the highest accuracy of 98.20% while significantly reducing parameters. The worst performance is shown by VGG and UNet, with accuracy dropping from 91.05% to 87% alongside an increase in the number of parameters. It can be concluded that lightweight architectures enable fast and efficient medical image classification for lung disease diagnosis on devices with limited specifications.
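Parameter counts like those quoted above are straightforward to reproduce for stock reference implementations; the sketch below counts parameters of torchvision's SqueezeNet and VGG16 (the review's figures refer to modified variants, so the exact numbers will differ).

```python
# Count parameters of stock torchvision models for a rough footprint comparison.
import torchvision.models as models

def count_params(model):
    return sum(p.numel() for p in model.parameters())

for name, ctor in [("squeezenet1_1", models.squeezenet1_1),
                   ("vgg16", models.vgg16)]:
    print(f"{name}: {count_params(ctor(weights=None)) / 1e6:.2f}M parameters")
```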