
MIND: A Noise-Adaptive Denoising Framework for Medical Images Integrating Multi-Scale Transformer

Tao Tang, Chengxu Yang

arXiv preprint · Aug 11 2025
The central role of medical images in disease diagnosis means their quality directly affects the accuracy of clinical judgment. However, due to factors such as low-dose scanning, equipment limitations, and imaging artifacts, medical images are often corrupted by non-uniform noise, which seriously hampers structure recognition and lesion detection. This paper proposes a medical image adaptive denoising model (MI-ND) that integrates multi-scale convolutional and Transformer architectures, introduces a noise level estimator (NLE) and a noise adaptive attention module (NAAB), and realizes noise-perception-driven channel-spatial attention regulation and cross-modal feature fusion. Systematic testing was carried out on multimodal public datasets. Experiments show that the method significantly outperforms comparative methods on image quality metrics such as PSNR, SSIM, and LPIPS, and improves the F1 score and ROC-AUC in downstream diagnostic tasks, showing strong practical value and potential for adoption. The model offers outstanding benefits in structural recovery, diagnostic sensitivity, and cross-modal robustness, and provides an effective solution for medical image enhancement and AI-assisted diagnosis and treatment.
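The paper's implementation is not shown on this page; as a rough Python sketch of the core idea, the snippet below pairs a noise-level estimator with channel attention gated by its scalar output. Module names, shapes, and the gating scheme are illustrative assumptions, not the authors' MI-ND code.

```python
import torch
import torch.nn as nn

class NoiseLevelEstimator(nn.Module):
    """Predicts a scalar noise level per image (illustrative stand-in for the NLE)."""
    def __init__(self, in_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 1), nn.Sigmoid(),  # noise level squashed to [0, 1]
        )
    def forward(self, x):
        return self.net(x)

class NoiseAdaptiveChannelAttention(nn.Module):
    """Channel attention whose gating is modulated by the estimated noise level."""
    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels + 1, channels), nn.ReLU(),
            nn.Linear(channels, channels), nn.Sigmoid(),
        )
    def forward(self, feat, noise_level):
        pooled = feat.mean(dim=(2, 3))                 # (B, C) global descriptor
        gate = self.fc(torch.cat([pooled, noise_level], dim=1))
        return feat * gate.unsqueeze(-1).unsqueeze(-1)

x = torch.randn(2, 1, 64, 64)       # toy batch of noisy images
feat = torch.randn(2, 32, 64, 64)   # features from some backbone
level = NoiseLevelEstimator()(x)
out = NoiseAdaptiveChannelAttention(32)(feat, level)
print(out.shape)  # torch.Size([2, 32, 64, 64])
```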

Improving discriminative ability in mammographic microcalcification classification using deep learning: a novel double transfer learning approach validated with an explainable artificial intelligence technique

Arlan, K., Bjornstrom, M., Makela, T., Meretoja, T. J., Hukkinen, K.

medRxiv preprint · Aug 11 2025
Background: Breast microcalcification diagnostics are challenging due to their subtle presentation, overlap with benign findings, and high inter-reader variability, often leading to unnecessary biopsies. While deep learning (DL) models, particularly deep convolutional neural networks (DCNNs), have shown potential to improve diagnostic accuracy, their clinical application remains limited by the need for large annotated datasets and the "black box" nature of their decision-making. Purpose: To develop and validate a deep learning model (DCNN) using a double transfer learning (d-TL) strategy for classifying suspected mammographic microcalcifications, with explainable AI (XAI) techniques to support model interpretability. Material and Methods: A retrospective dataset of 396 annotated regions of interest (ROIs) from full-field digital mammography (FFDM) images of 194 patients who underwent stereotactic vacuum-assisted biopsy at the Women's Hospital radiological department, Helsinki University Hospital, was collected. The dataset was randomly split into training and test sets (24% test set, balanced for benign and malignant cases). A ResNeXt-based DCNN was developed using a d-TL approach: first pretrained on ImageNet, then adapted using an intermediate mammography dataset before fine-tuning on the target microcalcification data. Saliency maps were generated using Gradient-weighted Class Activation Mapping (Grad-CAM) to evaluate the visual relevance of model predictions. Diagnostic performance was compared to a radiologist's BI-RADS-based assessment, using final histopathology as the reference standard. Results: The ensemble DCNN achieved an area under the ROC curve (AUC) of 0.76, with 65% sensitivity, 83% specificity, 79% positive predictive value (PPV), and 70% accuracy. The radiologist achieved an AUC of 0.65 with 100% sensitivity but lower specificity (30%) and PPV (59%). Grad-CAM visualizations showed consistent activation of the correct ROIs, even in misclassified cases where confidence scores fell below the threshold. Conclusion: The d-TL DCNN achieved performance comparable to the radiologist, with higher specificity and PPV than BI-RADS. The approach addresses data limitation issues and may help reduce additional imaging and unnecessary biopsies.
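For readers unfamiliar with double transfer learning, a minimal PyTorch sketch of the staged pipeline is given below; the ResNeXt variant, hyperparameters, and loader names (intermediate_mammo_loader, microcalcification_loader) are hypothetical, and only the three-stage structure follows the abstract.

```python
import torch
import torch.nn as nn
from torchvision.models import resnext50_32x4d, ResNeXt50_32X4D_Weights

def finetune(model, loader, epochs, lr):
    """One fine-tuning stage; optimizer and loss choices are illustrative."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            opt.zero_grad()
            loss_fn(model(images), labels).backward()
            opt.step()
    return model

# Transfer 1: start from ImageNet weights.
model = resnext50_32x4d(weights=ResNeXt50_32X4D_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)  # benign vs. malignant head

# Transfer 2: adapt on an intermediate mammography dataset (hypothetical loader).
# model = finetune(model, intermediate_mammo_loader, epochs=10, lr=1e-4)

# Final stage: fine-tune on the target microcalcification ROIs (hypothetical loader).
# model = finetune(model, microcalcification_loader, epochs=20, lr=1e-5)
```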

Adapting Biomedical Foundation Models for Predicting Outcomes of Anti Seizure Medications

Pham, D. K., Mehta, D., Jiang, Y., Thom, D., Chang, R. S.-k., Foster, E., Fazio, T., Holper, S., Verspoor, K., Liu, J., Nhu, D., Barnard, S., O'Brien, T., Chen, Z., French, J., Kwan, P., Ge, Z.

medRxiv preprint · Aug 11 2025
Epilepsy affects over 50 million people worldwide, with anti-seizure medications (ASMs) as the primary treatment for seizure control. However, ASM selection remains a "trial and error" process due to the lack of reliable predictors of effectiveness and tolerability. While machine learning approaches have been explored, existing models are limited to predicting outcomes only for ASMs encountered during training and have not leveraged recent biomedical foundation models for this task. This work investigates ASM outcome prediction using only patient MRI scans and reports. Specifically, we leverage biomedical vision-language foundation models and introduce a novel contextualized instruction-tuning framework that integrates expert-built knowledge trees of MRI entities to enhance their performance. Additionally, by training only on the four most commonly prescribed ASMs, our framework enables generalization to predicting outcomes and effectiveness for unseen ASMs not present during training. We evaluate our instruction-tuning framework on two retrospective epilepsy patient datasets, achieving average AUCs of 71.39 and 63.03 in predicting outcomes for four primary ASMs and three completely unseen ASMs, respectively. Our approach improves the AUC by 5.53 and 3.51 compared to standard report-based instruction tuning for seen and unseen ASMs, respectively. Our code, MRI knowledge tree, prompting templates, and TREE-TUNE-generated instruction-answer tuning dataset are available at the link.
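The TREE-TUNE templates themselves are released at the paper's link; the toy Python sketch below only illustrates what a contextualized instruction example might look like, with an invented knowledge-tree entry and field names.

```python
# Hypothetical knowledge tree: MRI entity -> attributes (illustrative content only).
knowledge_tree = {
    "hippocampal sclerosis": ["mesial temporal lobe", "T2/FLAIR hyperintensity",
                              "volume loss"],
}

def build_instruction(report: str, entities: list[str], asm: str) -> dict:
    """Assemble one contextualized instruction-tuning example."""
    context = "\n".join(
        f"- {e}: {', '.join(knowledge_tree.get(e, []))}" for e in entities
    )
    return {
        "instruction": (
            "Given the MRI findings and the entity context below, predict whether "
            f"the patient will be seizure-free on {asm}."
        ),
        "input": f"Entity context:\n{context}\n\nMRI report:\n{report}",
        "output": "",  # filled with the observed outcome during training
    }

example = build_instruction(
    report="Left hippocampal volume loss with increased T2 signal.",
    entities=["hippocampal sclerosis"],
    asm="levetiracetam",
)
print(example["input"])
```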

A Systematic Review of Multimodal Deep Learning and Machine Learning Fusion Techniques for Prostate Cancer Classification

Manzoor, F., Gupta, V., Pinky, L., Wang, Z., Chen, Z., Deng, Y., Neupane, S.

medRxiv preprint · Aug 11 2025
Prostate cancer remains one of the most prevalent malignancies and a leading cause of cancer-related deaths among men worldwide. Despite advances in traditional diagnostic methods such as prostate-specific antigen testing, digital rectal examination, and multiparametric magnetic resonance imaging, these approaches remain constrained by modality-specific limitations, suboptimal sensitivity and specificity, and reliance on expert interpretation, which may introduce diagnostic inconsistency. Multimodal deep learning and machine learning fusion, which integrates diverse data sources including imaging, clinical, and molecular information, has emerged as a promising strategy to enhance the accuracy of prostate cancer classification. This review outlines the current state-of-the-art deep learning and machine learning based fusion techniques for prostate cancer classification, focusing on their implementation, performance, challenges, and clinical applicability. Following the PRISMA guidelines, a total of 131 studies were identified, of which 27 studies published between 2021 and 2025 met the inclusion criteria. Extracted data included input techniques, deep learning architectures, performance metrics, and validation approaches. The majority of the studies used an early fusion approach with convolutional neural networks to integrate the data. Clinical and imaging data were the most commonly used modalities in the reviewed studies. Overall, multimodal deep learning and machine learning based fusion significantly advances prostate cancer classification and outperforms unimodal approaches.
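As context for the early-fusion approach most reviewed studies used, here is a minimal PyTorch sketch that concatenates CNN image features with clinical variables before a shared classifier; the architecture sizes are arbitrary illustrations, not any reviewed study's model.

```python
import torch
import torch.nn as nn

class EarlyFusionClassifier(nn.Module):
    """Concatenate imaging features with clinical variables before the classifier."""
    def __init__(self, n_clinical, n_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(                   # tiny illustrative image branch
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (B, 32)
        )
        self.head = nn.Sequential(
            nn.Linear(32 + n_clinical, 64), nn.ReLU(),
            nn.Linear(64, n_classes),
        )
    def forward(self, image, clinical):
        fused = torch.cat([self.cnn(image), clinical], dim=1)  # early fusion
        return self.head(fused)

model = EarlyFusionClassifier(n_clinical=5)
logits = model(torch.randn(4, 1, 128, 128), torch.randn(4, 5))
print(logits.shape)  # torch.Size([4, 2])
```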

Perceptual Evaluation of GANs and Diffusion Models for Generating X-rays

Gregory Schuit, Denis Parra, Cecilia Besa

arXiv preprint · Aug 10 2025
Generative image models have achieved remarkable progress in both natural and medical imaging. In the medical context, these techniques offer a potential solution to data scarcity, especially for low-prevalence anomalies that impair the performance of AI-driven diagnostic and segmentation tools. However, questions remain regarding the fidelity and clinical utility of synthetic images, since poor generation quality can undermine model generalizability and trust. In this study, we evaluate the effectiveness of state-of-the-art generative models, Generative Adversarial Networks (GANs) and Diffusion Models (DMs), for synthesizing chest X-rays conditioned on four abnormalities: Atelectasis (AT), Lung Opacity (LO), Pleural Effusion (PE), and Enlarged Cardiac Silhouette (ECS). Using a benchmark composed of real images from the MIMIC-CXR dataset and synthetic images from both GANs and DMs, we conducted a reader study with three radiologists of varied experience. Participants were asked to distinguish real from synthetic images and to assess the consistency between visual features and the target abnormality. Our results show that while DMs generate more visually realistic images overall, GANs can achieve better accuracy for specific conditions, such as the absence of ECS. We further identify visual cues radiologists use to detect synthetic images, offering insights into the perceptual gaps in current models. These findings underscore the complementary strengths of GANs and DMs and point to the need for further refinement to ensure generative models can reliably augment training datasets for AI diagnostic systems.
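Tallying such a reader study reduces to per-reader, per-condition accuracy at distinguishing real from synthetic images; the toy pandas sketch below (with invented judgments) shows one way to compute it.

```python
import pandas as pd

# Invented reader-study records: one row per image judgment.
df = pd.DataFrame({
    "reader":    ["R1", "R1", "R2", "R2", "R3", "R3"],
    "condition": ["ECS", "ECS", "PE", "PE", "AT", "AT"],
    "truth":     ["real", "synthetic", "real", "synthetic", "real", "synthetic"],
    "judged":    ["real", "real", "real", "synthetic", "synthetic", "synthetic"],
})

df["correct"] = df["truth"] == df["judged"]
# Accuracy at telling real from synthetic, per reader and per abnormality.
print(df.groupby(["reader", "condition"])["correct"].mean())
```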

SynMatch: Rethinking Consistency in Medical Image Segmentation with Sparse Annotations

Zhiqiang Shen, Peng Cao, Xiaoli Liu, Jinzhu Yang, Osmar R. Zaiane

arXiv preprint · Aug 10 2025
Label scarcity remains a major challenge in deep learning-based medical image segmentation. Recent studies use strong-weak pseudo supervision to leverage unlabeled data. However, performance is often hindered by inconsistencies between pseudo labels and their corresponding unlabeled images. In this work, we propose SynMatch, a novel framework that sidesteps the need for improving pseudo labels by synthesizing images to match them instead. Specifically, SynMatch synthesizes images using texture and shape features extracted from the same segmentation model that generates the corresponding pseudo labels for unlabeled images. This design enables the generation of highly consistent synthesized-image-pseudo-label pairs without requiring any training parameters for image synthesis. We extensively evaluate SynMatch across diverse medical image segmentation tasks under semi-supervised learning (SSL), weakly-supervised learning (WSL), and barely-supervised learning (BSL) settings with increasingly limited annotations. The results demonstrate that SynMatch achieves superior performance, especially in the most challenging BSL setting. For example, it outperforms the recent strong-weak pseudo supervision-based method by 29.71% and 10.05% on the polyp segmentation task with 5% and 10% scribble annotations, respectively. The code will be released at https://github.com/Senyh/SynMatch.
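SynMatch's parameter-free synthesis step is not reproduced here; for orientation, the sketch below shows the strong-weak pseudo supervision baseline the paper compares against, in a FixMatch-style form, with the confidence threshold and augmentation functions left as user-supplied assumptions.

```python
import torch
import torch.nn.functional as F

def strong_weak_consistency_loss(model, unlabeled, weak_aug, strong_aug, tau=0.95):
    """FixMatch-style pseudo supervision for segmentation (the baseline SynMatch
    improves on). weak_aug/strong_aug are assumed user-provided augmentations."""
    with torch.no_grad():
        weak_logits = model(weak_aug(unlabeled))
        probs = weak_logits.softmax(dim=1)
        conf, pseudo = probs.max(dim=1)          # per-pixel pseudo labels
        mask = conf > tau                        # keep only confident pixels
    strong_logits = model(strong_aug(unlabeled))
    loss = F.cross_entropy(strong_logits, pseudo, reduction="none")
    return (loss * mask).sum() / mask.sum().clamp(min=1)
```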

Large-scale Multi-sequence Pretraining for Generalizable MRI Analysis in Versatile Clinical Applications

Zelin Qiu, Xi Wang, Zhuoyao Xie, Juan Zhou, Yu Wang, Lingjie Yang, Xinrui Jiang, Juyoung Bae, Moo Hyun Son, Qiang Ye, Dexuan Chen, Rui Zhang, Tao Li, Neeraj Ramesh Mahboobani, Varut Vardhanabhuti, Xiaohui Duan, Yinghua Zhao, Hao Chen

arXiv preprint · Aug 10 2025
Multi-sequence Magnetic Resonance Imaging (MRI) offers remarkable versatility, enabling the distinct visualization of different tissue types. Nevertheless, the inherent heterogeneity among MRI sequences poses significant challenges to the generalization capability of deep learning models. These challenges undermine model performance when faced with varying acquisition parameters, thereby severely restricting their clinical utility. In this study, we present PRISM, a foundation model PRe-trained with large-scale multI-Sequence MRI. We collected a total of 64 datasets from both public and private sources, encompassing a wide range of whole-body anatomical structures, with scans spanning diverse MRI sequences. Among them, 336,476 volumetric MRI scans from 34 datasets (8 public and 26 private) were curated to construct the largest multi-organ multi-sequence MRI pretraining corpus to date. We propose a novel pretraining paradigm that disentangles anatomically invariant features from sequence-specific variations in MRI, while preserving high-level semantic representations. We established a benchmark comprising 44 downstream tasks, including disease diagnosis, image segmentation, registration, progression prediction, and report generation. These tasks were evaluated on 32 public datasets and 5 private cohorts. PRISM consistently outperformed both non-pretrained models and existing foundation models, achieving first-rank results in 39 out of 44 downstream benchmarks with statistically significant improvements. These results underscore its ability to learn robust and generalizable representations across unseen data acquired under diverse MRI protocols. PRISM provides a scalable framework for multi-sequence MRI analysis, thereby enhancing the translational potential of AI in radiology. It delivers consistent performance across diverse imaging protocols, reinforcing its clinical applicability.
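The abstract describes the disentanglement objective only at a high level; as a loose stand-in, the toy sketch below uses domain-adversarial gradient reversal, a standard technique for stripping acquisition-specific (here, sequence-specific) information from a feature head. It is not PRISM's actual pretraining loss.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Reverses gradients so the encoder learns to *remove* sequence information."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad

class DisentangledEncoder(nn.Module):
    """Toy encoder splitting features into anatomy-invariant and sequence parts."""
    def __init__(self, d_in=256, d_feat=64, n_sequences=4):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU())
        self.anatomy_head = nn.Linear(128, d_feat)
        self.sequence_head = nn.Linear(128, d_feat)
        self.seq_clf = nn.Linear(d_feat, n_sequences)
    def forward(self, x, seq_label):
        h = self.backbone(x)
        anatomy, seq = self.anatomy_head(h), self.sequence_head(h)
        ce = nn.functional.cross_entropy
        # Sequence head must predict the sequence; anatomy head must not.
        loss = ce(self.seq_clf(seq), seq_label) \
             + ce(self.seq_clf(GradReverse.apply(anatomy)), seq_label)
        return anatomy, loss

enc = DisentangledEncoder()
feats, loss = enc(torch.randn(8, 256), torch.randint(0, 4, (8,)))
loss.backward()
```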

The eyelid and pupil dynamics underlying stress levels in awake mice.

Zeng, H.

bioRxiv preprint · Aug 10 2025
Stress is a natural response of the body to perceived threats, and it can have both positive and negative effects on brain hemodynamics. Stress-induced changes in pupil and eyelid size/shape have been used as a biomarker in several fMRI studies. However, there is limited knowledge regarding the behavior of pupil and eyelid dynamics, particularly in animal models. In the present study, pupil and eyelid dynamics were carefully investigated and characterized in a newly developed awake rodent fMRI protocol. Leveraging deep learning techniques, mouse pupil and eyelid diameters were extracted and analyzed during the different training and imaging phases of the project. Our findings demonstrate a consistent downward trend in pupil and eyelid dynamics under a meticulously designed training protocol, suggesting that pupil and eyelid behavior can serve as a reliable indicator of stress levels and motion artifacts in awake fMRI studies. The current recording platform not only facilitates awake animal MRI studies but also holds potential for numerous other research areas, owing to its non-invasive nature and straightforward implementation.
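The paper extracts diameters with deep learning; as a simple stand-in that illustrates the measured quantity, the OpenCV sketch below estimates pupil diameter by thresholding the darkest region of an eye-camera frame. The threshold value and frame geometry are assumptions.

```python
import cv2
import numpy as np

def pupil_diameter(eye_frame: np.ndarray) -> float:
    """Estimate pupil diameter (pixels) from a grayscale eye-camera frame.
    Classical thresholding stand-in for the paper's deep-learning tracker."""
    blur = cv2.GaussianBlur(eye_frame, (7, 7), 0)
    # The pupil is the darkest region; the threshold value 40 is an assumption.
    _, binary = cv2.threshold(blur, 40, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return float("nan")
    largest = max(contours, key=cv2.contourArea)
    (_, _), radius = cv2.minEnclosingCircle(largest)
    return 2.0 * radius

frame = np.full((120, 160), 200, np.uint8)
cv2.circle(frame, (80, 60), 15, 10, -1)   # synthetic dark "pupil"
print(pupil_diameter(frame))              # roughly 30 pixels
```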

Trustworthy Medical Imaging with Large Language Models: A Study of Hallucinations Across Modalities

Anindya Bijoy Das, Shahnewaz Karim Sakib, Shibbir Ahmed

arXiv preprint · Aug 9 2025
Large Language Models (LLMs) are increasingly applied to medical imaging tasks, including image interpretation and synthetic image generation. However, these models often produce hallucinations, which are confident but incorrect outputs that can mislead clinical decisions. This study examines hallucinations in two directions: image to text, where LLMs generate reports from X-ray, CT, or MRI scans, and text to image, where models create medical images from clinical prompts. We analyze errors such as factual inconsistencies and anatomical inaccuracies, evaluating outputs using expert-informed criteria across imaging modalities. Our findings reveal common patterns of hallucination in both interpretive and generative tasks, with implications for clinical reliability. We also discuss factors contributing to these failures, including model architecture and training data. By systematically studying both image understanding and generation, this work provides insights into improving the safety and trustworthiness of LLM-driven medical imaging systems.

BrainATCL: Adaptive Temporal Brain Connectivity Learning for Functional Link Prediction and Age Estimation

Yiran Huang, Amirhossein Nouranizadeh, Christine Ahrends, Mengjia Xu

arXiv preprint · Aug 9 2025
Functional Magnetic Resonance Imaging (fMRI) is an imaging technique widely used to study human brain activity. fMRI signals in areas across the brain transiently synchronise and desynchronise their activity in a highly structured manner, even when an individual is at rest. These functional connectivity dynamics may be related to behaviour and neuropsychiatric disease. To model these dynamics, temporal brain connectivity representations are essential, as they reflect evolving interactions between brain regions and provide insight into transient neural states and network reconfigurations. However, conventional graph neural networks (GNNs) often struggle to capture long-range temporal dependencies in dynamic fMRI data. To address this challenge, we propose BrainATCL, an unsupervised, nonparametric framework for adaptive temporal brain connectivity learning, enabling functional link prediction and age estimation. Our method dynamically adjusts the lookback window for each snapshot based on the rate of newly added edges. Graph sequences are subsequently encoded using a GINE-Mamba2 backbone to learn spatial-temporal representations of dynamic functional connectivity in resting-state fMRI data of 1,000 participants from the Human Connectome Project. To further improve spatial modeling, we incorporate brain structure and function-informed edge attributes, i.e., the left/right hemispheric identity and subnetwork membership of brain regions, enabling the model to capture biologically meaningful topological patterns. We evaluate our BrainATCL on two tasks: functional link prediction and age estimation. The experimental results demonstrate superior performance and strong generalization, including in cross-session prediction scenarios.
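The abstract states only that the lookback window depends on the rate of newly added edges; the Python sketch below implements one plausible such rule, with the window bounds and rate threshold as invented parameters.

```python
def adaptive_lookback(snapshots, w_min=2, w_max=10, rate_threshold=0.2):
    """Pick a lookback window per snapshot from the rate of newly added edges.
    w_min, w_max, and rate_threshold are invented parameters for illustration."""
    windows = []
    prev_edges = set()
    for t, edges in enumerate(snapshots):      # each snapshot: a set of (u, v) pairs
        new = len(edges - prev_edges)
        rate = new / max(len(edges), 1)        # fraction of current edges that are new
        # Fast-changing graphs get short memory; stable graphs get long memory.
        w = w_min if rate > rate_threshold else w_max
        windows.append(min(w, t + 1))          # cannot look back past the first snapshot
        prev_edges = edges
    return windows

snaps = [{(0, 1), (1, 2)}, {(0, 1), (1, 2), (2, 3)}, {(0, 1), (1, 2), (2, 3)}]
print(adaptive_lookback(snaps))  # [1, 2, 3] for these toy snapshots
```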