Page 6 of 42417 results

MAUP: Training-free Multi-center Adaptive Uncertainty-aware Prompting for Cross-domain Few-shot Medical Image Segmentation

Yazhou Zhu, Haofeng Zhang

arXiv preprint, Aug 5 2025
Cross-domain Few-shot Medical Image Segmentation (CD-FSMIS) is a potential solution for segmenting medical images with limited annotation using knowledge from other domains. The strong performance of current CD-FSMIS models relies on heavy training over other source medical domains, which reduces their universality and makes them harder to deploy. Building on the development of large visual models for natural images, we propose a training-free CD-FSMIS model that introduces the Multi-center Adaptive Uncertainty-aware Prompting (MAUP) strategy for adapting the foundation model Segment Anything Model (SAM), which is trained on natural images, to the CD-FSMIS task. Specifically, MAUP consists of three key innovations: (1) K-means clustering-based multi-center prompt generation for comprehensive spatial coverage, (2) uncertainty-aware prompt selection that focuses on challenging regions, and (3) adaptive prompt optimization that adjusts dynamically to the complexity of the target region. With the pre-trained DINOv2 feature encoder and no additional training, MAUP achieves precise segmentation results across three medical datasets in comparison with several conventional CD-FSMIS models and a training-free FSMIS model. The source code is available at: https://github.com/YazhouZhu19/MAUP.
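As a rough illustration of idea (1) only, the sketch below runs plain Lloyd's K-means over the foreground pixel coordinates of a toy binary mask and treats the resulting cluster centers as candidate point prompts. It is not the MAUP implementation: the DINOv2 features, the uncertainty-aware selection and SAM itself are left out, and the names `kmeans_centers` and `foreground_points` are hypothetical.

```python
import random

def foreground_points(mask):
    """Collect (row, col) coordinates of all non-zero pixels in a 2D mask."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]

def kmeans_centers(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means over 2D points; returns k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k distinct points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for (y, x) in points:
            # Assign each point to its nearest center (squared Euclidean distance).
            j = min(range(k),
                    key=lambda c: (y - centers[c][0]) ** 2 + (x - centers[c][1]) ** 2)
            clusters[j].append((y, x))
        new_centers = []
        for j, cluster in enumerate(clusters):
            if cluster:
                new_centers.append((sum(p[0] for p in cluster) / len(cluster),
                                    sum(p[1] for p in cluster) / len(cluster)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's old center
        centers = new_centers
    return centers

# Toy mask (an assumption for illustration): two separate foreground blobs.
mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
prompts = kmeans_centers(foreground_points(mask), k=2)
print(sorted(prompts))
```

With one center per blob, the point prompts cover both regions instead of collapsing onto a single centroid that would fall in the background between them.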

Are Vision-xLSTM-embedded U-Nets better at segmenting medical images?

Dutta P, Bose S, Roy SK, Mitra S

PubMed paper, Aug 5 2025
The development of efficient segmentation strategies for medical images has evolved from its initial dependence on Convolutional Neural Networks (CNNs) to the current investigation of hybrid models that combine CNNs with Vision Transformers (ViTs). There is an increasing focus on developing architectures that are both high-performing and computationally efficient, capable of being deployed on remote systems with limited resources. Although transformers can capture global dependencies in the input space, they face challenges from the corresponding high computational and storage expenses involved. The objective of this research is to propose that Vision Extended Long Short-Term Memory (Vision-xLSTM) forms an appropriate backbone for medical image segmentation, offering excellent performance with reduced computational costs. This study investigates the integration of CNNs with Vision-xLSTM by introducing the novel U-VixLSTM. The Vision-xLSTM blocks capture the temporal and global relationships within the patches extracted from the CNN feature maps. The convolutional feature reconstruction path upsamples the output volume from the Vision-xLSTM blocks to produce the segmentation output. The U-VixLSTM exhibits superior performance compared to the state-of-the-art networks in the publicly available Synapse, ISIC and ACDC datasets. The findings suggest that U-VixLSTM is a promising alternative to ViTs for medical image segmentation, delivering effective performance without substantial computational burden. This makes it feasible for deployment in healthcare environments with limited resources for faster diagnosis. Code provided: https://github.com/duttapallabi2907/U-VixLSTM.

R2GenKG: Hierarchical Multi-modal Knowledge Graph for LLM-based Radiology Report Generation

Futian Wang, Yuhan Qiao, Xiao Wang, Fuling Wang, Yuxiang Zhang, Dengdi Sun

arXiv preprint, Aug 5 2025
X-ray medical report generation is one of the important applications of artificial intelligence in healthcare. With the support of large foundation models, the quality of medical report generation has significantly improved. However, challenges such as hallucination and weak disease diagnostic capability still persist. In this paper, we first construct a large-scale multi-modal medical knowledge graph (termed M3KG) based on the ground truth medical reports using GPT-4o. It contains 2477 entities, 3 kinds of relations, 37424 triples, and 6943 disease-aware vision tokens for the CheXpert Plus dataset. Then, we sample it to obtain multi-granularity semantic graphs and use an R-GCN encoder for feature extraction. For the input X-ray image, we adopt the Swin-Transformer to extract the vision features, which interact with the knowledge through cross-attention. The vision tokens are fed into a Q-former and used to retrieve the disease-aware vision tokens through another cross-attention. Finally, we adopt the large language model to map the semantic knowledge graph, input X-ray image, and disease-aware vision tokens into language descriptions. Extensive experiments on multiple datasets validate the effectiveness of our proposed knowledge graph and X-ray report generation framework. The source code of this paper will be released at https://github.com/Event-AHU/Medical_Image_Analysis.

Prediction of breast cancer HER2 status changes based on ultrasound radiomics attention network.

Liu J, Xue X, Yan Y, Song Q, Cheng Y, Wang L, Wang X, Xu D

PubMed paper, Aug 5 2025
Following Neoadjuvant Chemotherapy (NAC), the Human Epidermal Growth Factor Receptor 2 (HER2) status may change. If these changes are not promptly addressed, it could hinder the timely adjustment of treatment plans, thereby affecting the optimal management of breast cancer. Consequently, the accurate prediction of HER2 status changes holds significant clinical value, underscoring the need for a model capable of precisely forecasting these alterations. In this paper, we examine HER2 status changes and propose a deep learning architecture combined with radiomics techniques, named Ultrasound Radiomics Attention Network (URAN), to predict them. Firstly, radiomics technology is used to extract ultrasound image features to provide rich and comprehensive medical information. Secondly, a HER2 Key Feature Selection (HKFS) network is constructed to retain crucial features relevant to HER2 status change. Thirdly, we design a Max and Average Attention and Excitation (MAAE) network to adjust the model's focus on different key features. Finally, a fully connected neural network is utilized to predict HER2 status changes. The code to reproduce our experiments can be found at https://github.com/joanaapa/Foundation-Medical. Our research was carried out using genuine ultrasound images sourced from hospitals. On this dataset, URAN outperformed both state-of-the-art and traditional methods in predicting HER2 status changes, achieving an accuracy of 0.8679 and an AUC of 0.8328 (95% CI: 0.77-0.90). Comparative experiments on the public BUS_UCLM dataset further demonstrated URAN's superiority, attaining an accuracy of 0.9283 and an AUC of 0.9161 (95% CI: 0.91-0.92). Additionally, we conducted ablation studies, which validated the soundness and effectiveness of the radiomics techniques, as well as the HKFS and MAAE modules integrated within the URAN model. The results pertaining to specific HER2 statuses indicate that URAN exhibits superior accuracy in predicting changes in HER2 status characterized by low expression and IHC scores of 2+ or below. Furthermore, we examined the radiomics attributes of ultrasound images and discovered that various wavelet transform features significantly impacted the changes in HER2 status. We have developed the URAN method for predicting HER2 status changes, which combines radiomics techniques and deep learning. The URAN model has better predictive performance than competing algorithms and can mine key radiomics features related to HER2 status changes.

MedCAL-Bench: A Comprehensive Benchmark on Cold-Start Active Learning with Foundation Models for Medical Image Analysis

Ning Zhu, Xiaochuan Ma, Shaoting Zhang, Guotai Wang

arXiv preprint, Aug 5 2025
Cold-Start Active Learning (CSAL) aims to select informative samples for annotation without prior knowledge, which is important for improving annotation efficiency and model performance under a limited annotation budget in medical image analysis. Most existing CSAL methods rely on Self-Supervised Learning (SSL) on the target dataset for feature extraction, which is inefficient and limited by insufficient feature representation. Recently, pre-trained Foundation Models (FMs) have shown powerful feature extraction ability with a potential for better CSAL. However, this paradigm has rarely been investigated, and benchmarks for comparing FMs in CSAL tasks are lacking. To this end, we propose MedCAL-Bench, the first systematic FM-based CSAL benchmark for medical image analysis. We evaluate 14 FMs and 7 CSAL strategies across 7 datasets under different annotation budgets, covering classification and segmentation tasks from diverse medical modalities. It is also the first CSAL benchmark that evaluates both the feature extraction and sample selection stages. Our experimental results reveal that: 1) Most FMs are effective feature extractors for CSAL, with the DINO family performing the best in segmentation; 2) The performance differences of these FMs are large in segmentation tasks, while small for classification; 3) Different sample selection strategies should be considered in CSAL on different datasets, with Active Learning by Processing Surprisal (ALPS) performing the best in segmentation while RepDiv leads for classification. The code is available at https://github.com/HiLab-git/MedCAL-Bench.

Joint Lossless Compression and Steganography for Medical Images via Large Language Models

Pengcheng Zheng, Xiaorong Pu, Kecheng Chen, Jiaxin Huang, Meng Yang, Bai Feng, Yazhou Ren, Jianan Jiang

arXiv preprint, Aug 3 2025
Recently, large language models (LLMs) have driven promising progress in lossless image compression. However, directly adopting existing paradigms for medical images suffers from an unsatisfactory trade-off between compression performance and efficiency. Moreover, existing LLM-based compressors often overlook the security of the compression process, which is critical in modern medical scenarios. To this end, we propose a novel joint lossless compression and steganography framework. Inspired by bit plane slicing (BPS), we find it feasible to securely embed privacy messages into medical images in an invisible manner. Based on this insight, an adaptive modalities decomposition strategy is first devised to partition the entire image into two segments, providing global and local modalities for subsequent dual-path lossless compression. During this dual-path stage, we innovatively propose a segmented message steganography algorithm within the local modality path to ensure the security of the compression process. Coupled with the proposed anatomical priors-based low-rank adaptation (A-LoRA) fine-tuning strategy, extensive experimental results demonstrate the superiority of our proposed method in terms of compression ratios, efficiency, and security. The source code will be made publicly available.
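For readers unfamiliar with bit plane slicing, the sketch below shows the general idea on a handful of 8-bit pixel values: each pixel splits into eight binary planes, and overwriting only the least significant plane changes a pixel by at most one grey level. This is generic LSB embedding under assumed toy data, not the paper's segmented message steganography algorithm; the function names are hypothetical.

```python
def bit_planes(pixel):
    """Return the 8 bit planes of an 8-bit pixel, most significant bit first."""
    return [(pixel >> b) & 1 for b in range(7, -1, -1)]

def embed_lsb(pixels, bits):
    """Overwrite the least significant bit of the first len(bits) pixels."""
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # clear the lowest bit, then set the message bit
    return out

def extract_lsb(pixels, n):
    """Read the message back from the lowest bit of the first n pixels."""
    return [p & 1 for p in pixels[:n]]

# Toy data (assumed for illustration only).
pixels = [200, 13, 255, 64]
message = [1, 0, 1]

stego = embed_lsb(pixels, message)
recovered = extract_lsb(stego, len(message))
max_change = max(abs(a - b) for a, b in zip(pixels, stego))

print(bit_planes(200))   # [1, 1, 0, 0, 1, 0, 0, 0]
print(stego, recovered, max_change)
```

Because only the lowest plane is touched, the embedded message is recoverable exactly while the visible image is essentially unchanged.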

External evaluation of an open-source deep learning model for prostate cancer detection on bi-parametric MRI.

Johnson PM, Tong A, Ginocchio L, Del Hoyo JL, Smereka P, Harmon SA, Turkbey B, Chandarana H

PubMed paper, Aug 3 2025
This study aims to evaluate the diagnostic accuracy of an open-source deep learning (DL) model for detecting clinically significant prostate cancer (csPCa) in biparametric MRI (bpMRI). It also aims to outline the necessary components of the model that facilitate effective sharing and external evaluation of PCa detection models. This retrospective diagnostic accuracy study evaluated a publicly available DL model trained to detect PCa on bpMRI. External validation was performed on bpMRI exams from 151 biologically male patients (mean age, 65 ± 8 years). The model's performance was evaluated using patient-level classification of PCa with both radiologist interpretation and histopathology serving as the ground truth. The model processed bpMRI inputs to generate lesion probability maps. Performance was assessed using the area under the receiver operating characteristic curve (AUC) for PI-RADS ≥ 3, PI-RADS ≥ 4, and csPCa (defined as Gleason ≥ 7) at an exam level. The model achieved AUCs of 0.86 (95% CI: 0.80-0.92) and 0.91 (95% CI: 0.85-0.96) for predicting PI-RADS ≥ 3 and ≥ 4 exams, respectively, and 0.78 (95% CI: 0.71-0.86) for csPCa. Sensitivity and specificity for csPCa were 0.87 and 0.53, respectively. Fleiss' kappa for inter-reader agreement was 0.51. The open-source DL model offers high sensitivity to clinically significant prostate cancer. The study underscores the importance of sharing model code and weights to enable effective external validation and further research. Question: Inter-reader variability hinders the consistent and accurate detection of clinically significant prostate cancer in MRI. Findings: An open-source deep learning model demonstrated reproducible diagnostic accuracy, achieving AUCs of 0.86 for PI-RADS ≥ 3 and 0.78 for csPCa lesions. Clinical relevance: The model's high sensitivity for MRI-positive lesions (PI-RADS ≥ 3) may provide support for radiologists. Its open-source deployment facilitates further development and evaluation across diverse clinical settings, maximizing its potential utility.
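To make the reported metrics concrete, the following sketch computes exam-level AUC, sensitivity and specificity from per-exam scores. The scores and the 0.5 threshold are made up for illustration and are unrelated to the study's data; the AUC uses the standard pairwise (Mann-Whitney) definition.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP) at a given threshold."""
    tp = sum(1 for s in scores_pos if s >= threshold)
    tn = sum(1 for s in scores_neg if s < threshold)
    return tp / len(scores_pos), tn / len(scores_neg)

# Hypothetical exam-level scores (not from the study).
positive_exams = [0.9, 0.8, 0.4]        # exams with disease per the ground truth
negative_exams = [0.7, 0.3, 0.2, 0.1]   # exams without disease

area = auc(positive_exams, negative_exams)
sens, spec = sensitivity_specificity(positive_exams, negative_exams, threshold=0.5)
print(round(area, 3), round(sens, 3), round(spec, 3))
```

Lowering the threshold trades specificity for sensitivity, which is the kind of operating point a high-sensitivity, lower-specificity result reflects.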

M$^3$AD: Multi-task Multi-gate Mixture of Experts for Alzheimer's Disease Diagnosis with Conversion Pattern Modeling

Yufeng Jiang, Hexiao Ding, Hongzhao Chen, Jing Lan, Xinzhi Teng, Gerald W. Y. Cheng, Zongxi Li, Haoran Xie, Jung Sun Yoo, Jing Cai

arXiv preprint, Aug 3 2025
Alzheimer's disease (AD) progression follows a complex continuum from normal cognition (NC) through mild cognitive impairment (MCI) to dementia, yet most deep learning approaches oversimplify this into discrete classification tasks. This study introduces M$^3$AD, a novel multi-task multi-gate mixture of experts framework that jointly addresses diagnostic classification and cognitive transition modeling using structural MRI. We incorporate three key innovations: (1) an open-source T1-weighted sMRI preprocessing pipeline, (2) a unified learning framework capturing NC-MCI-AD transition patterns with demographic priors (age, gender, brain volume) for improved generalization, and (3) a customized multi-gate mixture of experts architecture enabling effective multi-task learning with structural MRI alone. The framework employs specialized expert networks for diagnosis-specific pathological patterns while shared experts model common structural features across the cognitive continuum. A two-stage training protocol combines SimMIM pretraining with multi-task fine-tuning for joint optimization. Comprehensive evaluation across six datasets comprising 12,037 T1-weighted sMRI scans demonstrates superior performance: 95.13% accuracy for three-class NC-MCI-AD classification and 99.15% for binary NC-AD classification, representing improvements of 4.69% and 0.55% over state-of-the-art approaches. The multi-task formulation simultaneously achieves 97.76% accuracy in predicting cognitive transition. Our framework outperforms existing methods using fewer modalities and offers a clinically practical solution for early intervention. Code: https://github.com/csyfjiang/M3AD.
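The multi-gate mixture-of-experts idea can be sketched in a few lines: every task shares the same experts, but each task has its own gate that produces softmax weights over them. The scalar experts, gate functions and task names below are toy assumptions for illustration; they do not reproduce the M$^3$AD architecture.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mmoe_forward(x, experts, gates):
    """Mix shared expert outputs with one softmax gate per task."""
    expert_out = [expert(x) for expert in experts]
    outputs = {}
    for task, gate in gates.items():
        weights = softmax(gate(x))  # task-specific weights over the shared experts
        outputs[task] = sum(w * o for w, o in zip(weights, expert_out))
    return outputs

# Toy scalar experts and gates (hypothetical stand-ins for learned networks).
experts = [lambda x: 2 * x, lambda x: x + 1]
gates = {
    "diagnosis": lambda x: [0.0, 0.0],   # weights both experts equally
    "transition": lambda x: [x, -x],     # favours the first expert when x > 0
}

out = mmoe_forward(3.0, experts, gates)
print(out)
```

Each task reads the same expert outputs but blends them differently, which is what lets related tasks share features without being forced into a single mixture.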

LesiOnTime -- Joint Temporal and Clinical Modeling for Small Breast Lesion Segmentation in Longitudinal DCE-MRI

Mohammed Kamran, Maria Bernathova, Raoul Varga, Christian Singer, Zsuzsanna Bago-Horvath, Thomas Helbich, Georg Langs, Philipp Seeböck

arXiv preprint, Aug 1 2025
Accurate segmentation of small lesions in Breast Dynamic Contrast-Enhanced MRI (DCE-MRI) is critical for early cancer detection, especially in high-risk patients. While recent deep learning methods have advanced lesion segmentation, they primarily target large lesions and neglect valuable longitudinal and clinical information routinely used by radiologists. In real-world screening, detecting subtle or emerging lesions requires radiologists to compare across timepoints and consider previous radiology assessments, such as the BI-RADS score. We propose LesiOnTime, a novel 3D segmentation approach that mimics clinical diagnostic workflows by jointly leveraging longitudinal imaging and BI-RADS scores. The key components are: (1) a Temporal Prior Attention (TPA) block that dynamically integrates information from previous and current scans; and (2) a BI-RADS Consistency Regularization (BCR) loss that enforces latent space alignment for scans with similar radiological assessments, thus embedding domain knowledge into the training process. Evaluated on a curated in-house longitudinal dataset of high-risk patients with DCE-MRI, our approach outperforms state-of-the-art single-timepoint and longitudinal baselines by 5% in terms of Dice. Ablation studies demonstrate that both TPA and BCR contribute complementary performance gains. These results highlight the importance of incorporating temporal and clinical context for reliable early lesion segmentation in real-world breast cancer screening. Our code is publicly available at https://github.com/cirmuw/LesiOnTime.
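The Dice score used for the comparison above measures overlap between a predicted and a reference mask as 2|A ∩ B| / (|A| + |B|). A minimal sketch on flattened binary masks follows; the masks are toy values, and returning 1.0 when both masks are empty is a convention assumed here rather than something the abstract specifies.

```python
def dice(pred, target):
    """Dice coefficient for two equal-length binary masks given as flat lists of 0/1."""
    intersection = sum(p & t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * intersection / total

# Toy flattened masks (illustrative values only).
pred = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]

score = dice(pred, target)
print(round(score, 4))  # 0.6667
```

For small lesions, a few misclassified voxels shift this ratio sharply, which is why Dice is a demanding metric in this setting.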
