Sort by:
Page 24 of 60594 results

MAUP: Training-free Multi-center Adaptive Uncertainty-aware Prompting for Cross-domain Few-shot Medical Image Segmentation

Yazhou Zhu, Haofeng Zhang

arxiv logopreprintAug 5 2025
Cross-domain Few-shot Medical Image Segmentation (CD-FSMIS) is a potential solution for segmenting medical images with limited annotation using knowledge from other domains. The significant performance of current CD-FSMIS models relies on the heavily training procedure over other source medical domains, which degrades the universality and ease of model deployment. With the development of large visual models of natural images, we propose a training-free CD-FSMIS model that introduces the Multi-center Adaptive Uncertainty-aware Prompting (MAUP) strategy for adapting the foundation model Segment Anything Model (SAM), which is trained with natural images, into the CD-FSMIS task. To be specific, MAUP consists of three key innovations: (1) K-means clustering based multi-center prompts generation for comprehensive spatial coverage, (2) uncertainty-aware prompts selection that focuses on the challenging regions, and (3) adaptive prompt optimization that can dynamically adjust according to the target region complexity. With the pre-trained DINOv2 feature encoder, MAUP achieves precise segmentation results across three medical datasets without any additional training compared with several conventional CD-FSMIS models and training-free FSMIS model. The source code is available at: https://github.com/YazhouZhu19/MAUP.

Augmenting Continual Learning of Diseases with LLM-Generated Visual Concepts

Jiantao Tan, Peixian Ma, Kanghao Chen, Zhiming Dai, Ruixuan Wang

arxiv logopreprintAug 5 2025
Continual learning is essential for medical image classification systems to adapt to dynamically evolving clinical environments. The integration of multimodal information can significantly enhance continual learning of image classes. However, while existing approaches do utilize textual modality information, they solely rely on simplistic templates with a class name, thereby neglecting richer semantic information. To address these limitations, we propose a novel framework that harnesses visual concepts generated by large language models (LLMs) as discriminative semantic guidance. Our method dynamically constructs a visual concept pool with a similarity-based filtering mechanism to prevent redundancy. Then, to integrate the concepts into the continual learning process, we employ a cross-modal image-concept attention module, coupled with an attention loss. Through attention, the module can leverage the semantic knowledge from relevant visual concepts and produce class-representative fused features for classification. Experiments on medical and natural image datasets show our method achieves state-of-the-art performance, demonstrating the effectiveness and superiority of our method. We will release the code publicly.

Joint Lossless Compression and Steganography for Medical Images via Large Language Models

Pengcheng Zheng, Xiaorong Pu, Kecheng Chen, Jiaxin Huang, Meng Yang, Bai Feng, Yazhou Ren, Jianan Jiang

arxiv logopreprintAug 3 2025
Recently, large language models (LLMs) have driven promis ing progress in lossless image compression. However, di rectly adopting existing paradigms for medical images suf fers from an unsatisfactory trade-off between compression performance and efficiency. Moreover, existing LLM-based compressors often overlook the security of the compres sion process, which is critical in modern medical scenarios. To this end, we propose a novel joint lossless compression and steganography framework. Inspired by bit plane slicing (BPS), we find it feasible to securely embed privacy messages into medical images in an invisible manner. Based on this in sight, an adaptive modalities decomposition strategy is first devised to partition the entire image into two segments, pro viding global and local modalities for subsequent dual-path lossless compression. During this dual-path stage, we inno vatively propose a segmented message steganography algo rithm within the local modality path to ensure the security of the compression process. Coupled with the proposed anatom ical priors-based low-rank adaptation (A-LoRA) fine-tuning strategy, extensive experimental results demonstrate the su periority of our proposed method in terms of compression ra tios, efficiency, and security. The source code will be made publicly available.

External evaluation of an open-source deep learning model for prostate cancer detection on bi-parametric MRI.

Johnson PM, Tong A, Ginocchio L, Del Hoyo JL, Smereka P, Harmon SA, Turkbey B, Chandarana H

pubmed logopapersAug 3 2025
This study aims to evaluate the diagnostic accuracy of an open-source deep learning (DL) model for detecting clinically significant prostate cancer (csPCa) in biparametric MRI (bpMRI). It also aims to outline the necessary components of the model that facilitate effective sharing and external evaluation of PCa detection models. This retrospective diagnostic accuracy study evaluated a publicly available DL model trained to detect PCa on bpMRI. External validation was performed on bpMRI exams from 151 biologically male patients (mean age, 65 ± 8 years). The model's performance was evaluated using patient-level classification of PCa with both radiologist interpretation and histopathology serving as the ground truth. The model processed bpMRI inputs to generate lesion probability maps. Performance was assessed using the area under the receiver operating characteristic curve (AUC) for PI-RADS ≥ 3, PI-RADS ≥ 4, and csPCa (defined as Gleason ≥ 7) at an exam level. The model achieved AUCs of 0.86 (95% CI: 0.80-0.92) and 0.91 (95% CI: 0.85-0.96) for predicting PI-RADS ≥ 3 and ≥ 4 exams, respectively, and 0.78 (95% CI: 0.71-0.86) for csPCa. Sensitivity and specificity for csPCa were 0.87 and 0.53, respectively. Fleiss' kappa for inter-reader agreement was 0.51. The open-source DL model offers high sensitivity to clinically significant prostate cancer. The study underscores the importance of sharing model code and weights to enable effective external validation and further research. Question Inter-reader variability hinders the consistent and accurate detection of clinically significant prostate cancer in MRI. Findings An open-source deep learning model demonstrated reproducible diagnostic accuracy, achieving AUCs of 0.86 for PI-RADS ≥ 3 and 0.78 for CsPCa lesions. Clinical relevance The model's high sensitivity for MRI-positive lesions (PI-RADS ≥ 3) may provide support for radiologists. Its open-source deployment facilitates further development and evaluation across diverse clinical settings, maximizing its potential utility.

M$^3$AD: Multi-task Multi-gate Mixture of Experts for Alzheimer's Disease Diagnosis with Conversion Pattern Modeling

Yufeng Jiang, Hexiao Ding, Hongzhao Chen, Jing Lan, Xinzhi Teng, Gerald W. Y. Cheng, Zongxi Li, Haoran Xie, Jung Sun Yoo, Jing Cai

arxiv logopreprintAug 3 2025
Alzheimer's disease (AD) progression follows a complex continuum from normal cognition (NC) through mild cognitive impairment (MCI) to dementia, yet most deep learning approaches oversimplify this into discrete classification tasks. This study introduces M$^3$AD, a novel multi-task multi-gate mixture of experts framework that jointly addresses diagnostic classification and cognitive transition modeling using structural MRI. We incorporate three key innovations: (1) an open-source T1-weighted sMRI preprocessing pipeline, (2) a unified learning framework capturing NC-MCI-AD transition patterns with demographic priors (age, gender, brain volume) for improved generalization, and (3) a customized multi-gate mixture of experts architecture enabling effective multi-task learning with structural MRI alone. The framework employs specialized expert networks for diagnosis-specific pathological patterns while shared experts model common structural features across the cognitive continuum. A two-stage training protocol combines SimMIM pretraining with multi-task fine-tuning for joint optimization. Comprehensive evaluation across six datasets comprising 12,037 T1-weighted sMRI scans demonstrates superior performance: 95.13% accuracy for three-class NC-MCI-AD classification and 99.15% for binary NC-AD classification, representing improvements of 4.69% and 0.55% over state-of-the-art approaches. The multi-task formulation simultaneously achieves 97.76% accuracy in predicting cognitive transition. Our framework outperforms existing methods using fewer modalities and offers a clinically practical solution for early intervention. Code: https://github.com/csyfjiang/M3AD.

Mobile U-ViT: Revisiting large kernel and U-shaped ViT for efficient medical image segmentation

Fenghe Tang, Bingkun Nian, Jianrui Ding, Wenxin Ma, Quan Quan, Chengqi Dong, Jie Yang, Wei Liu, S. Kevin Zhou

arxiv logopreprintAug 1 2025
In clinical practice, medical image analysis often requires efficient execution on resource-constrained mobile devices. However, existing mobile models-primarily optimized for natural images-tend to perform poorly on medical tasks due to the significant information density gap between natural and medical domains. Combining computational efficiency with medical imaging-specific architectural advantages remains a challenge when developing lightweight, universal, and high-performing networks. To address this, we propose a mobile model called Mobile U-shaped Vision Transformer (Mobile U-ViT) tailored for medical image segmentation. Specifically, we employ the newly purposed ConvUtr as a hierarchical patch embedding, featuring a parameter-efficient large-kernel CNN with inverted bottleneck fusion. This design exhibits transformer-like representation learning capacity while being lighter and faster. To enable efficient local-global information exchange, we introduce a novel Large-kernel Local-Global-Local (LGL) block that effectively balances the low information density and high-level semantic discrepancy of medical images. Finally, we incorporate a shallow and lightweight transformer bottleneck for long-range modeling and employ a cascaded decoder with downsample skip connections for dense prediction. Despite its reduced computational demands, our medical-optimized architecture achieves state-of-the-art performance across eight public 2D and 3D datasets covering diverse imaging modalities, including zero-shot testing on four unseen datasets. These results establish it as an efficient yet powerful and generalization solution for mobile medical image analysis. Code is available at https://github.com/FengheTan9/Mobile-U-ViT.

LesiOnTime -- Joint Temporal and Clinical Modeling for Small Breast Lesion Segmentation in Longitudinal DCE-MRI

Mohammed Kamran, Maria Bernathova, Raoul Varga, Christian F. Singer, Zsuzsanna Bago-Horvath, Thomas Helbich, Georg Langs, Philipp Seeböck

arxiv logopreprintAug 1 2025
Accurate segmentation of small lesions in Breast Dynamic Contrast-Enhanced MRI (DCE-MRI) is critical for early cancer detection, especially in high-risk patients. While recent deep learning methods have advanced lesion segmentation, they primarily target large lesions and neglect valuable longitudinal and clinical information routinely used by radiologists. In real-world screening, detecting subtle or emerging lesions requires radiologists to compare across timepoints and consider previous radiology assessments, such as the BI-RADS score. We propose LesiOnTime, a novel 3D segmentation approach that mimics clinical diagnostic workflows by jointly leveraging longitudinal imaging and BIRADS scores. The key components are: (1) a Temporal Prior Attention (TPA) block that dynamically integrates information from previous and current scans; and (2) a BI-RADS Consistency Regularization (BCR) loss that enforces latent space alignment for scans with similar radiological assessments, thus embedding domain knowledge into the training process. Evaluated on a curated in-house longitudinal dataset of high-risk patients with DCE-MRI, our approach outperforms state-of-the-art single-timepoint and longitudinal baselines by 5% in terms of Dice. Ablation studies demonstrate that both TPA and BCR contribute complementary performance gains. These results highlight the importance of incorporating temporal and clinical context for reliable early lesion segmentation in real-world breast cancer screening. Our code is publicly available at https://github.com/cirmuw/LesiOnTime

LesiOnTime -- Joint Temporal and Clinical Modeling for Small Breast Lesion Segmentation in Longitudinal DCE-MRI

Mohammed Kamran, Maria Bernathova, Raoul Varga, Christian Singer, Zsuzsanna Bago-Horvath, Thomas Helbich, Georg Langs, Philipp Seeböck

arxiv logopreprintAug 1 2025
Accurate segmentation of small lesions in Breast Dynamic Contrast-Enhanced MRI (DCE-MRI) is critical for early cancer detection, especially in high-risk patients. While recent deep learning methods have advanced lesion segmentation, they primarily target large lesions and neglect valuable longitudinal and clinical information routinely used by radiologists. In real-world screening, detecting subtle or emerging lesions requires radiologists to compare across timepoints and consider previous radiology assessments, such as the BI-RADS score. We propose LesiOnTime, a novel 3D segmentation approach that mimics clinical diagnostic workflows by jointly leveraging longitudinal imaging and BIRADS scores. The key components are: (1) a Temporal Prior Attention (TPA) block that dynamically integrates information from previous and current scans; and (2) a BI-RADS Consistency Regularization (BCR) loss that enforces latent space alignment for scans with similar radiological assessments, thus embedding domain knowledge into the training process. Evaluated on a curated in-house longitudinal dataset of high-risk patients with DCE-MRI, our approach outperforms state-of-the-art single-timepoint and longitudinal baselines by 5% in terms of Dice. Ablation studies demonstrate that both TPA and BCR contribute complementary performance gains. These results highlight the importance of incorporating temporal and clinical context for reliable early lesion segmentation in real-world breast cancer screening. Our code is publicly available at https://github.com/cirmuw/LesiOnTime

SAM-Med3D: A Vision Foundation Model for General-Purpose Segmentation on Volumetric Medical Images.

Wang H, Guo S, Ye J, Deng Z, Cheng J, Li T, Chen J, Su Y, Huang Z, Shen Y, zzzzFu B, Zhang S, He J

pubmed logopapersJul 31 2025
Existing volumetric medical image segmentation models are typically task-specific, excelling at specific targets but struggling to generalize across anatomical structures or modalities. This limitation restricts their broader clinical use. In this article, we introduce segment anything model (SAM)-Med3D, a vision foundation model (VFM) for general-purpose segmentation on volumetric medical images. Given only a few 3-D prompt points, SAM-Med3D can accurately segment diverse anatomical structures and lesions across various modalities. To achieve this, we gather and preprocess a large-scale 3-D medical image segmentation dataset, SA-Med3D-140K, from 70 public datasets and 8K licensed private cases from hospitals. This dataset includes 22K 3-D images and 143K corresponding masks. SAM-Med3D, a promptable segmentation model characterized by its fully learnable 3-D structure, is trained on this dataset using a two-stage procedure and exhibits impressive performance on both seen and unseen segmentation targets. We comprehensively evaluate SAM-Med3D on 16 datasets covering diverse medical scenarios, including different anatomical structures, modalities, targets, and zero-shot transferability to new/unseen tasks. The evaluation demonstrates the efficiency and efficacy of SAM-Med3D, as well as its promising application to diverse downstream tasks as a pretrained model. Our approach illustrates that substantial medical resources can be harnessed to develop a general-purpose medical AI for various potential applications. Our dataset, code, and models are available at: https://github.com/uni-medical/SAM-Med3D.

Adaptively Distilled ControlNet: Accelerated Training and Superior Sampling for Medical Image Synthesis

Kunpeng Qiu, Zhiying Zhou, Yongxin Guo

arxiv logopreprintJul 31 2025
Medical image annotation is constrained by privacy concerns and labor-intensive labeling, significantly limiting the performance and generalization of segmentation models. While mask-controllable diffusion models excel in synthesis, they struggle with precise lesion-mask alignment. We propose \textbf{Adaptively Distilled ControlNet}, a task-agnostic framework that accelerates training and optimization through dual-model distillation. Specifically, during training, a teacher model, conditioned on mask-image pairs, regularizes a mask-only student model via predicted noise alignment in parameter space, further enhanced by adaptive regularization based on lesion-background ratios. During sampling, only the student model is used, enabling privacy-preserving medical image generation. Comprehensive evaluations on two distinct medical datasets demonstrate state-of-the-art performance: TransUNet improves mDice/mIoU by 2.4%/4.2% on KiTS19, while SANet achieves 2.6%/3.5% gains on Polyps, highlighting its effectiveness and superiority. Code is available at GitHub.
Page 24 of 60594 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.