Extreme Cardiac MRI Analysis under Respiratory Motion: Results of the CMRxMotion Challenge

Kang Wang, Chen Qin, Zhang Shi, Haoran Wang, Xiwen Zhang, Chen Chen, Cheng Ouyang, Chengliang Dai, Yuanhan Mo, Chenchen Dai, Xutong Kuang, Ruizhe Li, Xin Chen, Xiuzheng Yue, Song Tian, Alejandro Mora-Rubio, Kumaradevan Punithakumar, Shizhan Gong, Qi Dou, Sina Amirrajab, Yasmina Al Khalil, Cian M. Scannell, Lexiaozi Fan, Huili Yang, Xiaowu Sun, Rob van der Geest, Tewodros Weldebirhan Arega, Fabrice Meriaudeau, Caner Özer, Amin Ranem, John Kalkhof, İlkay Öksüz, Anirban Mukhopadhyay, Abdul Qayyum, Moona Mazher, Steven A Niederer, Carles Garcia-Cabrera, Eric Arazo, Michal K. Grzeszczyk, Szymon Płotka, Wanqin Ma, Xiaomeng Li, Rongjun Ge, Yongqing Kou, Xinrong Chen, He Wang, Chengyan Wang, Wenjia Bai, Shuo Wang

arXiv preprint, Jul 25 2025
Deep learning models have achieved state-of-the-art performance in automated Cardiac Magnetic Resonance (CMR) analysis. However, the efficacy of these models is highly dependent on the availability of high-quality, artifact-free images. In clinical practice, CMR acquisitions are frequently degraded by respiratory motion, yet the robustness of deep learning models against such artifacts remains an underexplored problem. To promote research in this domain, we organized the MICCAI CMRxMotion challenge. We curated and publicly released a dataset of 320 CMR cine series from 40 healthy volunteers who performed specific breathing protocols to induce a controlled spectrum of motion artifacts. The challenge comprised two tasks: 1) automated image quality assessment to classify images based on motion severity, and 2) robust myocardial segmentation in the presence of motion artifacts. A total of 22 algorithms were submitted and evaluated on the two designated tasks. This paper presents a comprehensive overview of the challenge design and dataset, reports the evaluation results for the top-performing methods, and further investigates the impact of motion artifacts on five clinically relevant biomarkers. All resources and code are publicly available at: https://github.com/CMRxMotion

SP-Mamba: Spatial-Perception State Space Model for Unsupervised Medical Anomaly Detection

Rui Pan, Ruiying Lu

arXiv preprint, Jul 25 2025
Radiography imaging protocols target specific anatomical regions, resulting in highly consistent images with recurrent structural patterns across patients. Recent advances in medical anomaly detection have demonstrated the effectiveness of CNN- and transformer-based approaches. However, CNNs exhibit limitations in capturing long-range dependencies, while transformers suffer from quadratic computational complexity. In contrast, Mamba-based models, leveraging superior long-range modeling, structural feature extraction, and linear computational efficiency, have emerged as a promising alternative. To capitalize on the inherent structural regularity of medical images, this study introduces SP-Mamba, a spatial-perception Mamba framework for unsupervised medical anomaly detection. Window-sliding prototype learning and Circular-Hilbert scanning-based Mamba are introduced to better exploit consistent anatomical patterns and leverage spatial information for medical anomaly detection. Furthermore, we exploit the concentration and contrast characteristics of anomaly maps to improve anomaly detection. Extensive experiments on three diverse medical anomaly detection benchmarks confirm the proposed method's state-of-the-art performance, validating its efficacy and robustness. The code is available at https://github.com/Ray-RuiPan/SP-Mamba.

RegScore: Scoring Systems for Regression Tasks

Michal K. Grzeszczyk, Tomasz Szczepański, Pawel Renc, Siyeop Yoon, Jerome Charton, Tomasz Trzciński, Arkadiusz Sitek

arXiv preprint, Jul 25 2025
Scoring systems are widely adopted in medical applications for their inherent simplicity and transparency, particularly for classification tasks involving tabular data. In this work, we introduce RegScore, a novel, sparse, and interpretable scoring system specifically designed for regression tasks. Unlike conventional scoring systems constrained to integer-valued coefficients, RegScore leverages beam search and k-sparse ridge regression to relax these restrictions, thus enhancing predictive performance. We extend RegScore to bimodal deep learning by integrating tabular data with medical images. We utilize the classification token from the TIP (Tabular Image Pretraining) transformer to generate Personalized Linear Regression parameters and a Personalized RegScore, enabling individualized scoring. We demonstrate the effectiveness of RegScore by estimating mean Pulmonary Artery Pressure using tabular data and further refine these estimates by incorporating cardiac MRI images. Experimental results show that RegScore and its personalized bimodal extensions achieve performance comparable to, or better than, state-of-the-art black-box models. Our method provides a transparent and interpretable approach for regression tasks in clinical settings, promoting more informed and trustworthy decision-making. We provide our code at https://github.com/SanoScience/RegScore.

Joint Holistic and Lesion Controllable Mammogram Synthesis via Gated Conditional Diffusion Model

Xin Li, Kaixiang Yang, Qiang Li, Zhiwei Wang

arXiv preprint, Jul 25 2025
Mammography is the most commonly used imaging modality for breast cancer screening, driving an increasing demand for deep-learning techniques to support large-scale analysis. However, the development of accurate and robust methods is often limited by insufficient data availability and a lack of diversity in lesion characteristics. While generative models offer a promising solution for data synthesis, current approaches often fail to adequately emphasize lesion-specific features and their relationships with surrounding tissues. In this paper, we propose Gated Conditional Diffusion Model (GCDM), a novel framework designed to jointly synthesize holistic mammogram images and localized lesions. GCDM is built upon a latent denoising diffusion framework, where the noised latent image is concatenated with a soft mask embedding that represents breast, lesion, and their transitional regions, ensuring anatomical coherence between them during the denoising process. To further emphasize lesion-specific features, GCDM incorporates a gated conditioning branch that guides the denoising process by dynamically selecting and fusing the most relevant radiomic and geometric properties of lesions, effectively capturing their interplay. Experimental results demonstrate that GCDM achieves precise control over small lesion areas while enhancing the realism and diversity of synthesized mammograms. These advancements position GCDM as a promising tool for clinical applications in mammogram synthesis. Our code is available at https://github.com/lixinHUST/Gated-Conditional-Diffusion-Model/

Is Exchangeability better than I.I.D to handle Data Distribution Shifts while Pooling Data for Data-scarce Medical image segmentation?

Ayush Roy, Samin Enam, Jun Xia, Vishnu Suresh Lokhande, Won Hwa Kim

arXiv preprint, Jul 25 2025
Data scarcity is a major challenge in medical imaging, particularly for deep learning models. While data pooling (combining datasets from multiple sources) and data addition (adding more data from a new dataset) have been shown to enhance model performance, they are not without complications. Specifically, increasing the size of the training dataset through pooling or addition can induce distributional shifts, negatively affecting downstream model performance, a phenomenon known as the "Data Addition Dilemma". While the traditional i.i.d. assumption may not hold in multi-source contexts, assuming exchangeability across datasets provides a more practical framework for data pooling. In this work, we investigate medical image segmentation under these conditions, drawing insights from causal frameworks to propose a method for controlling foreground-background feature discrepancies across all layers of deep networks. This approach improves feature representations, which are crucial in data-addition scenarios. Our method achieves state-of-the-art segmentation performance on histopathology and ultrasound images across five datasets, including a novel ultrasound dataset that we have curated and contributed. Qualitative results demonstrate more refined and accurate segmentation maps compared to prominent baselines across three model architectures. The code will be available on GitHub.

Vox-MMSD: Voxel-wise Multi-scale and Multi-modal Self-Distillation for Self-supervised Brain Tumor Segmentation.

Zhou Y, Wu J, Fu J, Yue Q, Liao W, Zhang S, Zhang S, Wang G

PubMed paper, Jul 24 2025
Many deep learning methods have been proposed for brain tumor segmentation from multi-modal Magnetic Resonance Imaging (MRI) scans, which are important for accurate diagnosis and treatment planning. However, supervised learning needs a large amount of labeled data to perform well, and the time-consuming, expensive annotation process or a small training set will limit the model's performance. To deal with these problems, self-supervised pre-training is an appealing solution because it learns features from a set of unlabeled images that transfer to small downstream datasets. However, existing methods often overlook the utilization of multi-modal information and multi-scale features. Therefore, we propose a novel Self-Supervised Learning (SSL) framework that fully leverages multi-modal MRI scans to extract modality-invariant features for brain tumor segmentation. First, we employ a Siamese Block-wise Modality Masking (SiaBloMM) strategy that creates more diverse model inputs for image restoration to simultaneously learn contextual and modality-invariant features. Meanwhile, we propose Overlapping Random Modality Sampling (ORMS) to sample voxel pairs with multi-scale features for self-distillation, enhancing the voxel-wise representation that is important for segmentation tasks. Experiments on the BraTS 2024 adult glioma segmentation dataset showed that with a small amount of labeled data for fine-tuning, our method improved the average Dice by 3.80 percentage points. In addition, when transferred to three other small downstream datasets with brain tumors from different patient groups, our method also improved the Dice by 3.47 percentage points on average, and outperformed several existing SSL methods. The code is available at https://github.com/HiLab-git/Vox-MMSD.

TextSAM-EUS: Text Prompt Learning for SAM to Accurately Segment Pancreatic Tumor in Endoscopic Ultrasound

Pascal Spiegler, Taha Koleilat, Arash Harirpoush, Corey S. Miller, Hassan Rivaz, Marta Kersten-Oertel, Yiming Xiao

arXiv preprint, Jul 24 2025
Pancreatic cancer carries a poor prognosis and relies on endoscopic ultrasound (EUS) for targeted biopsy and radiotherapy. However, the speckle noise, low contrast, and unintuitive appearance of EUS make segmentation of pancreatic tumors with fully supervised deep learning (DL) models both error-prone and dependent on large, expert-curated annotation datasets. To address these challenges, we present TextSAM-EUS, a novel, lightweight, text-driven adaptation of the Segment Anything Model (SAM) that requires no manual geometric prompts at inference. Our approach leverages text prompt learning (context optimization) through the BiomedCLIP text encoder in conjunction with a LoRA-based adaptation of SAM's architecture to enable automatic pancreatic tumor segmentation in EUS, tuning only 0.86% of the total parameters. On the public Endoscopic Ultrasound Database of the Pancreas, TextSAM-EUS with automatic prompts attains 82.69% Dice and 85.28% normalized surface distance (NSD), and with manual geometric prompts reaches 83.10% Dice and 85.70% NSD, outperforming both existing state-of-the-art (SOTA) supervised DL models and foundation models (e.g., SAM and its variants). As the first attempt to incorporate prompt learning in SAM-based medical image segmentation, TextSAM-EUS offers a practical option for efficient and robust automatic EUS segmentation. Code is available at https://github.com/HealthX-Lab/TextSAM-EUS .
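As a rough illustration of why a LoRA-based adaptation tunes so few parameters: a rank-r update to a weight matrix of shape (d_out, d_in) is factored into two matrices of shapes (d_out, r) and (r, d_in), so it adds r * (d_in + d_out) trainable values instead of d_out * d_in. The sketch below only does this arithmetic. The function names and the layer sizes in the usage note are hypothetical and are not taken from the TextSAM-EUS configuration.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable values in a rank-`rank` LoRA update of a (d_out x d_in) matrix.

    The update is B @ A with B of shape (d_out, rank) and A of shape (rank, d_in).
    """
    return rank * (d_in + d_out)


def lora_fraction(d_out: int, d_in: int, rank: int) -> float:
    """LoRA trainable values as a fraction of the full matrix size."""
    return lora_trainable_params(d_out, d_in, rank) / (d_out * d_in)
```

For a hypothetical 768 x 768 projection with rank 4, `lora_trainable_params(768, 768, 4)` gives 6144 values against 589824 in the full matrix, roughly 1% of that layer.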

DGEAHorNet: high-order spatial interaction network with dual cross global efficient attention for medical image segmentation.

Peng H, An X, Chen X, Chen Z

PubMed paper, Jul 24 2025
Medical image segmentation is a complex and challenging task that aims to accurately segment various structures or abnormal regions in medical images. However, obtaining accurate segmentation results is difficult because of the great uncertainty in the shape, location, and scale of the target region. To address these challenges, we propose a higher-order spatial interaction framework with dual cross global efficient attention (DGEAHorNet), which employs a neural network architecture based on recursive gate convolution to adequately extract multi-scale contextual information from images. Specifically, a Dual Cross-Attention (DCA) module is added to the skip connection to effectively blend multi-stage encoder features and narrow the semantic gap. In the bottleneck stage, a global channel spatial attention module (GCSAM) is used to extract global image information. To obtain a better feature representation, we feed the output of the GCSAM into the multi-branch dense layer (SENetV2) for excitation. Furthermore, we adopt the Depthwise Over-parameterized Convolutional Layer (DO-Conv) to replace the common convolutional layer in the input and output parts of our network, then add Efficient Attention (EA) to reduce computational complexity and enhance our model's performance. To evaluate the effectiveness of the proposed DGEAHorNet, we conduct comprehensive experiments on four publicly available datasets, achieving Dice similarity coefficients of 0.9320, 0.9337, 0.9312 and 0.7799 on ISIC2018, ISIC2017, CVC-ClinicDB and HRF, respectively. Our results show that DGEAHorNet performs better than advanced methods. The code is publicly available at https://github.com/penghaixin/mymodel.
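For readers unfamiliar with the metric, the Dice similarity coefficient quoted in several abstracts on this page is the overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted mask A and a reference mask B. The following is a minimal sketch for flattened binary masks. It is not the evaluation code of any listed paper, the function name is hypothetical, and returning 1.0 when both masks are empty is an assumed convention.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Count pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total
```

For example, a prediction `[1, 1, 0, 0]` against a reference `[1, 0, 0, 0]` shares one foreground pixel out of 2 + 1, giving a Dice of 2/3.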

Q-Former Autoencoder: A Modern Framework for Medical Anomaly Detection

Francesco Dalmonte, Emirhan Bayar, Emre Akbas, Mariana-Iuliana Georgescu

arXiv preprint, Jul 24 2025
Anomaly detection in medical images is an important yet challenging task due to the diversity of possible anomalies and the practical impossibility of collecting comprehensively annotated data sets. In this work, we tackle unsupervised medical anomaly detection proposing a modernized autoencoder-based framework, the Q-Former Autoencoder, that leverages state-of-the-art pretrained vision foundation models, such as DINO, DINOv2 and Masked Autoencoder. Instead of training encoders from scratch, we directly utilize frozen vision foundation models as feature extractors, enabling rich, multi-stage, high-level representations without domain-specific fine-tuning. We propose the usage of the Q-Former architecture as the bottleneck, which enables the control of the length of the reconstruction sequence, while efficiently aggregating multiscale features. Additionally, we incorporate a perceptual loss computed using features from a pretrained Masked Autoencoder, guiding the reconstruction towards semantically meaningful structures. Our framework is evaluated on four diverse medical anomaly detection benchmarks, achieving state-of-the-art results on BraTS2021, RESC, and RSNA. Our results highlight the potential of vision foundation model encoders, pretrained on natural images, to generalize effectively to medical image analysis tasks without further fine-tuning. We release the code and models at https://github.com/emirhanbayar/QFAE.