Sort by:
Page 70 of 81807 results

Graph Mamba for Efficient Whole Slide Image Understanding

Jiaxuan Lu, Junyan Shi, Yuhui Lin, Fang Yan, Yue Gao, Shaoting Zhang, Xiaosong Wang

arxiv logopreprintMay 23 2025
Whole Slide Images (WSIs) in histopathology present a significant challenge for large-scale medical image analysis due to their high resolution, large size, and complex tile relationships. Existing Multiple Instance Learning (MIL) methods, such as Graph Neural Networks (GNNs) and Transformer-based models, face limitations in scalability and computational cost. To bridge this gap, we propose the WSI-GMamba framework, which synergistically combines the relational modeling strengths of GNNs with the efficiency of Mamba, the State Space Model designed for sequence learning. The proposed GMamba block integrates Message Passing, Graph Scanning & Flattening, and feature aggregation via a Bidirectional State Space Model (Bi-SSM), achieving Transformer-level performance with 7* fewer FLOPs. By leveraging the complementary strengths of lightweight GNNs and Mamba, the WSI-GMamba framework delivers a scalable solution for large-scale WSI analysis, offering both high accuracy and computational efficiency for slide-level classification.

AutoMiSeg: Automatic Medical Image Segmentation via Test-Time Adaptation of Foundation Models

Xingjian Li, Qifeng Wu, Colleen Que, Yiran Ding, Adithya S. Ubaradka, Jianhua Xing, Tianyang Wang, Min Xu

arxiv logopreprintMay 23 2025
Medical image segmentation is vital for clinical diagnosis, yet current deep learning methods often demand extensive expert effort, i.e., either through annotating large training datasets or providing prompts at inference time for each new case. This paper introduces a zero-shot and automatic segmentation pipeline that combines off-the-shelf vision-language and segmentation foundation models. Given a medical image and a task definition (e.g., "segment the optic disc in an eye fundus image"), our method uses a grounding model to generate an initial bounding box, followed by a visual prompt boosting module that enhance the prompts, which are then processed by a promptable segmentation model to produce the final mask. To address the challenges of domain gap and result verification, we introduce a test-time adaptation framework featuring a set of learnable adaptors that align the medical inputs with foundation model representations. Its hyperparameters are optimized via Bayesian Optimization, guided by a proxy validation model without requiring ground-truth labels. Our pipeline offers an annotation-efficient and scalable solution for zero-shot medical image segmentation across diverse tasks. Our pipeline is evaluated on seven diverse medical imaging datasets and shows promising results. By proper decomposition and test-time adaptation, our fully automatic pipeline performs competitively with weakly-prompted interactive foundation models.

AMVLM: Alignment-Multiplicity Aware Vision-Language Model for Semi-Supervised Medical Image Segmentation.

Pan Q, Li Z, Qiao W, Lou J, Yang Q, Yang G, Ji B

pubmed logopapersMay 23 2025
Low-quality pseudo labels pose a significant obstacle in semi-supervised medical image segmentation (SSMIS), impeding consistency learning on unlabeled data. Leveraging vision-language model (VLM) holds promise in ameliorating pseudo label quality by employing textual prompts to delineate segmentation regions, but it faces the challenge of cross-modal alignment uncertainty due to multiple correspondences (multiple images/texts tend to correspond to one text/image). Existing VLMs address this challenge by modeling semantics as distributions but such distributions lead to semantic degradation. To address these problems, we propose Alignment-Multiplicity Aware Vision-Language Model (AMVLM), a new VLM pre-training paradigm with two novel similarity metric strategies. (i) Cross-modal Similarity Supervision (CSS) proposes a probability distribution transformer to supervise similarity scores across fine-granularity semantics through measuring cross-modal distribution disparities, thus learning cross-modal multiple alignments. (ii) Intra-modal Contrastive Learning (ICL) takes into account the similarity metric of coarse-fine granularity information within each modality to encourage cross-modal semantic consistency. Furthermore, using the pretrained AMVLM, we propose a pioneering text-guided SSMIS network to compensate for the quality deficiencies of pseudo-labels. This network incorporates a text mask generator to produce multimodal supervision information, enhancing pseudo label quality and the model's consistency learning. Extensive experimentation validates the efficacy of our AMVLM-driven SSMIS, showcasing superior performance across four publicly available datasets. The code will be available at: https://github.com/QingtaoPan/AMVLM.

Brightness-Invariant Tracking Estimation in Tagged MRI

Zhangxing Bian, Shuwen Wei, Xiao Liang, Yuan-Chiao Lu, Samuel W. Remedios, Fangxu Xing, Jonghye Woo, Dzung L. Pham, Aaron Carass, Philip V. Bayly, Jiachen Zhuo, Ahmed Alshareef, Jerry L. Prince

arxiv logopreprintMay 23 2025
Magnetic resonance (MR) tagging is an imaging technique for noninvasively tracking tissue motion in vivo by creating a visible pattern of magnetization saturation (tags) that deforms with the tissue. Due to longitudinal relaxation and progression to steady-state, the tags and tissue brightnesses change over time, which makes tracking with optical flow methods error-prone. Although Fourier methods can alleviate these problems, they are also sensitive to brightness changes as well as spectral spreading due to motion. To address these problems, we introduce the brightness-invariant tracking estimation (BRITE) technique for tagged MRI. BRITE disentangles the anatomy from the tag pattern in the observed tagged image sequence and simultaneously estimates the Lagrangian motion. The inherent ill-posedness of this problem is addressed by leveraging the expressive power of denoising diffusion probabilistic models to represent the probabilistic distribution of the underlying anatomy and the flexibility of physics-informed neural networks to estimate biologically-plausible motion. A set of tagged MR images of a gel phantom was acquired with various tag periods and imaging flip angles to demonstrate the impact of brightness variations and to validate our method. The results show that BRITE achieves more accurate motion and strain estimates as compared to other state of the art methods, while also being resistant to tag fading.

Multimodal fusion model for prognostic prediction and radiotherapy response assessment in head and neck squamous cell carcinoma.

Tian R, Hou F, Zhang H, Yu G, Yang P, Li J, Yuan T, Chen X, Chen Y, Hao Y, Yao Y, Zhao H, Yu P, Fang H, Song L, Li A, Liu Z, Lv H, Yu D, Cheng H, Mao N, Song X

pubmed logopapersMay 23 2025
Accurate prediction of prognosis and postoperative radiotherapy response is critical for personalized treatment in head and neck squamous cell carcinoma (HNSCC). We developed a multimodal deep learning model (MDLM) integrating computed tomography, whole-slide images, and clinical features from 1087 HNSCC patients across multiple centers. The MDLM exhibited good performance in predicting overall survival (OS) and disease-free survival in external test cohorts. Additionally, the MDLM outperformed unimodal models. Patients with a high-risk score who underwent postoperative radiotherapy exhibited prolonged OS compared to those who did not (P = 0.016), whereas no significant improvement in OS was observed among patients with a low-risk score (P = 0.898). Biological exploration indicated that the model may be related to changes in the cytochrome P450 metabolic pathway, tumor microenvironment, and myeloid-derived cell subpopulations. Overall, the MDLM effectively predicts prognosis and postoperative radiotherapy response, offering a promising tool for personalized HNSCC therapy.

Integrating multi-omics data with artificial intelligence to decipher the role of tumor-infiltrating lymphocytes in tumor immunotherapy.

Xie T, Xue H, Huang A, Yan H, Yuan J

pubmed logopapersMay 23 2025
Tumor-infiltrating lymphocytes (TILs) are capable of recognizing tumor antigens, impacting tumor prognosis, predicting the efficacy of neoadjuvant therapies, contributing to the development of new cell-based immunotherapies, studying the tumor immune microenvironment, and identifying novel biomarkers. Traditional methods for evaluating TILs primarily rely on histopathological examination using standard hematoxylin and eosin staining or immunohistochemical staining, with manual cell counting under a microscope. These methods are time-consuming and subject to significant observer variability and error. Recently, artificial intelligence (AI) has rapidly advanced in the field of medical imaging, particularly with deep learning algorithms based on convolutional neural networks. AI has shown promise as a powerful tool for the quantitative evaluation of tumor biomarkers. The advent of AI offers new opportunities for the automated and standardized assessment of TILs. This review provides an overview of the advancements in the application of AI for assessing TILs from multiple perspectives. It specifically focuses on AI-driven approaches for identifying TILs in tumor tissue images, automating TILs quantification, recognizing TILs subpopulations, and analyzing the spatial distribution patterns of TILs. The review aims to elucidate the prognostic value of TILs in various cancers, as well as their predictive capacity for responses to immunotherapy and neoadjuvant therapy. Furthermore, the review explores the integration of AI with other emerging technologies, such as single-cell sequencing, multiplex immunofluorescence, spatial transcriptomics, and multimodal approaches, to enhance the comprehensive study of TILs and further elucidate their clinical utility in tumor treatment and prognosis.

A Deep Learning Vision-Language Model for Diagnosing Pediatric Dental Diseases

Pham, T.

medrxiv logopreprintMay 22 2025
This study proposes a deep learning vision-language model for the automated diagnosis of pediatric dental diseases, with a focus on differentiating between caries and periapical infections. The model integrates visual features extracted from panoramic radiographs using methods of non-linear dynamics and textural encoding with textual descriptions generated by a large language model. These multimodal features are concatenated and used to train a 1D-CNN classifier. Experimental results demonstrate that the proposed model outperforms conventional convolutional neural networks and standalone language-based approaches, achieving high accuracy (90%), sensitivity (92%), precision (92%), and an AUC of 0.96. This work highlights the value of combining structured visual and textual representations in improving diagnostic accuracy and interpretability in dental radiology. The approach offers a promising direction for the development of context-aware, AI-assisted diagnostic tools in pediatric dental care.

Cross-Scale Texture Supplementation for Reference-based Medical Image Super-Resolution.

Li Y, Hao W, Zeng H, Wang L, Xu J, Routray S, Jhaveri RH, Gadekallu TR

pubmed logopapersMay 22 2025
Magnetic Resonance Imaging (MRI) is a widely used medical imaging technique, but its resolution is often limited by acquisition time constraints, potentially compromising diagnostic accuracy. Reference-based Image Super-Resolution (RefSR) has shown promising performance in addressing such challenges by leveraging external high-resolution (HR) reference images to enhance the quality of low-resolution (LR) images. The core objective of RefSR is to accurately establish correspondences between the reference HR image and the LR images. In pursuit of this objective, this paper develops a Self-rectified Texture Supplementation network for RefSR (STS-SR) to enhance fine details in MRI images and support the expanding role of autonomous AI in healthcare. Our network comprises a texture-specified selfrectified feature transfer module and a cross-scale texture complementary network. The feature transfer module employs highfrequency filtering to facilitate the network concentrating on fine details. To better exploit the information from both the reference and LR images, our cross-scale texture complementary module incorporates the All-ViT and Swin Transformer layers to achieve feature aggregation at multiple scales, which enables high-quality image enhancement that is critical for autonomous AI systems in healthcare to make accurate decisions. Extensive experiments are performed across various benchmark datasets. The results validate the effectiveness of our method and demonstrate that the method produces state-of-the-art performance as compared to existing approaches. This advancement enables autonomous AI systems to utilize high-quality MRI images for more accurate diagnostics and reliable predictions.

ESR Essentials: a step-by-step guide of segmentation for radiologists-practice recommendations by the European Society of Medical Imaging Informatics.

Chupetlovska K, Akinci D'Antonoli T, Bodalal Z, Abdelatty MA, Erenstein H, Santinha J, Huisman M, Visser JJ, Trebeschi S, Groot Lipman KBW

pubmed logopapersMay 22 2025
High-quality segmentation is important for AI-driven radiological research and clinical practice, with the potential to play an even more prominent role in the future. As medical imaging advances, accurately segmenting anatomical and pathological structures is increasingly used to obtain quantitative data and valuable insights. Segmentation and volumetric analysis could enable more precise diagnosis, treatment planning, and patient monitoring. These guidelines aim to improve segmentation accuracy and consistency, allowing for better decision-making in both research and clinical environments. Practical advice on planning and organization is provided, focusing on quality, precision, and communication among clinical teams. Additionally, tips and strategies for improving segmentation practices in radiology and radiation oncology are discussed, as are potential pitfalls to avoid. KEY POINTS: As AI continues to advance, volumetry will become more integrated into clinical practice, making it essential for radiologists to stay informed about its applications in diagnosis and treatment planning. There is a significant lack of practical guidelines and resources tailored specifically for radiologists on technical topics like segmentation and volumetric analysis. Establishing clear rules and best practices for segmentation can streamline volumetric assessment in clinical settings, making it easier to manage and leading to more accurate decision-making for patient care.

Mitigating Overfitting in Medical Imaging: Self-Supervised Pretraining vs. ImageNet Transfer Learning for Dermatological Diagnosis

Iván Matas, Carmen Serrano, Miguel Nogales, David Moreno, Lara Ferrándiz, Teresa Ojeda, Begoña Acha

arxiv logopreprintMay 22 2025
Deep learning has transformed computer vision but relies heavily on large labeled datasets and computational resources. Transfer learning, particularly fine-tuning pretrained models, offers a practical alternative; however, models pretrained on natural image datasets such as ImageNet may fail to capture domain-specific characteristics in medical imaging. This study introduces an unsupervised learning framework that extracts high-value dermatological features instead of relying solely on ImageNet-based pretraining. We employ a Variational Autoencoder (VAE) trained from scratch on a proprietary dermatological dataset, allowing the model to learn a structured and clinically relevant latent space. This self-supervised feature extractor is then compared to an ImageNet-pretrained backbone under identical classification conditions, highlighting the trade-offs between general-purpose and domain-specific pretraining. Our results reveal distinct learning patterns. The self-supervised model achieves a final validation loss of 0.110 (-33.33%), while the ImageNet-pretrained model stagnates at 0.100 (-16.67%), indicating overfitting. Accuracy trends confirm this: the self-supervised model improves from 45% to 65% (+44.44%) with a near-zero overfitting gap, whereas the ImageNet-pretrained model reaches 87% (+50.00%) but plateaus at 75% (+19.05%), with its overfitting gap increasing to +0.060. These findings suggest that while ImageNet pretraining accelerates convergence, it also amplifies overfitting on non-clinically relevant features. In contrast, self-supervised learning achieves steady improvements, stronger generalization, and superior adaptability, underscoring the importance of domain-specific feature extraction in medical imaging.
Page 70 of 81807 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.