
Towards Reliable Healthcare Imaging: A Multifaceted Approach in Class Imbalance Handling for Medical Image Segmentation.

Cui L, Xu M, Liu C, Liu T, Yan X, Zhang Y, Yang X

pubmed · Jul 7, 2025
Class imbalance is a dominant challenge in medical image segmentation, particularly for MRI images drawn from highly imbalanced datasets. This study introduces a comprehensive, multifaceted approach to enhance the accuracy and reliability of segmentation models under such conditions. Our model integrates advanced data augmentation, algorithmic adjustments, and novel architectural features to handle skewed class label distributions effectively. To cover multiple aspects of the training process, we customize the data augmentation technique for medical imaging across multiple dimensions and angles; this multi-dimensional augmentation helps reduce bias towards majority classes. We implement novel attention mechanisms, i.e., an Enhanced Attention Module (EAM) and spatial attention, which sharpen the model's focus on the most relevant features. Further, our architecture incorporates a dual decoder system and a Pooling Integration Layer (PIL) to capture accurate foreground and background details. We also introduce a hybrid loss function designed to handle class imbalance by guiding the training process. For experiments, we use multiple datasets, i.e., the Digital Database Thyroid Image (DDTI), the Breast Ultrasound Images Dataset (BUSI), and LiTS MICCAI 2017, and demonstrate the effectiveness of the proposed network using key evaluation metrics, i.e., IoU, Dice coefficient, precision, and recall.
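
The abstract does not give the exact composition of the hybrid loss; a minimal PyTorch sketch, assuming the common Dice-plus-focal pairing for class-imbalanced segmentation (the weighting alpha and focal gamma below are illustrative, not the authors' values):

import torch
import torch.nn.functional as F

def hybrid_loss(logits, targets, alpha=0.5, gamma=2.0, eps=1e-6):
    """logits: (N, 1, H, W) raw scores; targets: (N, 1, H, W) float binary masks."""
    probs = torch.sigmoid(logits)

    # Soft Dice term: largely insensitive to the size of the majority background class.
    inter = (probs * targets).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + targets.sum(dim=(1, 2, 3))
    dice = 1.0 - (2.0 * inter + eps) / (union + eps)

    # Focal term: down-weights easy (mostly background) pixels.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = probs * targets + (1.0 - probs) * (1.0 - targets)
    focal = ((1.0 - p_t) ** gamma * bce).mean(dim=(1, 2, 3))

    return (alpha * dice + (1.0 - alpha) * focal).mean()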

Early warning and stratification of the elderly cardiopulmonary dysfunction-related diseases: multicentre prospective study protocol.

Zhou X, Jin Q, Xia Y, Guan Y, Zhang Z, Guo Z, Liu Z, Li C, Bai Y, Hou Y, Zhou M, Liao WH, Lin H, Wang P, Liu S, Fan L

pubmed · Jul 5, 2025
In China, there is a lack of standardised clinical imaging databases for multidimensional evaluation of cardiopulmonary diseases. To address this gap, this protocol describes a project to integrate clinical imaging technologies and build a multicentre database for early warning and stratification of cardiopulmonary dysfunction in the elderly. The study employs a cross-sectional design, enrolling over 6000 elderly participants from five regions across China to evaluate cardiopulmonary function and related diseases. Based on clinical criteria, participants are categorised into three groups: a healthy cardiopulmonary function group, a decreased-function group and an established cardiopulmonary disease group. All subjects will undergo comprehensive assessments including chest CT scans, echocardiography and laboratory examinations; at least 50 subjects will additionally undergo cardiopulmonary exercise testing (CPET). By leveraging artificial intelligence, multimodal data will be integrated to establish reference ranges for cardiopulmonary function in the elderly population and to develop early-warning models and severity grading standards. The study has been approved by the local ethics committee of Shanghai Changzheng Hospital (approval number: 2022SL069A). All participants will sign informed consent. The results will be disseminated through peer-reviewed publications and conferences.

A novel recursive transformer-based U-Net architecture for enhanced multi-scale medical image segmentation.

Li S, Liu X, Fu M, Khelifi F

pubmed · Jul 5, 2025
Automatic medical image segmentation techniques are vital for assisting clinicians in making accurate diagnoses and treatment plans. Although the U-shaped network (U-Net) has been widely adopted in medical image analysis, it still struggles to capture long-range dependencies, particularly in complex, textured medical images where anatomical structures blend into the surrounding background. To address these limitations, a novel network architecture, called the recursive transformer-based U-Net (ReT-UNet), which integrates recursive feature learning and transformer technology, is proposed. A key innovation of ReT-UNet is the multi-scale global feature fusion (Multi-GF) module, inspired by transformer models and multi-scale pooling mechanisms. This module captures long-range dependencies, enhancing the abstraction and contextual understanding of multi-level features. Additionally, a recursive feature accumulation block is introduced to iteratively update features across layers, improving the network's ability to model spatial correlations and represent deep features in medical images. To improve sensitivity to local details, a lightweight atrous spatial pyramid pooling (ASPP) module is appended after the Multi-GF module. Furthermore, the segmentation head is redesigned to emphasize feature aggregation and fusion. During the encoding phase, a hybrid pooling layer is employed to ensure comprehensive feature sampling, enabling a broader range of feature representations and improving the learning of detailed information. Results: The proposed method has been evaluated through ablation experiments, demonstrating generally consistent performance across multiple trials. When applied to cardiac, pulmonary nodule, and polyp segmentation datasets, the method reduced mis-segmented regions and improved segmentation accuracy and stability compared to competing state-of-the-art methods, demonstrating its potential for applications in medical image segmentation.
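
As a rough illustration of the lightweight ASPP module appended after Multi-GF, here is a depthwise-separable atrous pyramid in PyTorch; the dilation rates and channel widths are assumptions, not the paper's configuration:

import torch
import torch.nn as nn

class LightweightASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12)):
        super().__init__()
        # One depthwise-separable atrous branch per dilation rate.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, in_ch, 3, padding=r, dilation=r,
                          groups=in_ch, bias=False),      # depthwise atrous conv
                nn.Conv2d(in_ch, out_ch, 1, bias=False),  # pointwise projection
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        # Concatenate multi-rate context and fuse back to out_ch channels.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))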

DHR-Net: Dynamic Harmonized registration network for multimodal medical images.

Yang X, Li D, Chen S, Deng L, Wang J, Huang S

pubmed · Jul 5, 2025
Deep learning has driven remarkable advancements in medical image registration, and deep neural network-based non-rigid deformation field generation methods achieve high accuracy in single-modality scenarios. Multi-modal medical image registration, however, still faces critical challenges. To address the insufficient anatomical consistency and unstable deformation field optimization of existing cross-modal registration methods, this paper proposes an end-to-end medical image registration method based on a Dynamic Harmonized Registration framework (DHR-Net). DHR-Net employs a cascaded two-stage architecture comprising a translation network and a registration network that operate in sequential processing phases. Furthermore, we propose a loss function based on the Noise Contrastive Estimation framework, which enhances anatomical consistency in cross-modal translation by maximizing mutual information between input and translated image patches. This loss function incorporates a dynamic temperature adjustment mechanism that progressively tightens feature contrast constraints during training to improve high-frequency detail preservation, thereby better constraining the topological structure of target images. Experiments on the M&M Heart Dataset demonstrate that DHR-Net outperforms existing methods in registration accuracy, deformation field smoothness, and cross-modal robustness. The framework significantly enhances the registration quality of cardiac images while preserving anatomical structures exceptionally well, showing promising potential for clinical applications.
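
A sketch of a patch-wise noise-contrastive (InfoNCE) loss with a dynamic temperature, in the spirit of the description above; the linear annealing schedule and temperature endpoints are assumptions:

import torch
import torch.nn.functional as F

def patch_nce_loss(src_feats, tgt_feats, step, total_steps,
                   t_start=0.1, t_end=0.05):
    """src_feats, tgt_feats: (N, D) features of corresponding patches from the
    input and translated images; matching rows are positives, all others negatives."""
    # Linearly anneal the temperature to sharpen feature contrast as training proceeds.
    t = t_start + (t_end - t_start) * (step / max(total_steps, 1))

    src = F.normalize(src_feats, dim=1)
    tgt = F.normalize(tgt_feats, dim=1)
    logits = src @ tgt.t() / t                      # (N, N) similarity matrix
    labels = torch.arange(src.size(0), device=src.device)
    return F.cross_entropy(logits, labels)          # InfoNCE lower-bounds mutual information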

Machine learning approach using radiomics features to distinguish odontogenic cysts and tumours.

Muraoka H, Kaneda T, Ito K, Otsuka K, Tokunaga S

pubmed · Jul 4, 2025
Although most odontogenic lesions of the jaw are benign, treatment varies widely depending on the nature of the lesion. This study was performed to assess the ability of a machine learning (ML) model using computed tomography (CT) and magnetic resonance imaging (MRI) radiomic features to classify odontogenic cysts and tumours. CT and MRI data from patients with odontogenic lesions, including dentigerous cysts, odontogenic keratocysts, and ameloblastomas, were analysed. The CT images and the apparent diffusion coefficient (ADC) maps from diffusion-weighted MRI were manually segmented to extract radiomic features. The extracted radiomic features were split into training (70%) and test (30%) sets. A random forest model was tuned using 5-fold stratified cross-validation within the training set and assessed on the separate hold-out test set. The CT-based ML model achieved a cross-validation accuracy of 0.59 on the training set and an accuracy of 0.60 on the test set, with precision, recall, and F1 score all 0.57. The ADC-based ML model achieved a cross-validation accuracy of 0.90 on the training set and an accuracy of 0.94 on the test set; precision, recall, and F1 score were all 0.87. ML models, particularly those using MRI radiomic features, can effectively classify odontogenic lesions.
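
The evaluation protocol (70/30 split, random forest, 5-fold stratified cross-validation) maps directly onto scikit-learn; in the sketch below the feature files and class labels are hypothetical placeholders:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import (StratifiedKFold, cross_val_score,
                                     train_test_split)

X = np.load("adc_radiomic_features.npy")   # hypothetical feature matrix
y = np.load("lesion_labels.npy")           # 0/1/2: cyst / keratocyst / ameloblastoma

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=500, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
print("CV accuracy:", cross_val_score(clf, X_tr, y_tr, cv=cv).mean())

clf.fit(X_tr, y_tr)                                     # final model on the training set
print(classification_report(y_te, clf.predict(X_te)))   # hold-out test metrics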

Causal-SAM-LLM: Large Language Models as Causal Reasoners for Robust Medical Segmentation

Tao Tang, Shijie Xu, Yiting Wu, Zhixiang Lu

arxiv preprint · Jul 4, 2025
The clinical utility of deep learning models for medical image segmentation is severely constrained by their inability to generalize to unseen domains. This failure is often rooted in the models learning spurious correlations between anatomical content and domain-specific imaging styles. To overcome this fundamental challenge, we introduce Causal-SAM-LLM, a novel framework that elevates Large Language Models (LLMs) to the role of causal reasoners. Our framework, built upon a frozen Segment Anything Model (SAM) encoder, incorporates two synergistic innovations. First, Linguistic Adversarial Disentanglement (LAD) employs a Vision-Language Model to generate rich, textual descriptions of confounding image styles. By training the segmentation model's features to be contrastively dissimilar to these style descriptions, it learns a representation robustly purged of non-causal information. Second, Test-Time Causal Intervention (TCI) provides an interactive mechanism where an LLM interprets a clinician's natural language command to modulate the segmentation decoder's features in real-time, enabling targeted error correction. We conduct an extensive empirical evaluation on a composite benchmark from four public datasets (BTCV, CHAOS, AMOS, BraTS), assessing generalization under cross-scanner, cross-modality, and cross-anatomy settings. Causal-SAM-LLM establishes a new state of the art in out-of-distribution (OOD) robustness, improving the average Dice score by up to 6.2 points and reducing the Hausdorff Distance by 15.8 mm over the strongest baseline, all while using less than 9% of the full model's trainable parameters. Our work charts a new course for building robust, efficient, and interactively controllable medical AI systems.
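
One plausible reading of the Linguistic Adversarial Disentanglement objective is a margin-based repulsion between image features and text embeddings of style descriptions; the sketch below is an assumption-laden illustration, not the authors' loss:

import torch
import torch.nn.functional as F

def style_repulsion_loss(img_feats, style_text_feats, margin=0.2):
    """img_feats: (N, D) segmentation features; style_text_feats: (M, D)
    VLM embeddings of confounding style descriptions (e.g. scanner, contrast)."""
    img = F.normalize(img_feats, dim=1)
    sty = F.normalize(style_text_feats, dim=1)
    sim = img @ sty.t()                  # (N, M) cosine similarities
    # Penalize any similarity above the margin: the features should carry no
    # information that aligns with non-causal style text.
    return F.relu(sim - margin).mean()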

Revolutionizing medical imaging: A cutting-edge AI framework with vision transformers and perceiver IO for multi-disease diagnosis.

Khaliq A, Ahmad F, Rehman HU, Alanazi SA, Haleem H, Junaid K, Andrikopoulou E

pubmed · Jul 4, 2025
The integration of artificial intelligence in medical image classification has significantly advanced disease detection. However, traditional deep learning models face persistent challenges, including poor generalizability, high false-positive rates, and difficulties in distinguishing overlapping anatomical features, limiting their clinical utility. To address these limitations, this study proposes a hybrid framework combining Vision Transformers (ViT) and Perceiver IO, designed to enhance multi-disease classification accuracy. Vision Transformers leverage self-attention mechanisms to capture global dependencies in medical images, while Perceiver IO optimizes feature extraction for computational efficiency and precision. The framework is evaluated across three critical clinical domains: neurological disorders, including Stroke (tested on the Brain Stroke Prediction CT Scan Image Dataset) and Alzheimer's (analyzed via the Best Alzheimer MRI Dataset); skin diseases, covering Tinea (trained on the Skin Diseases Dataset) and Melanoma (augmented with dermoscopic images from the HAM10000/HAM10k dataset); and lung diseases, focusing on Lung Cancer (using the Lung Cancer Image Dataset) and Pneumonia (evaluated with the Pneumonia Dataset containing bacterial, viral, and normal X-ray cases). For neurological disorders, the model achieved 0.99 accuracy, 0.99 precision, 1.00 recall, 0.99 F1-score, demonstrating robust detection of structural brain abnormalities. In skin disease classification, it attained 0.95 accuracy, 0.93 precision, 0.97 recall, 0.95 F1-score, highlighting its ability to differentiate fine-grained textural patterns in lesions. For lung diseases, the framework achieved 0.98 accuracy, 0.97 precision, 1.00 recall, 0.98 F1-score, confirming its efficacy in identifying respiratory conditions. To bridge research and clinical practice, an AI-powered chatbot was developed for real-time analysis, enabling users to upload MRI, X-ray, or skin images for automated diagnosis with confidence scores and interpretable insights. This work represents the first application of ViT and Perceiver IO for these disease categories, outperforming conventional architectures in accuracy, computational efficiency, and clinical interpretability. The framework holds significant potential for early disease detection in healthcare settings, reducing diagnostic errors, and improving treatment outcomes for clinicians, radiologists, and patients. By addressing critical limitations of traditional models, such as overlapping feature confusion and false positives, this research advances the deployment of reliable AI tools in neurology, dermatology, and pulmonology.
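
A schematic of how a ViT patch encoder can feed a Perceiver IO-style latent bottleneck, for readers unfamiliar with the pairing; all sizes and the mean-pooled readout are illustrative, not the paper's configuration:

import torch
import torch.nn as nn

class ViTPerceiverClassifier(nn.Module):
    def __init__(self, vit, dim=768, n_latents=64, n_classes=2):
        super().__init__()
        self.vit = vit                                  # any backbone emitting patch tokens
        self.latents = nn.Parameter(torch.randn(n_latents, dim))
        self.cross_attn = nn.MultiheadAttention(dim, 8, batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, images):
        tokens = self.vit(images)                       # (B, n_patches, dim) assumed
        lat = self.latents.expand(images.size(0), -1, -1)
        # Perceiver-style read: a small latent array attends to many patch
        # tokens, keeping cost linear in the number of tokens.
        lat, _ = self.cross_attn(lat, tokens, tokens)
        return self.head(lat.mean(dim=1))               # pooled latents -> class logits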

Medical slice transformer for improved diagnosis and explainability on 3D medical images with DINOv2.

Müller-Franzes G, Khader F, Siepmann R, Han T, Kather JN, Nebelung S, Truhn D

pubmed · Jul 4, 2025
Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) are essential clinical cross-sectional imaging techniques for diagnosing complex conditions, but large annotated 3D datasets for deep learning are scarce. While methods like DINOv2 are encouraging for 2D image analysis, they have not been applied to 3D medical images, and deep learning models often lack explainability due to their "black-box" nature. This study aims to extend 2D self-supervised models, specifically DINOv2, to 3D medical imaging while evaluating their potential for explainable outcomes. We introduce the Medical Slice Transformer (MST) framework to adapt 2D self-supervised models for 3D medical image analysis. MST combines a Transformer architecture with a 2D feature extractor, i.e., DINOv2. We evaluate its diagnostic performance against a 3D convolutional neural network (3D ResNet) across three clinical datasets: breast MRI (651 patients), chest CT (722 patients), and knee MRI (1199 patients). Both methods were tested for diagnosing breast cancer, predicting lung nodule malignancy, and detecting meniscus tears. Diagnostic performance was assessed by calculating the Area Under the Receiver Operating Characteristic Curve (AUC), and explainability was evaluated through a radiologist's qualitative comparison of saliency maps in terms of slice and lesion correctness. P-values were calculated using DeLong's test. MST achieved higher AUC values than ResNet across all three datasets: breast (0.94 ± 0.01 vs. 0.91 ± 0.02, P = 0.02), chest (0.95 ± 0.01 vs. 0.92 ± 0.02, P = 0.13), and knee (0.85 ± 0.04 vs. 0.69 ± 0.05, P = 0.001). Saliency maps were consistently more precise and anatomically correct for MST than for ResNet. Self-supervised 2D models like DINOv2 can thus be effectively adapted to 3D medical imaging via MST, offering enhanced diagnostic accuracy and explainability compared with convolutional neural networks.
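
The MST pattern described above (a frozen 2D extractor per slice, then a Transformer over the slice sequence) can be sketched as follows; the depth, head count, and CLS-token pooling are assumptions:

import torch
import torch.nn as nn

class SliceTransformer(nn.Module):
    def __init__(self, extractor_2d, feat_dim=768, depth=4, n_classes=2):
        super().__init__()
        self.extractor = extractor_2d.eval()            # frozen DINOv2-style 2D encoder
        for p in self.extractor.parameters():
            p.requires_grad = False
        self.cls = nn.Parameter(torch.zeros(1, 1, feat_dim))
        layer = nn.TransformerEncoderLayer(feat_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, volume):
        # volume: (B, S, C, H, W) -> one feature vector per 2D slice
        b, s = volume.shape[:2]
        with torch.no_grad():
            feats = self.extractor(volume.flatten(0, 1)).view(b, s, -1)
        tokens = torch.cat([self.cls.expand(b, -1, -1), feats], dim=1)
        return self.head(self.encoder(tokens)[:, 0])    # classify via the CLS token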

SAMed-2: Selective Memory Enhanced Medical Segment Anything Model

Zhiling Yan, Sifan Song, Dingjie Song, Yiwei Li, Rong Zhou, Weixiang Sun, Zhennong Chen, Sekeun Kim, Hui Ren, Tianming Liu, Quanzheng Li, Xiang Li, Lifang He, Lichao Sun

arxiv preprint · Jul 4, 2025
Recent "segment anything" efforts show promise by learning from large-scale data, but adapting such models directly to medical images remains challenging due to the complexity of medical data, noisy annotations, and continual learning requirements across diverse modalities and anatomical structures. In this work, we propose SAMed-2, a new foundation model for medical image segmentation built upon the SAM-2 architecture. Specifically, we introduce a temporal adapter into the image encoder to capture image correlations and a confidence-driven memory mechanism to store high-certainty features for later retrieval. This memory-based strategy counters the pervasive noise in large-scale medical datasets and mitigates catastrophic forgetting when encountering new tasks or modalities. To train and evaluate SAMed-2, we curate MedBank-100k, a comprehensive dataset spanning seven imaging modalities and 21 medical segmentation tasks. Our experiments on both internal benchmarks and 10 external datasets demonstrate superior performance over state-of-the-art baselines in multi-task scenarios. The code is available at: https://github.com/ZhilingYan/Medical-SAM-Bench.

Multi-modality radiomics diagnosis of breast cancer based on MRI, ultrasound and mammography.

Wu J, Li Y, Gong W, Li Q, Han X, Zhang T

pubmed · Jul 4, 2025
To develop a multi-modality machine learning-based radiomics model utilizing Magnetic Resonance Imaging (MRI), Ultrasound (US), and Mammography (MMG) for differentiating benign from malignant breast nodules. This study retrospectively collected data from 204 patients across three hospitals, including MRI, US, and MMG imaging data along with confirmed pathological diagnoses. Regions of interest were outlined around lesions on 2D US, 2D MMG, and 3D MRI images and then automatically expanded outward by 3 mm, 5 mm, and 8 mm to extract radiomic features within and around the tumour. ANOVA, the maximum relevance minimum redundancy (mRMR) algorithm, and the least absolute shrinkage and selection operator (LASSO) were used to select features for breast cancer diagnosis through logistic regression analysis. The performance of the radiomics models was evaluated using receiver operating characteristic (ROC) curve analysis, decision curve analysis (DCA), and calibration curves. Among the radiomics models tested, the MRI_US_MMG multi-modality logistic regression model with 5 mm peritumoral features performed best, achieving an AUC of 0.905 (95% confidence interval [CI]: 0.805-1) in the test cohort. These results suggest that including peritumoral features, specifically at a 5 mm expansion, significantly enhanced the diagnostic efficiency of the multi-modality radiomics model in differentiating benign from malignant breast nodules. The multi-modality radiomics model based on MRI, ultrasound, and mammography can thus predict benign and malignant breast lesions.
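
The feature-selection chain (ANOVA filtering followed by a LASSO-penalized logistic model) has a direct scikit-learn analogue; the mRMR stage is omitted below because scikit-learn has no built-in implementation, and the k and C values are assumed:

from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

radiomics_clf = make_pipeline(
    StandardScaler(),                   # radiomic features vary widely in scale
    SelectKBest(f_classif, k=50),       # ANOVA filter (k is an assumption)
    LogisticRegression(penalty="l1", solver="liblinear", C=0.5),  # LASSO-style selection
)
# Fit with radiomics_clf.fit(X_train, y_train), then evaluate with ROC/AUC,
# DCA, and calibration curves as described in the paper.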