
MedSG-Bench: A Benchmark for Medical Image Sequences Grounding

Jingkun Yue, Siqi Zhang, Zinan Jia, Huihuan Xu, Zongbo Han, Xiaohong Liu, Guangyu Wang

arXiv preprint · May 17, 2025
Visual grounding is essential for precise perception and reasoning in multimodal large language models (MLLMs), especially in medical imaging domains. While existing medical visual grounding benchmarks primarily focus on single-image scenarios, real-world clinical applications often involve sequential images, where accurate lesion localization across different modalities and temporal tracking of disease progression (e.g., pre- vs. post-treatment comparison) require fine-grained cross-image semantic alignment and context-aware reasoning. To remedy the underrepresentation of image sequences in existing medical visual grounding benchmarks, we propose MedSG-Bench, the first benchmark tailored for Medical Image Sequences Grounding. It comprises eight VQA-style tasks, formulated under two grounding paradigms: 1) Image Difference Grounding, which focuses on detecting change regions across images, and 2) Image Consistency Grounding, which emphasizes detection of consistent or shared semantics across sequential images. MedSG-Bench covers 76 public datasets, 10 medical imaging modalities, and a wide spectrum of anatomical structures and diseases, totaling 9,630 question-answer pairs. We benchmark both general-purpose MLLMs (e.g., Qwen2.5-VL) and medical-domain specialized MLLMs (e.g., HuatuoGPT-vision), observing that even advanced models exhibit substantial limitations in medical sequential grounding tasks. To advance this field, we construct MedSG-188K, a large-scale instruction-tuning dataset tailored for sequential visual grounding, and further develop MedSeq-Grounder, an MLLM designed to facilitate future research on fine-grained understanding across medical sequential images. The benchmark, dataset, and model are available at https://huggingface.co/MedSG-Bench

MedVKAN: Efficient Feature Extraction with Mamba and KAN for Medical Image Segmentation

Hancan Zhu, Jinhao Chen, Guanghua He

arXiv preprint · May 17, 2025
Medical image segmentation relies heavily on convolutional neural networks (CNNs) and Transformer-based models. However, CNNs are constrained by limited receptive fields, while Transformers suffer from scalability challenges due to their quadratic computational complexity. To address these limitations, recent advances have explored alternative architectures. The state-space model Mamba offers near-linear complexity while capturing long-range dependencies, and the Kolmogorov-Arnold Network (KAN) enhances nonlinear expressiveness by replacing fixed activation functions with learnable ones. Building on these strengths, we propose MedVKAN, an efficient feature extraction model integrating Mamba and KAN. Specifically, we introduce the EFC-KAN module, which enhances KAN with convolutional operations to improve local pixel interaction. We further design the VKAN module, integrating Mamba with EFC-KAN as a replacement for Transformer modules, significantly improving feature extraction. Extensive experiments on five public medical image segmentation datasets show that MedVKAN achieves state-of-the-art performance on four datasets and ranks second on the remaining one. These results validate the potential of Mamba and KAN for medical image segmentation while introducing an innovative and computationally efficient feature extraction framework. The code is available at: https://github.com/beginner-cjh/MedVKAN.
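The KAN idea the abstract builds on is that a fixed activation like ReLU is replaced by a learnable univariate function. A minimal numpy sketch of that idea, using a hat-function basis plus a linear term; the parameterization and all names here are illustrative assumptions, not MedVKAN's actual EFC-KAN implementation:

```python
import numpy as np

def hat_basis(x, centers, width):
    """Evaluate triangular (hat) basis functions at x, one per center."""
    return np.maximum(0.0, 1.0 - np.abs(x[..., None] - centers) / width)

class LearnableActivation:
    """Toy KAN-style edge function: a learnable mixture of hat bases
    plus a linear term, standing in for a fixed activation."""
    def __init__(self, n_basis=5, lo=-2.0, hi=2.0, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = np.linspace(lo, hi, n_basis)
        self.width = (hi - lo) / (n_basis - 1)
        self.coef = rng.normal(scale=0.1, size=n_basis)  # learnable mixture weights
        self.slope = 1.0                                  # learnable linear part

    def __call__(self, x):
        return self.slope * x + hat_basis(x, self.centers, self.width) @ self.coef

act = LearnableActivation()
y = act(np.linspace(-1.0, 1.0, 5))   # a learned, non-fixed activation curve
```

In a full KAN, every edge of the network carries its own such function and the basis coefficients are trained by backpropagation alongside the rest of the model.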

Computational modeling of breast tissue mechanics and machine learning in cancer diagnostics: enhancing precision in risk prediction and therapeutic strategies.

Ashi L, Taurin S

PubMed · May 17, 2025
Breast cancer remains a significant global health issue. Despite advances in detection and treatment, its complexity is driven by genetic, environmental, and structural factors. Computational methods like Finite Element Modeling (FEM) have transformed our understanding of breast cancer risk and progression. This review focuses on advanced computational approaches in breast cancer research, with an emphasis on FEM's role in simulating breast tissue mechanics and enhancing precision in therapies such as radiofrequency ablation (RFA). Machine learning (ML), particularly Convolutional Neural Networks (CNNs), has revolutionized imaging modalities like mammograms and MRIs, improving diagnostic accuracy and early detection. AI applications in analyzing histopathological images have advanced tumor classification and grading, offering consistency and reducing inter-observer variability. Explainability tools like Grad-CAM, SHAP, and LIME enhance the transparency of AI-driven models, facilitating their integration into clinical workflows. Integrating FEM and ML represents a paradigm shift in breast cancer management. FEM offers precise modeling of tissue mechanics, while ML excels in predictive analytics and image analysis. Despite challenges such as data variability and limited standardization, synergizing these approaches promises adaptive, personalized care. These computational methods have the potential to redefine diagnostics, optimize treatment, and improve patient outcomes.

Diagnostic challenges of carpal tunnel syndrome in patients with congenital thenar hypoplasia: a comprehensive review.

Naghizadeh H, Salkhori O, Akrami S, Khabiri SS, Arabzadeh A

PubMed · May 16, 2025
Carpal Tunnel Syndrome (CTS) is the most common entrapment neuropathy, frequently presenting with pain, numbness, and muscle weakness due to median nerve compression. However, diagnosing CTS becomes particularly challenging in patients with Congenital Thenar Hypoplasia (CTH), a rare congenital anomaly characterized by underdeveloped thenar muscles. The overlapping symptoms of CTH and CTS, such as thumb weakness, impaired hand function, and thenar muscle atrophy, can obscure the identification of median nerve compression. This review highlights the diagnostic complexities arising from this overlap and evaluates existing clinical, imaging, and electrophysiological assessment methods. While traditional diagnostic tests, including Phalen's and Tinel's signs, exhibit limited sensitivity in CTH patients, advanced imaging modalities like ultrasonography (US), magnetic resonance imaging (MRI), and diffusion tensor imaging (DTI) provide valuable insights into structural abnormalities. Additionally, emerging technologies such as artificial intelligence (AI) enhance diagnostic precision by automating imaging analysis and identifying subtle nerve alterations. An interdisciplinary approach that combines clinical history, functional assessments, and advanced imaging is critical to accurately differentiate CTH-related anomalies from CTS. This comprehensive review underscores the need for tailored diagnostic protocols to improve early detection, personalised management, and outcomes for this unique patient population.

Uncertainty quantification for deep learning-based metastatic lesion segmentation on whole body PET/CT.

Schott B, Santoro-Fernandes V, Klanecek Z, Perlman S, Jeraj R

PubMed · May 16, 2025
Deep learning models are increasingly being implemented for automated medical image analysis to inform patient care. Most models, however, lack uncertainty information, without which the reliability of model outputs cannot be ensured. Several uncertainty quantification (UQ) methods exist to capture model uncertainty. Yet, it is not clear which method is optimal for a given task. The purpose of this work was to investigate several commonly used UQ methods for the critical yet understudied task of metastatic lesion segmentation on whole body PET/CT. 
Approach:
59 whole body 68Ga-DOTATATE PET/CT images of patients undergoing theranostic treatment of metastatic neuroendocrine tumors were used in this work. A 3D U-Net was trained for lesion segmentation following five-fold cross-validation. Uncertainty measures derived from four UQ methods (probability entropy, Monte Carlo dropout, deep ensembles, and test time augmentation) were investigated. Each uncertainty measure was assessed across four quantitative evaluations: (1) its ability to detect artificially degraded image data at low, medium, and high degradation magnitudes; (2) to detect false-positive (FP) predicted regions; (3) to recover false-negative (FN) predicted regions; and (4) to establish correlations with model biomarker extraction and segmentation performance metrics.
Results: Probability entropy and test time augmentation respectively achieved the lowest and highest degraded-image detection performance at low (AUC=0.54 vs. 0.68), medium (AUC=0.70 vs. 0.82), and high (AUC=0.83 vs. 0.90) degradation magnitudes. For detecting FPs, all UQ methods achieved strong performance, with AUC values ranging narrowly between 0.77 and 0.81. FN region recovery performance was strongest for test time augmentation and weakest for probability entropy. Performance in the correlation analysis was mixed: the strongest performance was achieved by test time augmentation for SUVtotal capture (ρ=0.57) and segmentation Dice coefficient (ρ=0.72), by Monte Carlo dropout for SUVmean capture (ρ=0.35), and by probability entropy for segmentation cross entropy (ρ=0.96).
Significance: Overall, test time augmentation demonstrated superior uncertainty quantification performance and is recommended for use in the metastatic lesion segmentation task. It also offers the advantage of being post hoc and computationally efficient. In contrast, probability entropy performed the worst, highlighting the need for advanced UQ approaches for this task.
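Two of the four UQ methods compared above can be sketched in a few lines: voxelwise predictive entropy of the network's foreground probability, and test time augmentation, which realigns predictions made under random flips and takes their spread as uncertainty. The toy "model" below is a stand-in for the trained 3D U-Net; everything here is an illustrative assumption, not the paper's code:

```python
import numpy as np

def entropy_uncertainty(p):
    """Voxelwise binary predictive entropy of foreground probability p."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def tta_uncertainty(model, image, n_aug=8, seed=0):
    """Test time augmentation: predict under random flips, undo each flip
    to realign, then use the per-voxel std of predictions as uncertainty."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_aug):
        axes = tuple(ax for ax in range(image.ndim) if rng.random() < 0.5)
        aug = np.flip(image, axis=axes) if axes else image
        p = model(aug)
        preds.append(np.flip(p, axis=axes) if axes else p)  # undo the flip
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.std(axis=0)

# stand-in segmentation "model": sigmoid of a shifted neighborhood intensity,
# deliberately not flip-equivariant so the TTA spread is non-trivial
model = lambda img: 1 / (1 + np.exp(-(np.roll(img, 1, axis=0) - 0.5) * 10))
img = np.random.default_rng(1).random((8, 8))
mean_p, unc = tta_uncertainty(model, img)
```

Deep ensembles and Monte Carlo dropout follow the same pattern, replacing the flip loop with predictions from independently trained models or stochastic forward passes.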

Deep learning predicts HER2 status in invasive breast cancer from multimodal ultrasound and MRI.

Fan Y, Sun K, Xiao Y, Zhong P, Meng Y, Yang Y, Du Z, Fang J

PubMed · May 16, 2025
The preoperative human epidermal growth factor receptor type 2 (HER2) status of breast cancer is typically determined by pathological examination of a core needle biopsy, which influences the efficacy of neoadjuvant chemotherapy (NAC). However, the highly heterogeneous nature of breast cancer and the limitations of needle aspiration biopsy increase the instability of pathological evaluation. The aim of this study was to predict HER2 status in preoperative breast cancer using deep learning (DL) models based on ultrasound (US) and magnetic resonance imaging (MRI). The study included women with invasive breast cancer who underwent US and MRI at our institution between January 2021 and July 2024. US images and dynamic contrast-enhanced T1-weighted MRI images were used to construct DL models (DL-US: the DL model based on US; DL-MRI: the model based on MRI; and DL-MRI&US: the combined model based on both MRI and US). All classifications were based on postoperative pathological evaluation. Receiver operating characteristic analysis and the DeLong test were used to compare the diagnostic performance of the DL models. In the test cohort, DL-US differentiated the HER2 status of breast cancer with an AUC of 0.842 (95% CI: 0.708-0.931), and sensitivity and specificity of 89.5% and 79.3%, respectively. DL-MRI achieved an AUC of 0.800 (95% CI: 0.660-0.902), with sensitivity and specificity of 78.9% and 79.3%, respectively. DL-MRI&US yielded an AUC of 0.898 (95% CI: 0.777-0.967), with sensitivity and specificity of 63.2% and 100.0%, respectively.
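The AUCs reported above are equivalent to the Mann-Whitney probability that a HER2-positive case scores higher than a HER2-negative one, which is also the statistic the DeLong test is built on. A small self-contained sketch with made-up scores (not the study's data):

```python
import numpy as np

def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as P(score_pos > score_neg), counting ties as one half."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return float((pos > neg).mean() + 0.5 * (pos == neg).mean())

def sens_spec(y_true, y_score, threshold):
    """Sensitivity and specificity after thresholding the scores."""
    y_pred = np.asarray(y_score) >= threshold
    y_true = np.asarray(y_true).astype(bool)
    sens = (y_pred & y_true).sum() / y_true.sum()
    spec = (~y_pred & ~y_true).sum() / (~y_true).sum()
    return float(sens), float(spec)

# hypothetical model scores for HER2-positive and HER2-negative cases
pos_scores = [0.9, 0.8, 0.7, 0.4]
neg_scores = [0.3, 0.5, 0.2, 0.1]
auc = auc_mann_whitney(pos_scores, neg_scores)                    # 0.9375
sens, spec = sens_spec([1]*4 + [0]*4, pos_scores + neg_scores, 0.6)
```

Sweeping the threshold traces the ROC curve; the sensitivity/specificity trade-off visible in DL-MRI&US (63.2% vs. 100.0%) is exactly a choice of operating point on that curve.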

A monocular endoscopic image depth estimation method based on a window-adaptive asymmetric dual-branch Siamese network.

Chong N, Yang F, Wei K

PubMed · May 15, 2025
Minimally invasive surgery involves entering the body through small incisions or natural orifices, using a medical endoscope for observation and clinical procedures. However, traditional endoscopic images often suffer from low texture and uneven illumination, which can negatively impact surgical and diagnostic outcomes. To address these challenges, many researchers have applied deep learning methods to enhance the processing of endoscopic images. This paper proposes a monocular medical endoscopic image depth estimation method based on a window-adaptive asymmetric dual-branch Siamese network. In this network, one branch focuses on processing global image information, while the other branch concentrates on local details. An improved lightweight Squeeze-and-Excitation (SE) module is added to the final layer of each branch, dynamically adjusting the inter-channel weights through self-attention. The outputs from both branches are then integrated using a lightweight cross-attention feature fusion module, enabling cross-branch feature interaction and enhancing the overall feature representation capability of the network. Extensive ablation and comparative experiments were conducted on medical datasets (EAD2019, Hamlyn, M2caiSeg, UCL) and a non-medical dataset (NYUDepthV2), with both qualitative and quantitative results (measured in terms of RMSE, AbsRel, FLOPs, and running time) demonstrating the superiority of the proposed model. Additionally, comparisons with CT images show good organ boundary matching capability, highlighting the potential of our method for clinical applications. The key code of this paper is available at: https://github.com/superchongcnn/AttenAdapt_DE.
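The squeeze-and-excitation idea the paper improves on is compact: globally average-pool each channel, pass the result through a small bottleneck, and use sigmoid gates to reweight the channels. A plain numpy sketch of the generic SE block (weights and sizes are arbitrary; this is not the paper's improved lightweight variant):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def se_block(feat, w1, w2):
    """Squeeze-and-Excitation: squeeze (global average pool per channel),
    excite (bottleneck FC, ReLU, FC, sigmoid), then rescale channels.
    feat: (C, H, W); w1: (C//r, C); w2: (C, C//r) with reduction ratio r."""
    squeeze = feat.mean(axis=(1, 2))            # (C,) channel descriptors
    hidden = np.maximum(0.0, w1 @ squeeze)      # bottleneck + ReLU
    gates = sigmoid(w2 @ hidden)                # per-channel gates in (0, 1)
    return feat * gates[:, None, None]

rng = np.random.default_rng(0)
C, r = 8, 4
feat = rng.random((C, 6, 6))
w1 = rng.normal(size=(C // r, C))
w2 = rng.normal(size=(C, C // r))
out = se_block(feat, w1, w2)                    # same shape, channels reweighted
```

Because every gate lies in (0, 1), the block can only attenuate channels, never amplify them; the network learns which channels to keep.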

Accuracy and Reliability of Multimodal Imaging in Diagnosing Knee Sports Injuries.

Zhu D, Zhang Z, Li W

PubMed · May 15, 2025
Due to differences in subjective experience and professional level among doctors, as well as inconsistent diagnostic criteria, there are issues with the accuracy and reliability of single-modality imaging diagnosis of knee joint injuries. To address these issues, this article combines magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound (US) in an ensemble-learning framework with deep learning (DL) for automatic analysis. Through steps such as image enhancement, noise elimination, and tissue segmentation, image quality is improved, and convolutional neural networks (CNNs) are then used to automatically identify and classify injury types. The experimental results show that the DL model exhibits high sensitivity and specificity in the diagnosis of different types of injuries, such as anterior cruciate ligament tear, meniscus injury, cartilage injury, and fracture. The diagnostic accuracy of anterior cruciate ligament tear exceeds 90%, and the highest diagnostic accuracy of cartilage injury reaches 95.80%. In addition, compared with traditional manual image interpretation, the DL model has significant advantages in time efficiency, with a significant reduction in average interpretation time per case. The diagnostic consistency experiment shows that the DL model has high consistency with doctors' diagnosis results, with an overall error rate of less than 2%. The model has high accuracy and strong generalization ability when dealing with different types of joint injuries. These data indicate that combining multiple imaging technologies and the DL algorithm can effectively improve the accuracy and efficiency of diagnosing sports injuries of knee joints.
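Combining MRI, CT, and US predictions by ensemble learning can be as simple as weighted soft voting over each modality network's class probabilities. A toy sketch of that fusion step; the weights, class list, and probability values are invented for illustration:

```python
import numpy as np

def late_fusion(prob_mri, prob_ct, prob_us, weights=(0.5, 0.3, 0.2)):
    """Weighted soft voting over per-modality class probabilities.
    Each prob_* is (n_cases, n_classes); weights sum to 1 (hypothetical)."""
    stacked = np.stack([prob_mri, prob_ct, prob_us])   # (3, n_cases, n_classes)
    w = np.asarray(weights)[:, None, None]
    fused = (stacked * w).sum(axis=0)                  # weighted average
    return fused, fused.argmax(axis=1)                 # fused probs + labels

# toy probabilities over 4 injury classes (ACL tear, meniscus, cartilage, fracture)
mri = np.array([[0.7, 0.1, 0.1, 0.1]])
ct  = np.array([[0.4, 0.2, 0.2, 0.2]])
us  = np.array([[0.5, 0.3, 0.1, 0.1]])
fused, labels = late_fusion(mri, ct, us)
```

Because every modality puts most mass on the first class, the fused prediction is ACL tear; with disagreeing modalities, the weights decide how much each vote counts.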

Performance of Artificial Intelligence in Diagnosing Lumbar Spinal Stenosis: A Systematic Review and Meta-Analysis.

Yang X, Zhang Y, Li Y, Wu Z

PubMed · May 15, 2025
The present study followed the reporting guidelines for systematic reviews and meta-analyses. We conducted this study to review the diagnostic value of artificial intelligence (AI) for various types of lumbar spinal stenosis (LSS) and the level of stenosis, offering evidence-based support for the development of smart diagnostic tools. AI is currently being utilized for image processing in clinical practice, and some studies have explored AI techniques for identifying the severity of LSS in recent years; nevertheless, there remains a shortage of structured data proving its effectiveness. Four databases (PubMed, Cochrane, Embase, and Web of Science) were searched until March 2024, including original studies that utilized deep learning (DL) and machine learning (ML) models to diagnose LSS. The risk of bias of included studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool. The accuracy in the validation set was extracted for a meta-analysis, which was completed in R 4.4.0. A total of 48 articles were included, with an overall accuracy of 0.885 (95% CI: 0.860-0.907) for dichotomous tasks. Among them, the accuracy was 0.892 (95% CI: 0.867-0.915) for DL and 0.833 (95% CI: 0.760-0.895) for ML. The overall accuracy for LSS was 0.895 (95% CI: 0.858-0.927), with an accuracy of 0.912 (95% CI: 0.873-0.944) for DL and 0.843 (95% CI: 0.766-0.907) for ML. The overall accuracy for central canal stenosis was 0.875 (95% CI: 0.821-0.920), with an accuracy of 0.881 (95% CI: 0.829-0.925) for DL and 0.733 (95% CI: 0.541-0.877) for ML. The overall accuracy for neural foramen stenosis was 0.893 (95% CI: 0.851-0.928).
In polytomous tasks, the accuracy was 0.936 (95% CI: 0.895-0.967) for no LSS, 0.503 (95% CI: 0.391-0.614) for mild LSS, 0.512 (95% CI: 0.336-0.688) for moderate LSS, and 0.860 (95% CI: 0.733-0.954) for severe LSS. AI is highly valuable for diagnosing LSS. However, further external validation is necessary to enhance the analysis of different stenosis categories and improve the diagnostic accuracy for mild to moderate stenosis levels.
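Pooling per-study accuracies of this kind is typically done on the logit scale with inverse-variance weights. The helper below is a simplified fixed-effect sketch with invented study counts, not the random-effects model or data used in this review:

```python
import math

def pool_accuracy(studies):
    """Fixed-effect inverse-variance pooling of accuracies on the logit scale.
    studies: list of (n_correct, n_total) pairs (hypothetical counts)."""
    num = den = 0.0
    for k, n in studies:
        p = (k + 0.5) / (n + 1.0)                  # continuity correction
        logit = math.log(p / (1 - p))
        var = 1 / (k + 0.5) + 1 / (n - k + 0.5)    # approx. variance of logit
        num += logit / var                          # weight = 1 / variance
        den += 1 / var
    pooled_logit = num / den
    se = math.sqrt(1 / den)
    inv = lambda z: 1 / (1 + math.exp(-z))          # back-transform to [0, 1]
    return inv(pooled_logit), (inv(pooled_logit - 1.96 * se),
                               inv(pooled_logit + 1.96 * se))

est, (lo, hi) = pool_accuracy([(90, 100), (85, 100), (88, 100)])
```

The logit transform keeps the pooled estimate and its confidence interval inside [0, 1], which a naive weighted mean of proportions does not guarantee near the boundaries.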

A Deep-Learning Framework for Ovarian Cancer Subtype Classification Using Whole Slide Images.

Wang C, Yi Q, Aflakian A, Ye J, Arvanitis T, Dearn KD, Hajiyavand A

PubMed · May 15, 2025
Ovarian cancer, a leading cause of cancer-related deaths among women, comprises distinct subtypes, each requiring different treatment approaches. This paper presents a deep-learning framework for classifying ovarian cancer subtypes using Whole Slide Imaging (WSI). Our method comprises three stages: image tiling, feature extraction, and multi-instance learning. Our approach is trained and validated on a public dataset from 80 distinct patients, achieving up to 89.8% accuracy with a notable improvement in computational efficiency. The results demonstrate the potential of our framework to augment diagnostic precision in clinical settings, offering a scalable solution for the accurate classification of ovarian cancer subtypes.
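The three stages (tiling, feature extraction, multi-instance learning) can be sketched end to end with a mean-pooling MIL aggregator. The embedder and classifier below are trivial stand-ins for the framework's learned networks, and the decision rule is hypothetical:

```python
import numpy as np

def tile_image(img, tile=64):
    """Split an (H, W) slide-like array into non-overlapping tile x tile patches."""
    H, W = img.shape
    return [img[r:r + tile, c:c + tile]
            for r in range(0, H - tile + 1, tile)
            for c in range(0, W - tile + 1, tile)]

def mil_predict(tiles, embed, classify):
    """Multi-instance learning with mean pooling: embed every tile,
    average the embeddings into one slide-level feature, classify once."""
    feats = np.stack([embed(t) for t in tiles])
    return classify(feats.mean(axis=0))

# trivial stand-ins for the learned tile embedder and subtype classifier
embed = lambda t: np.array([t.mean(), t.std()])
classify = lambda f: int(f[0] > 0.4)  # hypothetical binary decision rule

slide = np.random.default_rng(0).random((256, 256))
tiles = tile_image(slide)
pred = mil_predict(tiles, embed, classify)
```

Attention-based MIL replaces the plain mean with learned per-tile weights, so diagnostically informative tiles dominate the slide-level feature.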