Latest Papers on Radiology AI. Tags: Benchmark SOTA

Optimized YOLOv8 for enhanced breast tumor segmentation in ultrasound imaging.

Mostafa AM, Alaerjan AS, Aldughayfiq B, Allahem H, Mahmoud AA, Said W, Shabana H, Ezz M

•papers•Jun 19 2025

Breast cancer significantly affects people's health globally, making early and accurate diagnosis vital. While ultrasound imaging is safe and non-invasive, its manual interpretation is subjective. This study explores machine learning (ML) techniques to improve breast ultrasound image segmentation, comparing models trained on combined versus separate classes of benign and malignant tumors. The YOLOv8 object detection algorithm is applied to the image segmentation task, aiming to capitalize on its robust feature detection capabilities. We utilized a dataset of 780 ultrasound images categorized into benign and malignant classes to train several deep learning (DL) models: UNet, UNet with DenseNet-121, VGG16, VGG19, and an adapted YOLOv8. These models were evaluated in two experimental setups: training on a combined dataset and training on separate datasets for benign and malignant classes. Performance metrics such as Dice Coefficient, Intersection over Union (IoU), and mean Average Precision (mAP) were used to assess model effectiveness. The study demonstrated substantial improvements in model performance when trained on separate classes, with the UNet model's F1-score increasing from 77.80 to 84.09% and Dice Coefficient from 75.58 to 81.17%, and the adapted YOLOv8 model achieving an F1-score improvement from 93.44 to 95.29% and Dice Coefficient from 82.10 to 84.40%. These results highlight the advantage of specialized model training and the potential of using advanced object detection algorithms for segmentation tasks. This research underscores the significant potential of using specialized training strategies and innovative model adaptations in medical imaging segmentation, ultimately contributing to better patient outcomes.
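
The Dice Coefficient and IoU named in this abstract can be illustrated with a minimal sketch. It assumes binary masks stored as nested lists of 0/1; the function name `dice_and_iou` and the toy masks are hypothetical and are not taken from the paper.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for a, b in zip(p, t) if a and b)  # pixels set in both masks
    p_sum = sum(1 for a in p if a)
    t_sum = sum(1 for b in t if b)
    union = p_sum + t_sum - inter
    if union == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0, 1.0
    dice = 2 * inter / (p_sum + t_sum)
    iou = inter / union
    return dice, iou


# Toy 3x3 masks (illustrative only).
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 0],
          [0, 1, 1],
          [0, 0, 0]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.3f} IoU={iou:.3f}")
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.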

Ultrasound Segmentation Breast Methodology In Silico Academic Lab Benchmark SOTA

Multitask Deep Learning for Automated Segmentation and Prognostic Stratification of Endometrial Cancer via Biparametric MRI.

Yan R, Zhang X, Cao Q, Xu J, Chen Y, Qin S, Zhang S, Zhao W, Xing X, Yang W, Lang N

•papers•Jun 19 2025

Endometrial cancer (EC) is a common gynecologic malignancy; accurate assessment of key prognostic factors is important for treatment planning. To develop a deep learning (DL) framework based on biparametric MRI for automated segmentation and multitask classification of EC key prognostic factors, including grade, stage, histological subtype, lymphovascular space invasion (LVSI), and deep myometrial invasion (DMI). Retrospective. A total of 325 patients with histologically confirmed EC were included: 211 training, 54 validation, and 60 test cases. T2-weighted imaging (T2WI, FSE/TSE) and diffusion-weighted imaging (DWI, SS-EPI) sequences at 1.5 and 3 T. The DL model comprised tumor segmentation and multitask classification. Manual delineation on T2WI and DWI acted as the reference standard for segmentation. Separate models were trained using T2WI alone, DWI alone and combined T2WI + DWI to classify dichotomized key prognostic factors. Performance was assessed in validation and test cohorts. For DMI, the combined model's performance was compared with visual assessment by four radiologists (with 1, 4, 7, and 20 years' experience), each of whom independently reviewed all cases. Segmentation was evaluated using the dice similarity coefficient (DSC), Jaccard similarity coefficient (JSC), Hausdorff distance (HD95), and average surface distance (ASD). Classification performance was assessed using area under the receiver operating characteristic curve (AUC). Model AUCs were compared using DeLong's test. p < 0.05 was considered significant. In the test cohort, DSCs were 0.80 (T2WI) and 0.78 (DWI) and JSCs were 0.69 for both. HD95 and ASD were 7.02/1.71 mm (T2WI) versus 10.58/2.13 mm (DWI). The classification framework achieved AUCs of 0.78-0.94 (validation) and 0.74-0.94 (test). For DMI, the combined model performed comparably to radiologists (p = 0.07-0.84).
The unified DL framework demonstrates strong EC segmentation and classification performance, with high accuracy across multiple tasks. Evidence level: 3. Technical efficacy: Stage 3.

MRI Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

A fusion-based deep-learning algorithm predicts PDAC metastasis based on primary tumour CT images: a multinational study.

Xue N, Sabroso-Lasa S, Merino X, Munzo-Beltran M, Schuurmans M, Olano M, Estudillo L, Ledesma-Carbayo MJ, Liu J, Fan R, Hermans JJ, van Eijck C, Malats N

•papers•Jun 19 2025

Diagnosing the presence of metastasis of pancreatic cancer is pivotal for patient management and treatment, with contrast-enhanced CT scans (CECT) as the cornerstone of diagnostic evaluation. However, this diagnostic modality requires a multifaceted approach. To develop a convolutional neural network (CNN)-based model (PMPD, Pancreatic cancer Metastasis Prediction Deep-learning algorithm) to predict the presence of metastases based on CECT images of the primary tumour. CECT images in the portal venous phase of 335 patients with pancreatic ductal adenocarcinoma (PDAC) from the PanGenEU study and The First Affiliated Hospital of Zhengzhou University (ZZU) were randomly divided into training and internal validation sets by applying fivefold cross-validation. Two independent external validation datasets of 143 patients from the Radboud University Medical Center (RUMC), included in the PANCAIM study (RUMC-PANCAIM) and 183 patients from the PREOPANC trial of the Dutch Pancreatic Cancer Group (PREOPANC-DPCG) were used to evaluate the results. The area under the receiver operating characteristic curve (AUROC) for the internally tested model was 0.895 (0.853-0.937) and 0.779 (0.741-0.817) in the PanGenEU and ZZU sets, respectively. In the external validation sets, the mean AUROC was 0.806 (0.787-0.826) for the RUMC-PANCAIM and 0.761 (0.717-0.804) for the PREOPANC-DPCG. When stratified by the different metastasis sites, the PMPD model achieved the average AUROC between 0.901-0.927 in PanGenEU, 0.782-0.807 in ZZU and 0.761-0.820 in PREOPANC-DPCG sets. A PMPD-derived Metastasis Risk Score (MRS) (HR: 2.77, 95% CI 1.99 to 3.86, p=1.59e-09) outperformed the Resectability status from the National Comprehensive Cancer Network guideline and the CA19-9 biomarker in predicting overall survival. Meanwhile, the MRS could potentially predict developed metastasis (AUROC: 0.716 for within 3 months, 0.645 for within 6 months). 
This study represents a pioneering utilisation of a high-performance deep-learning model to predict extrapancreatic organ metastasis in patients with PDAC.
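
The AUROC values reported in this abstract have a simple pairwise interpretation, sketched below: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `auroc` and the toy labels and scores are hypothetical; this is not the PMPD evaluation code, and it does not compute confidence intervals.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = metastasis present, 0 = absent (illustrative only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = auroc(labels, scores)
print(f"AUROC={auc:.3f}")
```

The quadratic loop is fine for a sketch; for large cohorts a rank-based implementation gives the same value more efficiently.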

CT Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Implicit neural representations for accurate estimation of the standard model of white matter

Tom Hendriks, Gerrit Arends, Edwin Versteeg, Anna Vilanova, Maxime Chamberland, Chantal M. W. Tax

•preprint•Jun 18 2025

Diffusion magnetic resonance imaging (dMRI) enables non-invasive investigation of tissue microstructure. The Standard Model (SM) of white matter aims to disentangle dMRI signal contributions from intra- and extra-axonal water compartments. However, due to the model's high-dimensional nature, extensive acquisition protocols with multiple b-values and diffusion tensor shapes are typically required to mitigate parameter degeneracies. Even then, accurate estimation remains challenging due to noise. This work introduces a novel estimation framework based on implicit neural representations (INRs), which incorporate spatial regularization through the sinusoidal encoding of the input coordinates. The INR method is evaluated on both synthetic and in vivo datasets and compared to parameter estimates using cubic polynomials, supervised neural networks, and nonlinear least squares. Results demonstrate superior accuracy of the INR method in estimating SM parameters, particularly in low signal-to-noise conditions. Additionally, spatial upsampling of the INR can represent the underlying dataset anatomically plausibly in a continuous way, which is unattainable with linear or cubic interpolation. The INR is fully unsupervised, eliminating the need for labeled training data. It achieves fast inference (~6 minutes), is robust to both Gaussian and Rician noise, supports joint estimation of SM kernel parameters and the fiber orientation distribution function with spherical harmonics orders up to at least 8 and non-negativity constraints, and accommodates spatially varying acquisition protocols caused by magnetic gradient non-uniformities. The combination of these properties, along with the possibility to easily adapt the framework to other dMRI models, positions INRs as a potentially important tool for analyzing and interpreting diffusion MRI data.
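
As a rough illustration of what "sinusoidal encoding of the input coordinates" can look like, the sketch below maps a voxel coordinate to sine and cosine features at doubling frequencies. This is a generic Fourier-feature style encoding chosen as an assumption; the abstract does not specify the frequencies or whether the authors instead use sinusoidal activations, and the name `sinusoidal_encoding` and the parameter `num_frequencies` are hypothetical.

```python
import math


def sinusoidal_encoding(coord, num_frequencies=4):
    """Encode a coordinate tuple (values roughly in [-1, 1]) as sin/cos features.

    Each coordinate contributes 2 * num_frequencies features, using
    frequencies pi, 2*pi, 4*pi, ... (an assumed choice for this sketch).
    """
    features = []
    for x in coord:
        for k in range(num_frequencies):
            freq = (2 ** k) * math.pi
            features.append(math.sin(freq * x))
            features.append(math.cos(freq * x))
    return features


# A 3D voxel coordinate normalized to [-1, 1] (illustrative values).
enc = sinusoidal_encoding((0.0, 0.5, -0.25), num_frequencies=4)
print(len(enc))
```

In an INR, a small network takes such features as input and outputs the model parameters at that location. Capping the highest frequency limits how quickly the output can vary in space, which is one way an encoding like this can act as a spatial regularizer.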

MRI Reconstruction Neurological Methodology In Silico Academic Lab Benchmark SOTA GenAI

D2Diff : A Dual Domain Diffusion Model for Accurate Multi-Contrast MRI Synthesis

Sanuwani Dayarathna, Himashi Peiris, Kh Tohidul Islam, Tien-Tsin Wong, Zhaolin Chen

•preprint•Jun 18 2025

Multi-contrast MRI synthesis is inherently challenging due to the complex and nonlinear relationships among different contrasts. Each MRI contrast highlights unique tissue properties, but their complementary information is difficult to exploit due to variations in intensity distributions and contrast-specific textures. Existing methods for multi-contrast MRI synthesis primarily utilize spatial domain features, which capture localized anatomical structures but struggle to model global intensity variations and distributed patterns. Conversely, frequency domain features provide structured inter-contrast correlations but lack spatial precision, limiting their ability to retain finer details. To address this, we propose a dual-domain learning framework that integrates spatial and frequency domain information across multiple MRI contrasts for enhanced synthesis. Our method employs two mutually trained denoising networks, one conditioned on spatial domain and the other on frequency domain contrast features through a shared critic network. Additionally, an uncertainty-driven mask loss directs the models' focus toward more critical regions, further improving synthesis accuracy. Extensive experiments show that our method outperforms SOTA baselines, and the downstream segmentation performance highlights the diagnostic value of the synthetic results.

MRI Image Synthesis Methodology In Silico Academic Lab Benchmark SOTA

Multimodal Large Language Models for Medical Report Generation via Customized Prompt Tuning

Chunlei Li, Jingyang Hou, Yilei Shi, Jingliang Hu, Xiao Xiang Zhu, Lichao Mou

•preprint•Jun 18 2025

Medical report generation from imaging data remains a challenging task in clinical practice. While large language models (LLMs) show great promise in addressing this challenge, their effective integration with medical imaging data still deserves in-depth exploration. In this paper, we present MRG-LLM, a novel multimodal large language model (MLLM) that combines a frozen LLM with a learnable visual encoder and introduces a dynamic prompt customization mechanism. Our key innovation lies in generating instance-specific prompts tailored to individual medical images through conditional affine transformations derived from visual features. We propose two implementations: prompt-wise and promptbook-wise customization, enabling precise and targeted report generation. Extensive experiments on IU X-ray and MIMIC-CXR datasets demonstrate that MRG-LLM achieves state-of-the-art performance in medical report generation. Our code will be made publicly available.
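
The idea of instance-specific prompts produced by "conditional affine transformations derived from visual features" can be sketched as a feature-conditioned scale and shift applied to each prompt embedding. Everything below is a hypothetical illustration: the abstract does not give the architecture, so the single linear layers, the helper names `linear` and `customize_prompt`, and all weights are assumptions rather than MRG-LLM's implementation.

```python
def linear(vec, weights, bias):
    """Dense layer: `weights` holds one row of input weights per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def customize_prompt(prompt, visual_feat, w_scale, b_scale, w_shift, b_shift):
    """Return scale * token + shift for every prompt token.

    The scale and shift vectors are predicted from the image's visual
    features, so each image gets its own version of the shared prompt.
    """
    scale = linear(visual_feat, w_scale, b_scale)
    shift = linear(visual_feat, w_shift, b_shift)
    return [[s * t + h for s, t, h in zip(scale, token, shift)]
            for token in prompt]


# Two learnable prompt tokens with embedding dimension 2 (toy values).
prompt = [[1.0, 2.0], [3.0, 4.0]]
visual_feat = [0.5, -1.0, 2.0]          # pooled image features (toy values)
w_scale = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b_scale = [1.0, 1.0]
w_shift = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
b_shift = [0.0, 0.5]

customized = customize_prompt(prompt, visual_feat,
                              w_scale, b_scale, w_shift, b_shift)
print(customized)
```

Here one scale and shift pair is shared by all prompt tokens. How the paper's prompt-wise and promptbook-wise variants differ is not described in the abstract, so that distinction is not modeled.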

X-Ray Report Generation Chest Methodology In Silico Academic Lab Open Code Benchmark SOTA

Applying a multi-task and multi-instance framework to predict axillary lymph node metastases in breast cancer.

Li Y, Chen Z, Ding Z, Mei D, Liu Z, Wang J, Tang K, Yi W, Xu Y, Liang Y, Cheng Y

•papers•Jun 18 2025

Deep learning (DL) models have shown promise in predicting axillary lymph node (ALN) status. However, most existing DL models were classification-only models and did not consider the practical application scenarios of multi-view joint prediction. Here, we propose a Multi-Task Learning (MTL) and Multi-Instance Learning (MIL) framework that simulates the real-world clinical diagnostic scenario for ALN status prediction in breast cancer. Ultrasound images of the primary tumor and ALN (if available) regions were collected, each annotated with a segmentation label. The model was trained on a training cohort and tested on both internal and external test cohorts. The proposed two-stage DL framework, using one of the Transformer models, Segformer, as the network backbone, was the top-performing model. It achieved an AUC of 0.832, a sensitivity of 0.815, and a specificity of 0.854 in the internal test cohort. In the external cohort, this model attained an AUC of 0.918, a sensitivity of 0.851 and a specificity of 0.957. The Class Activation Mapping method demonstrated that the DL model correctly identified the characteristic areas of metastasis within the primary tumor and ALN regions. This framework may serve as an effective second reader to assist clinicians in ALN status assessment.
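
Sensitivity and specificity, as reported in this abstract, depend on the decision threshold applied to the model's output probability. The sketch below shows the computation for one threshold; the function name `sensitivity_specificity`, the 0.5 threshold, and the toy data are hypothetical, and the abstract does not state which operating point the authors used.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) after thresholding the scores."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy patient-level data: 1 = ALN metastasis, 0 = none (illustrative only).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
sens, spec = sensitivity_specificity(labels, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Lowering the threshold raises sensitivity and lowers specificity, so a single pair of values describes only one point on the ROC curve.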

Ultrasound Classification Breast Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Deep Learning-Based Adrenal Gland Volumetry for the Prediction of Diabetes.

Ku EJ, Yoon SH, Park SS, Yoon JW, Kim JH

•papers•Jun 18 2025

The long-term association between adrenal gland volume (AGV) and type 2 diabetes (T2D) remains unclear. We aimed to determine the association between deep learning-based AGV and current glycemic status and incident T2D. In this observational study, adults who underwent abdominopelvic computed tomography (CT) for health checkups (2011-2012), but had no adrenal nodules, were included. AGV was measured from CT images using a three-dimensional nnU-Net deep learning algorithm. We assessed the association between AGV and T2D using a cross-sectional and longitudinal design. We used 500 CT scans (median age, 52.3 years; 253 men) for model development and a Multi-Atlas Labeling Beyond the Cranial Vault dataset for external testing. A clinical cohort included a total of 9,708 adults (median age, 52.0 years; 5,769 men). The deep learning model demonstrated a dice coefficient of 0.71 ± 0.11 for adrenal segmentation and a mean volume difference of 0.6 ± 0.9 mL in the external dataset. Participants with T2D at baseline had a larger AGV than those without (7.3 cm³ vs. 6.7 cm³ and 6.3 cm³ vs. 5.5 cm³ for men and women, respectively, all P<0.05). The optimal AGV cutoff values for predicting T2D were 7.2 cm³ in men and 5.5 cm³ in women. Over a median 7.0-year follow-up, T2D developed in 938 participants. Cumulative T2D risk was accentuated with high AGV compared with low AGV (adjusted hazard ratio, 1.27; 95% confidence interval, 1.11 to 1.46). AGV, measured using deep learning algorithms, is associated with current glycemic status and can significantly predict the development of T2D.
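
Once a segmentation network has produced a binary organ mask, the volume follows from the voxel count and the voxel spacing. The sketch below shows that arithmetic under stated assumptions: the mask is a nested list indexed as slice, row, column, the spacing is given in millimetres, and the name `mask_volume_ml` and the toy values are hypothetical rather than taken from the study's pipeline.

```python
def mask_volume_ml(mask, spacing_mm):
    """Volume in mL (1 mL = 1 cm^3 = 1000 mm^3) of a binary 3D mask.

    `mask` is indexed as [slice][row][column]; `spacing_mm` is the voxel
    size (slice thickness, row spacing, column spacing) in millimetres.
    """
    voxel_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    n_voxels = sum(1 for sl in mask for row in sl for v in row if v)
    return n_voxels * voxel_mm3 / 1000.0


# Toy mask: 2 slices of 3x3 voxels, 6 voxels labeled as adrenal gland.
mask = [
    [[0, 1, 0],
     [1, 1, 0],
     [0, 0, 0]],
    [[0, 1, 0],
     [1, 1, 0],
     [0, 0, 0]],
]
volume = mask_volume_ml(mask, spacing_mm=(2.0, 0.5, 0.5))
print(f"{volume:.4f} mL")
```

Because the result scales linearly with the voxel count, segmentation errors at the organ boundary translate directly into volume error, which is why the abstract reports a mean volume difference alongside the dice coefficient.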

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Quality control system for patient positioning and filling in meta-information for chest X-ray examinations.

Borisov AA, Semenov SS, Kirpichev YS, Arzamasov KM, Omelyanskaya OV, Vladzymyrskyy AV, Vasilev YA

•papers•Jun 18 2025

During radiography, irregularities occur, leading to a decrease in the diagnostic value of the images obtained. The purpose of this work was to develop a system for automated quality assurance of patient positioning in chest radiographs, with detection of suboptimal contrast, brightness, and metadata errors. The quality assurance system was trained and tested using more than 69,000 X-rays of the chest and other anatomical areas from the Unified Radiological Information Service (URIS) and several open datasets. Our dataset included studies regardless of a patient's gender and race, with the sole exclusion criterion being age below 18 years. A training dataset of radiographs labeled by expert radiologists was used to train an ensemble of modified deep convolutional neural network architectures, ResNet152V2 and VGG19, to identify various quality deficiencies. Model performance was assessed using area under the receiver operating characteristic curve (ROC-AUC), precision, recall, F1-score, and accuracy metrics. Seven neural network models were trained to classify radiographs by the following quality deficiencies: failure to capture the target anatomic region, chest rotation, suboptimal brightness, incorrect anatomical area, projection errors, and improper photometric interpretation. All metrics for each model exceed 95%, indicating high predictive value. All models were combined into a unified system for evaluating radiograph quality. The processing time per image is approximately 3 s. The system supports multiple use cases: integration into automated radiographic workstations, an external quality assurance system for radiology departments, acquisition quality audits for municipal health systems, and routing of studies to diagnostic AI models.

X-Ray Classification Chest Methodology In Silico Academic Lab Benchmark SOTA

EchoFM: Foundation Model for Generalizable Echocardiogram Analysis.

Kim S, Jin P, Song S, Chen C, Li Y, Ren H, Li X, Liu T, Li Q

•papers•Jun 18 2025

Echocardiography is the first-line noninvasive cardiac imaging modality, providing rich spatio-temporal information on cardiac anatomy and physiology. Recently, foundation models trained on extensive and diverse datasets have shown strong performance in various downstream tasks. However, translating foundation models into the medical imaging domain remains challenging due to domain differences between medical and natural images and the lack of diverse patient and disease datasets. In this paper, we introduce EchoFM, a general-purpose vision foundation model for echocardiography trained on a large-scale dataset of over 20 million echocardiographic images from 6,500 patients. To enable effective learning of rich spatio-temporal representations from periodic videos, we propose a novel self-supervised learning framework based on a masked autoencoder with a spatio-temporal consistent masking strategy and periodic-driven contrastive learning. The learned cardiac representations can be readily adapted and fine-tuned for a wide range of downstream tasks, serving as a strong and flexible backbone model. We validate EchoFM through experiments across key downstream tasks in the clinical echocardiography workflow, leveraging public and multi-center internal datasets. EchoFM consistently outperforms SOTA methods, demonstrating superior generalization capabilities and flexibility. The code and checkpoints are available at: https://github.com/SekeunKim/EchoFM.git.
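
One plausible reading of a "spatio-temporal consistent masking strategy" for a video masked autoencoder is that the same spatial patches are hidden in every frame, so the model cannot recover a masked patch simply by copying it from a neighboring frame. The sketch below illustrates that reading only. It is an assumption, not EchoFM's actual implementation (which is in the linked repository), and the function name, mask ratio, and patch counts are hypothetical.

```python
import random


def spatiotemporal_consistent_mask(num_frames, num_patches,
                                   mask_ratio=0.75, seed=0):
    """Return a [num_frames][num_patches] boolean mask (True = hidden).

    A single random set of spatial patches is drawn once and reused for
    every frame, so the masked region is identical across time.
    """
    rng = random.Random(seed)
    num_masked = int(round(num_patches * mask_ratio))
    masked = set(rng.sample(range(num_patches), num_masked))
    frame_mask = [i in masked for i in range(num_patches)]
    return [list(frame_mask) for _ in range(num_frames)]


# 8 frames, each split into a 4x4 grid of patches (toy sizes).
video_mask = spatiotemporal_consistent_mask(num_frames=8, num_patches=16)
print(sum(video_mask[0]), "of", len(video_mask[0]), "patches hidden per frame")
```

In a masked autoencoder, only the visible patches are passed to the encoder, and the decoder is trained to reconstruct the hidden ones.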

Ultrasound Classification Cardiac Methodology In Silico Academic Lab Open Code Benchmark SOTA
