Latest Papers on Radiology AI.

An evaluation of rectum contours generated by artificial intelligence automatic contouring software using geometry, dosimetry and predicted toxicity.

Mc Laughlin O, Gholami F, Osman S, O'Sullivan JM, McMahon SJ, Jain S, McGarry CK

•papers•Aug 7 2025

ObjectiveThis study assesses rectum contours generated using a commercial deep learning auto-contouring model and compares them to clinician contours using geometry, changes in dosimetry and toxicity modelling. ApproachThis retrospective study involved 308 prostate cancer patients who were treated using 3D-conformal radiotherapy. Computed tomography images were input into Limbus Contour (v1.8.0b3) to generate auto-contour structures for each patient. Auto-contours were not edited after their generation.Rectum auto-contours were compared to clinician contours geometrically and dosimetrically. Dice similarity coefficient (DSC), mean Hausdorff distance (HD) and volume difference were assessed. Dose-volume histogram (DVH) constraints (V41%-V100%) were compared, and a Wilcoxon signed rank test was used to evaluate statistical significance of differences. Toxicity modelling to compare contours was carried out using equivalent uniform dose (EUD) and clinical factors of abdominal surgery and atrial fibrillation. Trained models were tested (80:20) in their prediction of grade 1 late rectal bleeding (ntotal=124) using area-under the receiver operating characteristic curve (AUC).Main resultsMedian DSC (interquartile range (IQR)) was 0.85 (0.09), median HD was 1.38 mm (0.60 mm) and median volume difference was -1.73 cc (14.58 cc). Median DVH differences between contours were found to be small (<1.5%) for all constraints although systematically larger than clinician contours (p<0.05). However, an IQR up to 8.0% was seen for individual patients across all dose constraints.Models using EUD alone derived from clinician or auto-contours had AUCs of 0.60 (0.10) and 0.60 (0.09). AUC for models involving clinical factors and dosimetry was 0.65 (0.09) and 0.66 (0.09) when using clinician contours and auto-contours.SignificanceAlthough median DVH metrics were similar, variation for individual patients highlights the importance of clinician review. 
Rectal bleeding prediction accuracy did not depend on the contour method for this cohort. The auto-contouring model used in this study shows promise in a supervised workflow.&#xD.
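
For readers unfamiliar with the geometric metrics named in this abstract, the following is a minimal sketch of how a Dice similarity coefficient and a volume difference could be computed from two binary masks. It is not the authors' pipeline: the function names, the flattened-list mask representation, the "auto minus clinician" sign convention and the toy values are all assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice similarity coefficient of two flattened binary masks (1 = inside contour)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same voxel grid")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * overlap / (size_a + size_b)


def volume_difference_cc(
    auto_mask: Sequence[int],
    clinician_mask: Sequence[int],
    voxel_volume_mm3: float,
) -> float:
    """Auto-contour volume minus clinician volume in cc (sign convention assumed)."""
    n_auto = sum(1 for v in auto_mask if v)
    n_clin = sum(1 for v in clinician_mask if v)
    return (n_auto - n_clin) * voxel_volume_mm3 / 1000.0


# Toy six-voxel example, not patient data.
auto = [1, 1, 1, 1, 0, 0]
clinician = [0, 1, 1, 1, 0, 0]
dsc = dice_coefficient(auto, clinician)
diff = volume_difference_cc(auto, clinician, voxel_volume_mm3=3.0)
print(round(dsc, 3), diff)
```

In practice the masks would come from rasterised DICOM RT structure sets, and the Hausdorff distance would need a separate surface-distance computation that this sketch does not cover.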

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab

Memory-enhanced and multi-domain learning-based deep unrolling network for medical image reconstruction.

Jiang H, Zhang Q, Hu Y, Jin Y, Liu H, Chen Z, Yumo Z, Fan W, Zheng HR, Liang D, Hu Z

•papers•Aug 7 2025

Reconstructing high-quality images from corrupted measurements remains a fundamental challenge in medical imaging. Recently, deep unrolling (DUN) methods have emerged as a promising solution, combining the interpretability of traditional iterative algorithms with the powerful representation capabilities of deep learning. However, their performance is often limited by weak information flow between iterative stages and a constrained ability to capture global features across stages, limitations that tend to worsen as the number of iterations increases. Approach: In this work, we propose a memory-enhanced and multi-domain learning-based deep unrolling network for interpretable, high-fidelity medical image reconstruction. First, a memory-enhanced module is designed to adaptively integrate historical outputs across stages, reducing information loss. Second, we introduce a cross-stage spatial-domain learning transformer (CS-SLFormer) to extract both local and non-local features within and across stages, improving reconstruction performance. Finally, a frequency-domain consistency learning (FDCL) module ensures alignment between reconstructed and ground truth images in the frequency domain, recovering fine image details. Main Results: Comprehensive experiments evaluated on three representative medical imaging modalities (PET, MRI, and CT) show that the proposed method consistently outperforms state-of-the-art (SOTA) approaches in both quantitative metrics and visual quality. Specifically, our method achieved a PSNR of 37.835 dB and an SSIM of 0.970 in 1% dose PET reconstruction. Significance: This study expands the use of model-driven deep learning in medical imaging, demonstrating the potential of memory-enhanced deep unrolling frameworks for high-quality reconstructions.
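
The PSNR figure quoted above follows the standard definition, 10·log10(peak² / MSE). The sketch below shows that definition on a toy signal. The function name, the flattened-list representation and the choice of `data_range` are assumptions; the abstract does not say how the authors normalised their images, so this will not reproduce their numbers.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], reconstruction: Sequence[float],
         data_range: float) -> float:
    """Peak signal-to-noise ratio in dB for two flattened images."""
    if len(reference) != len(reconstruction) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy four-pixel example with intensities assumed to lie in [0, 1].
reference = [0.0, 0.5, 1.0, 0.5]
reconstruction = [0.0, 0.4, 1.0, 0.6]
value_db = psnr(reference, reconstruction, data_range=1.0)
print(f"{value_db:.2f} dB")
```

Because PSNR depends on the assumed peak value, scores are only comparable between methods evaluated with the same `data_range`.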

Mixed Modality Reconstruction Methodology In Silico Academic Lab Benchmark SOTA

Longitudinal development of sex differences in the limbic system is associated with age, puberty and mental health

Matte Bon, G., Walther, J., Comasco, E., Derntl, B., Kaufmann, T.

•preprint•Aug 7 2025

Sex differences in mental health become more evident across adolescence, with a two-fold increase of prevalence of mood disorders in females compared to males. The brain underpinnings remain understudied. Here, we investigated the role of age, puberty and mental health in determining the longitudinal development of sex differences in brain structure. We captured sex differences in limbic and non-limbic structures using machine learning models trained in cross-sectional brain imaging data of 1132 youths, yielding limbic and non-limbic estimates of brain sex. Applied to two independent longitudinal samples (total: 8184 youths), our models revealed pronounced sex differences in brain structure with increasing age. For females, brain sex was sensitive to pubertal development (menarche) over time and, for limbic structures, to mood-related mental health. Our findings highlight the limbic system as a key contributor to the development of sex differences in the brain and the potential of machine learning models for brain sex classification to investigate sex-specific processes relevant to mental health.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab

Structured Report Generation for Breast Cancer Imaging Based on Large Language Modeling: A Comparative Analysis of GPT-4 and DeepSeek.

Chen K, Hou X, Li X, Xu W, Yi H

•papers•Aug 7 2025

The purpose of this study is to compare the performance of GPT-4 and DeepSeek large language models in generating structured breast cancer multimodality imaging integrated reports from free-text radiology reports including mammography, ultrasound, MRI, and PET/CT. A retrospective analysis was conducted on 1358 free-text reports from 501 breast cancer patients across two institutions. The study design involved synthesizing multimodal imaging data into structured reports with three components: primary lesion characteristics, metastatic lesions, and TNM staging. Input prompts were standardized for both models, with GPT-4 using predesigned instructions and DeepSeek requiring manual input. Reports were evaluated based on physician satisfaction using a Likert scale, descriptive accuracy including lesion localization, size, SUV, and metastasis assessment, and TNM staging correctness according to NCCN guidelines. Statistical analysis included McNemar tests for binary outcomes and correlation analysis for multiclass comparisons with a significance threshold of P < .05. Physician satisfaction scores showed strong correlation between models with r-values of 0.665 and 0.558 and P-values below .001. Both models demonstrated high accuracy in data extraction and integration. The mean accuracy for primary lesion features was 91.7% for GPT-4 and 92.1% for DeepSeek, while feature synthesis accuracy was 93.4% for GPT-4 and 93.9% for DeepSeek. Metastatic lesion identification showed comparable overall accuracy at 93.5% for GPT-4 and 94.4% for DeepSeek. GPT-4 performed better in pleural lesion detection with 94.9% accuracy compared to 79.5% for DeepSeek, whereas DeepSeek achieved higher accuracy in mesenteric metastasis identification at 87.5% vs 43.8% for GPT-4. TNM staging accuracy exceeded 92% for T-stage and 94% for M-stage, with N-stage accuracy improving beyond 90% when supplemented with physical exam data.
Both GPT-4 and DeepSeek effectively generate structured breast cancer imaging reports with high accuracy in data mining, integration, and TNM staging. Integrating these models into clinical practice is expected to enhance report standardization and physician productivity.
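
The abstract mentions McNemar tests for paired binary outcomes (for example, whether each model got the same report item right). A minimal sketch of the exact two-sided form of that test is given below. Whether the authors used the exact or the chi-square variant is not stated, and the discordant counts here are hypothetical, not taken from the study.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: items model A got right and model B got wrong
    c: items model A got wrong and model B got right
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) probability of a split at least as extreme as observed.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact_p(b=10, c=2)
print(f"p = {p_value:.4f}")
```

Only the discordant pairs enter the test; items both models got right or both got wrong do not affect the p-value.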

Mixed Modality LLM Radiology Report Breast Retrospective Clinical In Silico Academic Lab GenAI

A novel approach for CT image smoothing: Quaternion Bilateral Filtering for kernel conversion.

Nasr M, Piórkowski A, Brzostowski K, El-Samie FEA

•papers•Aug 7 2025

Denoising reconstructed Computed Tomography (CT) images without access to raw projection data remains a significant difficulty in medical imaging, particularly when utilizing sharp or medium reconstruction kernels that generate high-frequency noise. This work introduces an innovative method that integrates quaternion mathematics with bilateral filtering to resolve this issue. The proposed Quaternion Bilateral Filter (QBF) effectively maintains anatomical structures and mitigates noise caused by the kernel by expressing CT scans in quaternion form, with the red, green, and blue channels encoded together. Compared to conventional methods that depend on raw data or grayscale filtering, our approach functions directly on reconstructed sharp kernel images. It converts them to mimic the quality of soft-kernel outputs, obtained with kernels such as B30f, using paired data from the same patients. The efficacy of the QBF is evidenced by both full-reference metrics (Structural Similarity Index Measure (SSIM), Peak Signal-to-Noise Ratio (PSNR), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE)) and no-reference perceptual metrics (Naturalness Image Quality Evaluator (NIQE), Blind Referenceless Image Spatial Quality Evaluator (BRISQUE), and Perception-based Image Quality Evaluator (PIQE)). The results indicate that the QBF demonstrates improved denoising efficacy compared to traditional Bilateral Filter (BF), Non-Local Means (NLM), wavelet, and Convolutional Neural Network (CNN)-based processing, achieving an SSIM of 0.96 and a PSNR of 36.3 on B50f reconstructions. Additionally, segmentation-based visual validation verifies that QBF-filtered outputs maintain essential structural details necessary for subsequent diagnostic tasks. This study emphasizes the importance of quaternion-based filtering as a lightweight, interpretable, and efficient substitute for deep learning models in post-reconstruction CT image enhancement.

CT Reconstruction Methodology In Silico Academic Lab Reproducibility

Robustness evaluation of an artificial intelligence-based automatic contouring software in daily routine practice.

Fontaine J, Suszko M, di Franco F, Leroux A, Bonnet E, Bosset M, Langrand-Escure J, Clippe S, Fleury B, Guy JB

•papers•Aug 7 2025

AI-based automatic contouring streamlines radiotherapy by reducing contouring time but requires rigorous validation and ongoing daily monitoring. This study assessed how software updates affect contouring accuracy and examined how image quality variations influence AI performance. Two patient cohorts were analyzed. The software updates cohort (40 CT scans: 20 thorax, 10 pelvis, 10 H&N) compared six versions of Limbus AI contouring software. The image quality cohort (20 patients: H&N, pelvis, brain, thorax) analyzed 12 reconstructions per patient using Standard, iDose, and IMR algorithms, with simulated noise and spatial resolution (SR) degradations. AI performance was assessed using Volumetric Dice Similarity Coefficient (vDSC) and 95 % Hausdorff Distance (HD95%) with Wilcoxon tests for significance. In the software updates cohort, vDSC improved for re-trained structures across versions (mean DSC ≥ 0.75), with breast contour vDSC decreasing by 1 % between v1.5 and v1.8B3 (p > 0.05). Median HD95% values were consistently <4 mm, <5 mm, and <12 mm for H&N, pelvis, and thorax contours, respectively (p > 0.05). In the image quality cohort, no significant differences were observed between Standard, iDose, and IMR algorithms. However, noise and SR degradation significantly reduced performance: vDSC ≥ 0.9 dropped from 89 % at 2 % noise to 30 % at 20 %, and from 87 % to 70 % as SR degradation increased (p < 0.001). AI contouring accuracy improved with software updates and showed robustness to minor reconstruction variations, but it was sensitive to noise and SR degradation. Continuous validation and quality control of AI-generated contours are essential. Future studies should include a broader range of anatomical regions and larger cohorts.

CT Segmentation Retrospective Clinical Clinical Pilot Startup

MedCLIP-SAMv2: Towards universal text-driven medical image segmentation.

Koleilat T, Asgariandehkordi H, Rivaz H, Xiao Y

•papers•Aug 7 2025

Segmentation of anatomical structures and pathologies in medical images is essential for modern disease diagnosis, clinical research, and treatment planning. While significant advancements have been made in deep learning-based segmentation techniques, many of these methods still suffer from limitations in data efficiency, generalizability, and interactivity. As a result, developing robust segmentation methods that require fewer labeled datasets remains a critical challenge in medical image analysis. Recently, the introduction of foundation models like CLIP and Segment-Anything-Model (SAM), with robust cross-domain representations, has paved the way for interactive and universal image segmentation. However, further exploration of these models for data-efficient segmentation in medical imaging is an active field of research. In this paper, we introduce MedCLIP-SAMv2, a novel framework that integrates the CLIP and SAM models to perform segmentation on clinical scans using text prompts, in both zero-shot and weakly supervised settings. Our approach includes fine-tuning the BiomedCLIP model with a new Decoupled Hard Negative Noise Contrastive Estimation (DHN-NCE) loss, and leveraging the Multi-modal Information Bottleneck (M2IB) to create visual prompts for generating segmentation masks with SAM in the zero-shot setting. We also investigate using zero-shot segmentation labels in a weakly supervised paradigm to enhance segmentation quality further. Extensive validation across four diverse segmentation tasks and medical imaging modalities (breast tumor ultrasound, brain tumor MRI, lung X-ray, and lung CT) demonstrates the high accuracy of our proposed framework. Our code is available at https://github.com/HealthX-Lab/MedCLIP-SAMv2.

Mixed Modality Segmentation Methodology In Silico Academic Lab Open Code GenAI

Hybrid Neural Networks for Precise Hydronephrosis Classification Using Deep Learning.

Salam A, Naznine M, Chowdhury MEH, Agzamkhodjaev S, Tekin A, Vallasciani S, Ramírez-Velázquez E, Abbas TO

•papers•Aug 7 2025

To develop and evaluate a deep learning framework for automatic kidney and fluid segmentation in renal ultrasound images, aiming to enhance diagnostic accuracy and reduce variability in hydronephrosis assessment. A dataset of 1,731 renal ultrasound images, annotated by four experienced urologists, was used for model training and evaluation. The proposed framework integrates a DenseNet201 backbone, Feature Pyramid Network (FPN), and Self-Organizing Neural Network (SelfONN) layers to enable multi-scale feature extraction and improve spatial precision. Several architectures were tested under identical conditions to ensure fair comparison. Segmentation performance was assessed using standard metrics, including Dice coefficient, precision, and recall. The framework also supported hydronephrosis classification using the fluid-to-kidney area ratio, with a threshold of 0.213 derived from prior literature. The model achieved strong segmentation performance for kidneys (Dice: 0.92, precision: 0.93, recall: 0.91) and fluid regions (Dice: 0.89, precision: 0.90, recall: 0.88), outperforming baseline methods. The classification accuracy for detecting hydronephrosis reached 94%, based on the computed fluid-to-kidney ratio. Performance was consistent across varied image qualities, reflecting the robustness of the overall architecture. This study presents an automated, objective pipeline for analyzing renal ultrasound images. The proposed framework supports high segmentation accuracy and reliable classification, facilitating standardized and reproducible hydronephrosis assessment. Future work will focus on model optimization and incorporating explainable AI to enhance clinical integration.
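
The classification step described above, thresholding a fluid-to-kidney area ratio at 0.213, can be sketched as follows. Only the threshold value comes from the abstract. The mask format, the function names, the use of pixel counts as areas, the choice of `>=` rather than `>`, and whether the kidney mask includes the fluid region are all assumptions.

```python
from typing import List

Mask = List[List[int]]  # 2D binary mask, 1 = pixel belongs to the structure

RATIO_THRESHOLD = 0.213  # threshold quoted in the abstract


def mask_area(mask: Mask) -> int:
    """Area in pixels; pixel spacing cancels when both masks share a grid."""
    return sum(1 for row in mask for v in row if v)


def fluid_to_kidney_ratio(kidney: Mask, fluid: Mask) -> float:
    kidney_area = mask_area(kidney)
    if kidney_area == 0:
        raise ValueError("kidney mask is empty")
    return mask_area(fluid) / kidney_area


def flags_hydronephrosis(ratio: float, threshold: float = RATIO_THRESHOLD) -> bool:
    # Assumption: a ratio at or above the threshold is flagged.
    return ratio >= threshold


# Toy 4x4 masks, not real ultrasound segmentations.
kidney = [[0, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0]]
fluid = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
ratio = fluid_to_kidney_ratio(kidney, fluid)
print(ratio, flags_hydronephrosis(ratio))
```

A ratio-based rule like this makes the final decision easy to audit, since the only learned component is the segmentation that produces the two masks.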

Ultrasound Segmentation Abdominal Methodology In Silico

Artificial intelligence in forensic neuropathology: A systematic review.

Treglia M, La Russa R, Napoletano G, Ghamlouch A, Del Duca F, Treves B, Frati P, Maiese A

•papers•Aug 7 2025

In recent years, Artificial Intelligence (AI) has gained prominence as a robust tool for clinical decision-making and diagnostics, owing to its capacity to process and analyze large datasets with high accuracy. More specifically, Deep Learning, and its subclasses, have shown significant potential in image processing, including medical imaging and histological analysis. In forensic pathology, AI has been employed for the interpretation of histopathological data, identifying conditions such as myocardial infarction, traumatic injuries, and heart rhythm abnormalities. This review aims to highlight key advances in AI's role, particularly machine learning (ML) and deep learning (DL) techniques, in forensic neuropathology, with a focus on its ability to interpret instrumental and histopathological data to support professional diagnostics. A systematic review of the literature regarding applications of Artificial Intelligence in forensic neuropathology was carried out according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) standards. We selected 34 articles regarding the main applications of AI in this field, dividing them into two categories: those addressing traumatic brain injury (TBI), including intracranial hemorrhage or cerebral microbleeds, and those focusing on epilepsy and SUDEP, including brain disorders and central nervous system neoplasms capable of inducing seizure activity. In both cases, the application of AI techniques demonstrated promising results in the forensic investigation of cerebral pathology, providing a valuable computer-assisted diagnostic tool to aid in post-mortem computed tomography (PMCT) assessments of cause of death and histopathological analyses. In conclusion, this paper presents a comprehensive overview of the key neuropathology areas where the application of artificial intelligence can be valuable in investigating causes of death.

CT Classification Neurological Review Concept

Automated detection of wrist ganglia in MRI using convolutional neural networks.

Hämäläinen M, Sormaala M, Kaseva T, Salli E, Savolainen S, Kangasniemi M

•papers•Aug 7 2025

To investigate feasibility of a method which combines segmenting convolutional neural networks (CNN) for the automated detection of ganglion cysts in 2D MRI of the wrist. The study serves as proof-of-concept, demonstrating a method to decrease false positives and offering an efficient solution for ganglia detection. We retrospectively analyzed 58 MRI studies with wrist ganglia, each including 2D axial, sagittal, and coronal series. Manual segmentations were performed by a radiologist and used to train CNNs for automatic segmentation of each orthogonal series. Predictions were fused into a single 3D volume using a proposed prediction fusion method. Performance was evaluated over all studies using six-fold cross-validation, comparing method variations with metrics including true positive rate, number of false positives, and F-score metrics. The proposed method reached mean TPR of 0.57, mean FP of 0.4 and mean F-score of 0.53. Fusion of series predictions decreased the number of false positives significantly but also decreased TPR values. CNNs can detect ganglion cysts in wrist MRI. The number of false positives can be decreased by a method of prediction fusion from multiple CNNs.
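
The detection metrics reported above (TPR, false positives, F-score) can be related to raw counts as in the sketch below. It assumes the F-score is the usual F1 and that counts are taken per detected lesion; the abstract does not specify the matching criterion, and the counts used here are hypothetical. The paper reports means over studies, which this single-study example does not reproduce.

```python
from typing import Dict


def detection_metrics(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """True positive rate, precision and F1 from detection counts."""
    if min(tp, fp, fn) < 0:
        raise ValueError("counts must be non-negative")
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {"tpr": tpr, "precision": precision, "f1": f1}


# Hypothetical counts: 4 ganglia found, 2 spurious detections, 3 missed.
metrics = detection_metrics(tp=4, fp=2, fn=3)
print({name: round(value, 3) for name, value in metrics.items()})
```

The trade-off described in the abstract shows up directly here: fusing predictions lowers `fp`, which raises precision, but any true detections lost in the fusion move from `tp` to `fn` and pull TPR down.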

MRI Detection Musculoskeletal Retrospective Clinical In Silico
