
Zhang R, Li H, Ding L, Liu H, Zhu L, Wu Z, Chen Q, Liu Q, Wang J, Li S, Ruan G, Wu Y, Zhang W, Liang X, Wang J, Wang Y, Yu T, Yan J, Wang R, Wu Z, Qiu S, Chen K, Song E

pubmed papers · Oct 20 2025
Early detection of distant metastases is crucial, but current imaging detects them only once they are radiographically visible. This study reports that subvisual chest CT signals could serve as early biomarkers of impending lung metastasis before radiological visibility. This multicenter study enrolled breast, colorectal, and esophageal cancer patients from four hospitals, each with at least three follow-up chest CT scans. Signaling features were extracted from 3D lung regions of interest (ROIs). Using a Multi-TimePoint Modeling approach, we analyzed features across time points to develop machine learning models, with performance evaluated by the area under the curve. A predefined cutoff classified patients as Signal-Positive or Signal-Negative, and the actual lung metastasis risk was compared between these groups. The lead time from first Signal-Positivity to metastasis, as well as the impact of CT scan count and interval on model performance, was also explored. The analysis covered 10,280 follow-up chest CT scans from 2148 cancer patients. Signal-Positive patients showed significantly higher actual lung metastasis risk than Signal-Negative patients: 57.14% vs 5.77% (breast, adjusted p < 0.0001), 66.67% vs 6.25% (colorectal, adjusted p = 0.0361), and 50.00% vs 12.50% (esophageal, adjusted p = 0.0480). The lead times were 0.84 years (breast), 1.41 years (colorectal), and 0.83 years (esophageal). At least two CT scans within 1.5 years (breast/colorectal cancer) or 0.5 years (esophageal cancer) are recommended for model application. Subvisual chest CT signals can thus serve as biomarkers of impending lung metastasis across cancers; this non-invasive approach dynamically identifies high-risk patients, enabling possible early intervention.
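
The abstract does not give implementation details, but the core workflow (longitudinal radiomic features per patient, a classifier evaluated by AUC, and a predefined cutoff splitting patients into Signal-Positive and Signal-Negative groups) can be sketched roughly as below. The delta-feature construction, the classifier family, and the 0.5 cutoff are illustrative assumptions, not the authors' configuration.

```python
# Hypothetical sketch: longitudinal ("multi-timepoint") radiomic features -> risk model.
# Feature construction, classifier, and cutoff are assumptions for illustration only.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# toy data: n patients x (t timepoints x f radiomic features)
n_patients, n_timepoints, n_features = 200, 3, 20
scans = rng.normal(size=(n_patients, n_timepoints, n_features))
labels = rng.integers(0, 2, size=n_patients)            # 1 = developed lung metastasis

# multi-timepoint representation: latest scan + per-feature change between consecutive scans
deltas = np.diff(scans, axis=1).reshape(n_patients, -1)
X = np.concatenate([scans[:, -1, :], deltas], axis=1)

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)

scores = model.predict_proba(X_te)[:, 1]
print("AUC:", roc_auc_score(y_te, scores))

cutoff = 0.5                                             # predefined cutoff (assumed value)
signal_positive = scores >= cutoff
for name, grp in [("Signal-Positive", signal_positive), ("Signal-Negative", ~signal_positive)]:
    rate = y_te[grp].mean() if grp.any() else float("nan")
    print(f"{name}: n={grp.sum()}, observed metastasis rate={rate:.2f}")
```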

Petersen, L. A., Beck, M. S., Xu, J. J., Andersen, M. B., Bruun, F. J.

medrxiv preprint · Oct 20 2025
Aim: The aim of this study was to test whether open-source Large Language Models (LLMs) can match the diagnostic accuracy of proprietary models in annotating Danish trauma radiology reports across three clinical findings. Materials and Methods: This retrospective study included 2,939 radiology reports of trauma radiographs collected from three Danish emergency departments. The data were split, with 600 cases used for prompt engineering and 2,339 for model evaluation. Eight LLMs, GPT-4o and GPT-4o-mini (OpenAI) and six Llama3 variants (Meta), were prompted to annotate the reports for fractures, effusions, and luxations. The reference standard was human annotation. Diagnostic performance was assessed using accuracy, sensitivity, specificity, PPV, and NPV with 95% confidence intervals. Results: Prompt engineering improved the Match-score for Llama3-8b from 77.8% (95% CI: 74.4%-81.1%) to 94.3% (95% CI: 92.5%-96.2%). GPT-4o achieved the highest overall diagnostic accuracy at 97.9% (95% CI: 97.3%-98.5%), followed by Llama3.1-405b (97.1%; 95% CI: 96.4%-97.8%), GPT-4o-mini (96.9%; 95% CI: 96.2%-97.6%), Llama3-8b (96.9%; 95% CI: 95.9%-97.3%), and Llama3.1-70b (96.0%; 95% CI: 95.2%-96.8%). Across the three specific findings, all models performed best for fractures, whereas effusion and luxation were more prone to errors. Of the error types, Semantic Confusion was the most frequent, accounting for 53.2% to 59.4% of misclassifications. Conclusion: Small, open-source LLMs can accurately annotate Danish trauma radiology reports when supported by effective prompt engineering, achieving accuracy levels that rival proprietary competitors. They offer a viable, privacy-conscious alternative for clinical use, even in a low-resource language setting.
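
For readers reproducing the evaluation step, the reported metrics (accuracy, sensitivity, specificity, PPV, NPV) with non-parametric bootstrap 95% CIs can be computed from binary LLM annotations against human reference labels roughly as below; the toy data, bootstrap count, and percentile method are assumptions, not the study's exact procedure.

```python
# Minimal sketch: diagnostic metrics with non-parametric bootstrap 95% CIs,
# comparing binary LLM annotations (e.g., "fracture present") to human reference labels.
import numpy as np

def diagnostic_metrics(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else np.nan,
        "specificity": tn / (tn + fp) if tn + fp else np.nan,
        "ppv": tp / (tp + fp) if tp + fp else np.nan,
        "npv": tn / (tn + fn) if tn + fn else np.nan,
    }

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))   # resample reports with replacement
        vals.append(diagnostic_metrics(y_true[idx], y_pred[idx])[metric])
    return np.nanpercentile(vals, [2.5, 97.5])

# toy example: 1 = finding reported, 0 = absent
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0] * 50)
y_pred = np.array([1, 0, 1, 0, 0, 0, 1, 1, 1, 0] * 50)
m = diagnostic_metrics(y_true, y_pred)
lo, hi = bootstrap_ci(y_true, y_pred, "accuracy")
print(f"accuracy {m['accuracy']:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```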

Cai H, Li Y, Zhao Q, Lu Y

pubmed papers · Oct 19 2025
BACKGROUND Accurate spatial correlation between preoperative prostate MRI and post-prostatectomy histopathology is critical for improving prostate cancer diagnosis, treatment planning, and MRI interpretation. Current manual registration methods are time-consuming and subjective, creating a need for robust, automated solutions. MATERIAL AND METHODS We developed an unsupervised deep learning model, Squeeze-and-Excitation ResNet with Thin-Plate Spline Transformation (SE-ResNet-TPS), for deformable registration of in vivo prostate MRI and ex vivo whole-mount histopathology images. The model learns feature correspondence directly from unlabeled image pairs, eliminating dependency on large annotated datasets. It integrates multi-scale convolutional kernels within a ResNet architecture and incorporates a channel attention mechanism (Squeeze-and-Excitation) to enhance sensitivity to diagnostically relevant features. A thin-plate spline (TPS) transformation module is employed to model complex global and local deformations between the inherently different modalities. RESULTS The SE-ResNet-TPS model was rigorously evaluated. It achieved an overall Dice similarity coefficient (DSC) of 0.964 and a Hausdorff distance (HD) of 2.91, indicating excellent anatomical alignment between the registered MRI and histopathology images. For cancer-specific regions of interest, where registration is most challenging, the model yielded a DSC of 0.578 and an HD of 4.97, demonstrating significant capability in aligning clinically critical areas despite modality differences and tissue processing artifacts. CONCLUSIONS The proposed SE-ResNet-TPS framework provides highly accurate, unsupervised registration of prostate MRI and histopathology images. Its performance, particularly in aligning overall anatomy, confirms its effectiveness for multimodal prostate image fusion. While cancer-specific alignment presents greater challenges, the results are promising. This model has strong potential to enhance the precision of prostate cancer localization on MRI, ultimately supporting radiologists in diagnosis and targeted biopsy guidance.
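
As a rough illustration of the channel-attention component named in the model (Squeeze-and-Excitation), a minimal PyTorch block is sketched below; the channel count and reduction ratio are illustrative, and the paper's exact backbone layout is not given in the abstract.

```python
# Sketch of a Squeeze-and-Excitation (channel attention) block of the kind the
# SE-ResNet-TPS backbone relies on. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # "squeeze": global spatial average
        self.fc = nn.Sequential(                       # "excitation": per-channel gating
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                   # reweight feature maps channel-wise

feat = torch.randn(2, 64, 32, 32)                      # e.g., MRI feature maps
print(SEBlock(64)(feat).shape)                         # torch.Size([2, 64, 32, 32])
```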

Johnson PM, Dogra S, Westerhoff M, Fritz J, Lin DJ, Recht MP

pubmed papers · Oct 19 2025
To assess whether accelerated knee MRI protocols using simultaneous multi-slice (SMS) and deep learning reconstruction (DLR) are non-inferior to a conventional parallel imaging protocol for detecting internal derangement injuries. This retrospective cohort study included 1055 patients who underwent knee MRI followed by arthroscopy within 180 days. Patients were scanned using either a conventional protocol (n = 226), an accelerated SMS protocol (n = 406), or a SMS with DLR protocol (n = 423). Each group included consecutive exams. Imaging was performed on 3 T MRI using five standardized two-dimensional turbo spin echo sequences. Radiology interpretations were compared with arthroscopy (reference standard) for anterior cruciate ligament (ACL), medial meniscus (MM), and lateral meniscus (LM) tears. Sensitivity and specificity were calculated with 95% confidence intervals using non-parametric bootstrapping. Non-inferiority was concluded if the upper bound of the 95% confidence interval for the difference in sensitivity and specificity was ≤ 0.05. Among all patients, 666 had MM tears, 417 had LM tears, and 220 had ACL tears. Sensitivity for ACL tears was higher with accelerated protocols (0.96 and 0.98) than the conventional (0.85), with non-inferiority confirmed. Specificity was ≥ 0.98 across all protocols. MM sensitivity (0.94-0.95) met non-inferiority criteria. MM specificity (0.88-0.91) and LM sensitivity (0.63-0.68) were not statistically different across protocols but did not meet the non-inferiority margin. LM specificity (0.94) met non-inferiority criteria. Accelerated MRI protocols using SMS and DLR demonstrated comparable diagnostic performance to the reference protocol. Although not all metrics met the strict non-inferiority margin, none showed statistically significant reductions in sensitivity or specificity. These findings support the clinical adoption of accelerated protocols for faster, high-throughput knee imaging.
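
The non-inferiority criterion described above (upper bound of the bootstrapped 95% CI for the difference in sensitivity or specificity ≤ 0.05) can be illustrated with a small sketch; the toy reads and bootstrap settings below are assumptions, not the study data.

```python
# Illustrative non-inferiority check: bootstrap the difference in sensitivity between
# the conventional and an accelerated protocol, then test the upper 95% CI bound.
import numpy as np

rng = np.random.default_rng(1)

def sensitivity(y_true, y_pred):
    pos = y_true == 1
    return (y_pred[pos] == 1).mean()

# toy arthroscopy-confirmed labels and MRI reads for two protocols (~90-92% accurate reads)
y_conv = rng.integers(0, 2, 200)
read_conv = np.where(rng.random(200) < 0.90, y_conv, 1 - y_conv)
y_accel = rng.integers(0, 2, 200)
read_accel = np.where(rng.random(200) < 0.92, y_accel, 1 - y_accel)

diffs = []
for _ in range(2000):
    i = rng.integers(0, 200, 200)                       # resample each cohort with replacement
    j = rng.integers(0, 200, 200)
    diffs.append(sensitivity(y_conv[i], read_conv[i]) - sensitivity(y_accel[j], read_accel[j]))

upper = np.percentile(diffs, 97.5)
print("upper 95% CI bound of (conventional - accelerated) sensitivity:", round(upper, 3))
print("non-inferior" if upper <= 0.05 else "non-inferiority not shown")
```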

Mohammad R. Salmanpour, Sonya Falahati, Amir Hossein Pouria, Amin Mousavi, Somayeh Sadat Mehrnia, Morteza Alizadeh, Arman Gorji, Zeinab Farsangi, Alireza Safarian, Mehdi Maghsudi, Carlos Uribe, Arman Rahmim, Ren Yuan

arxiv preprint · Oct 19 2025
Lung cancer remains the leading cause of cancer mortality, with CT imaging central to screening, prognosis, and treatment. Manual segmentation is variable and time-intensive, while deep learning (DL) offers automation but faces barriers to clinical adoption. Guided by the Knowledge-to-Action framework, this study develops a clinician-in-the-loop DL pipeline to enhance reproducibility, prognostic accuracy, and clinical trust. Multi-center CT data from 999 patients across 12 public datasets were analyzed using five DL models (3D Attention U-Net, ResUNet, VNet, ReconNet, SAM-Med3D), benchmarked against expert contours on whole and click-point cropped images. Segmentation reproducibility was assessed using 497 PySERA-extracted radiomic features via Spearman correlation, ICC, Wilcoxon tests, and MANOVA, while prognostic modeling compared supervised (SL) and semi-supervised learning (SSL) across 38 dimensionality reduction strategies and 24 classifiers. Six physicians qualitatively evaluated masks across seven domains, including clinical meaningfulness, boundary quality, prognostic value, trust, and workflow integration. VNet achieved the best performance (Dice = 0.83, IoU = 0.71), radiomic stability (mean correlation = 0.76, ICC = 0.65), and predictive accuracy under SSL (accuracy = 0.88, F1 = 0.83). SSL consistently outperformed SL across models. Radiologists favored VNet for peritumoral representation and smoother boundaries, preferring AI-generated initial masks for refinement rather than replacement. These results demonstrate that integrating VNet with SSL yields accurate, reproducible, and clinically trusted CT-based lung cancer prognosis, highlighting a feasible path toward physician-centered AI translation.
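
The radiomic-stability analysis (per-feature agreement between features extracted from model-generated masks and from expert masks, e.g., via Spearman correlation) might look roughly like the sketch below; the synthetic feature matrices stand in for the PySERA output, and the 0.75 threshold is an assumed illustration (the ICC, Wilcoxon, and MANOVA steps are omitted).

```python
# Sketch of the reproducibility check: per-feature Spearman correlation between
# radiomic features from model masks and from expert masks (synthetic stand-in data).
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_patients, n_features = 100, 497

expert_feats = rng.normal(size=(n_patients, n_features))
# model features = expert features + noise, mimicking imperfect segmentation
model_feats = expert_feats + 0.3 * rng.normal(size=(n_patients, n_features))

rhos = []
for k in range(n_features):
    rho, _ = spearmanr(expert_feats[:, k], model_feats[:, k])
    rhos.append(rho)
rhos = np.array(rhos)

print("mean per-feature Spearman rho:", rhos.mean().round(3))
print("fraction of features with rho > 0.75:", (rhos > 0.75).mean().round(3))
```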

Praveenbalaji Rajendran, Mojtaba Safari, Wenfeng He, Mingzhe Hu, Shansong Wang, Jun Zhou, Xiaofeng Yang

arxiv preprint · Oct 19 2025
Recent advancements in artificial intelligence (AI), particularly foundation models (FMs), have revolutionized medical image analysis, demonstrating strong zero- and few-shot performance across diverse medical imaging tasks, from segmentation to report generation. Unlike traditional task-specific AI models, FMs leverage large corpora of labeled and unlabeled multimodal datasets to learn generalized representations that can be adapted to various downstream clinical applications with minimal fine-tuning. However, despite the rapid proliferation of FM research in medical imaging, the field remains fragmented, lacking a unified synthesis that systematically maps the evolution of architectures, training paradigms, and clinical applications across modalities. To address this gap, this review article provides a comprehensive and structured analysis of FMs in medical image analysis. We systematically categorize studies into vision-only and vision-language FMs based on their architectural foundations, training strategies, and downstream clinical tasks. Additionally, a quantitative meta-analysis of the studies was conducted to characterize temporal trends in dataset utilization and application domains. We also critically discuss persistent challenges, including domain adaptation, efficient fine-tuning, computational constraints, and interpretability along with emerging solutions such as federated learning, knowledge distillation, and advanced prompting. Finally, we identify key future research directions aimed at enhancing the robustness, explainability, and clinical integration of FMs, thereby accelerating their translation into real-world medical practice.

Mingzheng Zhang, Jinfeng Gao, Dan Xu, Jiangrui Yu, Yuhan Qiao, Lan Chen, Jin Tang, Xiao Wang

arxiv preprint · Oct 19 2025
X-ray image-based medical report generation (MRG) is a pivotal area in artificial intelligence that can significantly reduce diagnostic burdens for clinicians and patient wait times. Existing MRG models predominantly rely on Large Language Models (LLMs) to improve report generation, with limited exploration of pre-trained vision foundation models or advanced fine-tuning techniques. Mainstream frameworks either avoid fine-tuning or utilize simplistic methods like LoRA, often neglecting the potential of enhancing cross-attention mechanisms. Additionally, while Transformer-based models dominate vision-language tasks, non-Transformer architectures, such as the Mamba network, remain underexplored for medical report generation, presenting a promising avenue for future research. In this paper, we propose EMRRG, a novel X-ray report generation framework that fine-tunes pre-trained Mamba networks using parameter-efficient methods. Specifically, X-ray images are divided into patches, tokenized, and processed by an SSM-based vision backbone for feature extraction, with Partial LoRA yielding optimal performance. An LLM with a hybrid decoder generates the medical report, enabling end-to-end training and achieving strong results on benchmark datasets. Extensive experiments on three widely used benchmark datasets fully validated the effectiveness of our proposed strategies for the X-ray MRG. The source code of this paper will be released on https://github.com/Event-AHU/Medical_Image_Analysis.
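
A hedged sketch of parameter-efficient fine-tuning in the spirit of the paper's Partial LoRA is shown below using the Hugging Face peft library; GPT-2 stands in for the SSM/Mamba backbone and hybrid-decoder LLM, and restricting adapters to the attention projections is an assumption about what "partial" means here, not the authors' configuration.

```python
# Hedged sketch: attach low-rank adapters (LoRA) only to a chosen subset of modules
# and freeze everything else. GPT-2 is a small public stand-in model.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")

lora_cfg = LoraConfig(
    r=8,                         # low-rank dimension
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],   # "partial": adapt only the attention projections (assumed)
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()   # only the LoRA adapters are trainable
# Training would then proceed as usual, e.g., on (image-token, report) pairs.
```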

Xu B, Liu J, Fang M, Zhu H, Zhang Y, Zhang H, Lu X, Luo J

pubmed papers · Oct 19 2025
Accurate delineation of the clinical target volume (CTV) and planning target volume (PTV) is essential for effective radiotherapy in uterine malignancies. Manual contouring is laborious, time-consuming, and subjective, and current automatic methods often focus on a single cancer type with limited external validation. To address this, we developed a deep-learning model capable of accurately delineating both CTV and PTV across multiple uterine malignancies using CT imaging. We retrospectively collected 602 contrast-enhanced CT scans, comprising 302 cases (cervical and endometrial cancers) from our institution and an additional 300 cervical cancer scans from external centers. Expert radiation oncologists manually delineated the CTV and PTV on each image. Among the 302 internal cancer cases, 177 cervical cancer cases were used for model training with five-fold cross-validation. Additionally, 41 cervical cancer cases were reserved as an internal testing cohort, while 84 endometrial cancer cases constituted the first external testing cohort to assess the model's generalizability across cancer types. The remaining 300 cervical cancer scans from external centers formed a second external testing cohort to assess model robustness across institutions. We evaluated three segmentation architectures (2D, full-resolution 3D, and cascaded 3D networks) and measured their performance using three standard metrics: Dice Similarity Coefficient (DSC), 95% Hausdorff Distance (HD95), and Average Surface Distance (ASD). The model-generated segmentations demonstrated strong concordance with the expert contours. In the internal testing cohort with the same cancer type, performance metrics (DSC, HD95, ASD) were consistently high. Similarly, the external testing cohort with a different cancer type showed robust performance, indicating effective generalizability. On the internal testing cohort, the model demonstrated strong performance, achieving mean DSCs of 83.42% for PTV and 81.23% for CTV, with low spatial errors (PTV: ASD 2.01 mm, HD95 5.71 mm; CTV: ASD 1.35 mm, HD95 4.75 mm). In the endometrial cancer cohort, PTV segmentation achieved a DSC of 82.88%, while CTV segmentation yielded an HD95 of 5.85 mm and an ASD of 1.34 mm. Additionally, clinical evaluation revealed that approximately 90% of the model-generated contours required no or only minor revision. We present a multicenter-validated, deep-learning-based framework for automatic CTV and PTV delineation across diverse uterine malignancies on CT. Our model offers a scalable, generalizable solution with the potential to reduce workload in radiation oncology, improve consistency, and streamline clinical workflows.
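
For reference, the three reported metrics (DSC, HD95, ASD) for a pair of binary 3D masks can be computed with SciPy distance transforms roughly as below; voxel spacing is assumed isotropic in this sketch, whereas a real evaluation would use the scan's actual spacing.

```python
# Minimal sketch of DSC, HD95, and ASD for a pair of binary 3D masks (toy data).
import numpy as np
from scipy.ndimage import distance_transform_edt, binary_erosion

def surface(mask):
    return mask & ~binary_erosion(mask)                  # boundary voxels

def dsc(a, b):
    return 2 * np.sum(a & b) / (np.sum(a) + np.sum(b))

def surface_distances(a, b, spacing=(1.0, 1.0, 1.0)):    # assumed isotropic spacing
    d_to_b = distance_transform_edt(~surface(b), sampling=spacing)
    d_to_a = distance_transform_edt(~surface(a), sampling=spacing)
    return np.concatenate([d_to_b[surface(a)], d_to_a[surface(b)]])

pred = np.zeros((32, 64, 64), bool)
pred[8:24, 16:48, 16:48] = True                          # toy predicted mask
ref = np.zeros_like(pred)
ref[9:25, 18:50, 15:47] = True                           # toy reference mask

d = surface_distances(pred, ref)
print("DSC :", round(dsc(pred, ref), 3))
print("HD95:", round(np.percentile(d, 95), 2), "mm (assumed spacing)")
print("ASD :", round(d.mean(), 2), "mm (assumed spacing)")
```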

Ganesh, M. A., Vaikuntanathan, V., M, S., Poopady, J. J., Rasheed, A., Roy, D., Chakravortyy, D., Basu, S.

medrxiv preprint · Oct 19 2025
Urinary tract infection (UTI), primarily caused by E. coli bacteria, is a rising global health concern, affecting women and the elderly at a disproportionately high rate. Despite recent advancements in diagnostic techniques, critical gaps persist in terms of time delay, high cost, and false-positive predictions. There is therefore a pressing need for rapid, reliable, and cheap point-of-care diagnostic tools. As a step toward addressing this need, we propose an artificial intelligence-based diagnostic technique built on microscopic images of the dried patterns formed by E. coli-laden sessile urine droplets. In this study, the variation in the underlying pattern-formation behavior with bacterial concentration has been captured through a machine learning (deep residual network-based) model pipeline for diagnosis and severity estimation. Image classification (pattern analysis) has been performed on the dried deposits/patterns obtained from evaporated bacteria-laden sessile urine droplets. In addition, the impact of bacterial concentration in a given urine sample has been studied as an attempt to qualitatively estimate a severity index. Overall, this study focuses on understanding and unleashing the potential of dried-deposit pattern analysis for UTI diagnosis as a quick, cheap, and accessible point-of-care application. The approach can be extended to cyber-physical systems as a robust, deployable first-line diagnostic tool, particularly in rural areas, allowing users to perform an initial self-assessment prior to consulting a medical professional.
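
A minimal sketch of the classification stage, assuming a fine-tuned ResNet-18 over microscopy images of dried-droplet deposits with classes corresponding to bacterial-concentration bands; the class count and architecture choice are assumptions, since the abstract only specifies a deep-residual-network-based pipeline.

```python
# Hedged sketch: residual-network classifier over dried-droplet pattern images.
import torch
import torch.nn as nn
from torchvision import models

n_classes = 4                                           # assumed concentration bands
net = models.resnet18(weights=None)                     # pretrained weights would normally be loaded
net.fc = nn.Linear(net.fc.in_features, n_classes)       # replace the classification head

optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 224, 224)                    # stand-in for a batch of pattern images
labels = torch.randint(0, n_classes, (8,))

net.train()
optimizer.zero_grad()
loss = criterion(net(images), labels)
loss.backward()
optimizer.step()
print("training loss on toy batch:", loss.item())
```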

Thottempudi, P., Acharya, B., Aouthu, S., Narra, D., B, M. B., K, R. M., K, S., Mallik, S.

medrxiv preprint · Oct 19 2025
Accurate and interpretable brain tumor classification remains a critical challenge due to the heterogeneity of tumor types and the complexity of MRI data. This paper presents a hybrid deep learning framework that synergizes Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) for multi-class brain tumor diagnosis. The model leverages CNNs for localized spatial feature extraction and ViTs for capturing long-range contextual information, followed by an attention-guided fusion mechanism. To enhance generalization and reduce feature redundancy, an Improved Aquila Optimizer (AQO) is employed for metaheuristic feature selection. The model is trained and evaluated on the Kaggle brain MRI dataset, comprising 3,264 T1-weighted contrast-enhanced axial slices categorized into four classes: glioma, meningioma, pituitary tumor, and no tumor. To ensure interpretability, SHAP and Grad-CAM are integrated to visualize both semantic and spatial relevance in predictions. The proposed method achieves a classification accuracy of 97.2%, F1-score of 0.96, and AUC-ROC of 0.98, outperforming baseline CNN and ViT models.
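
A hedged, pure-PyTorch sketch of a CNN + Transformer hybrid with attention-guided fusion in the spirit of the description above; layer sizes, the gating formulation, and the four-class head are illustrative assumptions, not the authors' architecture, and the AQO feature-selection and SHAP/Grad-CAM steps are omitted.

```python
# Illustrative CNN + Transformer hybrid with attention-guided feature fusion.
import torch
import torch.nn as nn

class HybridClassifier(nn.Module):
    def __init__(self, n_classes=4, dim=128):
        super().__init__()
        # CNN branch: localized spatial features
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Transformer branch: long-range context over image patches
        self.patch_embed = nn.Conv2d(1, dim, kernel_size=16, stride=16)
        encoder_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # Attention-guided fusion: learn how much to weight each branch
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2), nn.Softmax(dim=-1))
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):
        f_cnn = self.cnn(x)                                   # (B, dim)
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)
        f_vit = self.transformer(tokens).mean(dim=1)          # (B, dim)
        w = self.gate(torch.cat([f_cnn, f_vit], dim=-1))      # (B, 2) fusion weights
        fused = w[:, :1] * f_cnn + w[:, 1:] * f_vit
        return self.head(fused)

model = HybridClassifier()
logits = model(torch.randn(2, 1, 224, 224))                   # grayscale MRI slices
print(logits.shape)                                           # torch.Size([2, 4])
```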