Latest Papers on Radiology AI. Tags: Benchmark SOTA

When Age Is More Than a Number: Acceleration of Brain Aging in Neurodegenerative Diseases.

Doering E, Hoenig MC, Cole JH, Drzezga A

•papers•Aug 21 2025

Aging of the brain is characterized by deleterious processes at various levels including cellular/molecular and structural/functional changes. Many of these processes can be assessed in vivo by means of modern neuroimaging procedures, allowing the quantification of brain age in different modalities. Brain age can be measured by suitable machine learning strategies. The deviation (in both directions) between a person's measured brain age and chronologic age is referred to as the brain age gap (BAG). Although brain age, as defined by these methods, generally is related to the chronologic age of a person, this relationship is not always parallel and can also vary significantly between individuals. Importantly, whereas neurodegenerative disorders are not equivalent to accelerated brain aging, they may induce brain changes that resemble those of older adults, which can be captured by brain age models. Inversely, healthy brain aging may involve a resistance or delay of the onset of neurodegenerative pathologies in the brain. This continuing education article elaborates how the BAG can be computed and explores how BAGs, derived from diverse neuroimaging modalities, offer unique insights into the phenotypes of age-related neurodegenerative diseases. Structural BAGs from T1-weighted MRI have shown promise as phenotypic biomarkers for monitoring neurodegenerative disease progression especially in Alzheimer disease. Additionally, metabolic and molecular BAGs from molecular imaging, functional BAGs from functional MRI, and microstructural BAGs from diffusion MRI, although researched considerably less, each may provide distinct perspectives on particular brain aging processes and their deviations from healthy aging. We suggest that BAG estimation, when based on the appropriate modality, could potentially be useful for disease monitoring and offer interesting insights concerning the impact of therapeutic interventions.

Mixed Modality Registration Neurological Review In Silico Academic Lab Benchmark SOTA

Multimodal Integration in Health Care: Development With Applications in Disease Management.

Hao Y, Cheng C, Li J, Li H, Di X, Zeng X, Jin S, Han X, Liu C, Wang Q, Luo B, Zeng X, Li K

•papers•Aug 21 2025

Multimodal data integration has emerged as a transformative approach in the health care sector, systematically combining complementary biological and clinical data sources such as genomics, medical imaging, electronic health records, and wearable device outputs. This approach provides a multidimensional perspective of patient health that enhances the diagnosis, treatment, and management of various medical conditions. This viewpoint presents an overview of the current state of multimodal integration in health care, spanning clinical applications, current challenges, and future directions. We focus primarily on its applications across different disease domains, particularly in oncology and ophthalmology. Other diseases are briefly discussed due to the few available literature. In oncology, the integration of multimodal data enables more precise tumor characterization and personalized treatment plans. Multimodal fusion demonstrates accurate prediction of anti-human epidermal growth factor receptor 2 therapy response (area under the curve=0.91). In ophthalmology, multimodal integration through the combination of genetic and imaging data facilitates the early diagnosis of retinal diseases. However, substantial challenges remain regarding data standardization, model deployment, and model interpretability. We also highlight the future directions of multimodal integration, including its expanded disease applications, such as neurological and otolaryngological diseases, and the trend toward large-scale multimodal models, which enhance accuracy. Overall, the innovative potential of multimodal integration is expected to further revolutionize the health care industry, providing more comprehensive and personalized solutions for disease management.

Mixed Modality Classification Review Concept Academic Lab GenAI Benchmark SOTA

Vision Transformer Autoencoders for Unsupervised Representation Learning: Revealing Novel Genetic Associations through Learned Sparse Attention Patterns

Islam, S. R., He, W., Xie, Z., Zhi, D.

•preprint•Aug 21 2025

The discovery of genetic loci associated with brain architecture can provide deeper insights into neuroscience and potentially lead to improved personalized medicine outcomes. Previously, we designed the Unsupervised Deep learning-derived Imaging Phenotypes (UDIPs) approach to extract phenotypes from brain imaging using a convolutional (CNN) autoencoder, and conducted brain imaging GWAS on UK Biobank (UKBB). In this work, we design a vision transformer (ViT)-based autoencoder, leveraging its distinct inductive bias and its ability to capture unique patterns through its pairwise attention mechanism. The encoder generates contextual embeddings for input patches, from which we derive a 128-dimensional latent representation, interpreted as phenotypes, by applying average pooling. The GWAS on these 128 phenotypes discovered 10 loci previously unreported by CNN-based UDIP model, 3 of which had no previous associations with brain structure in the GWAS Catalog. Our interpretation results suggest that these novel associations stem from the ViTs capability to learn sparse attention patterns, enabling the capturing of non-local patterns such as left-right hemisphere symmetry within brain MRI data. Our results highlight the advantages of transformer-based architectures in feature extraction and representation learning for genetic discovery.

MRI Classification Neurological Methodology In Silico Academic Lab Benchmark SOTA

Deep Learning-Assisted Skeletal Muscle Radiation Attenuation at C3 Predicts Survival in Head and Neck Cancer

Barajas Ordonez, F., Xie, K., Ferreira, A., Siepmann, R., Chargi, N., Nebelung, S., Truhn, D., Berge, S., Bruners, P., Egger, J., Hölzle, F., Wirth, M., Kuhl, C., Puladi, B.

•preprint•Aug 21 2025

BackgroundHead and neck cancer (HNC) patients face an increased risk of malnutrition due to lifestyle, tumor localization, and treatment effects. While skeletal muscle area (SMA) and radiation attenuation (SM-RA) at the third lumbar vertebra (L3) are established prognostic markers, L3 is not routinely available in head and neck imaging. The prognostic value of SM-RA at the third cervical vertebra (C3) remains unclear. This study assesses whether SMA and SM-RA at C3 predict locoregional control (LRC) and overall survival (OS) in HNC. MethodsWe analyzed 904 HNC cases with head and neck CT scans. A deep learning pipeline identified C3, and SMA/SM-RA were quantified via automated segmentation with manual verification. Cox proportional hazards models assessed associations with LRC and OS, adjusting for clinical factors. ResultsMedian SMA and SM-RA were 36.64 cm{superscript 2} (IQR: 30.12-42.44) and 50.77 HU (IQR: 43.04-57.39). In multivariate analysis, lower SMA (HR 1.62, 95% CI: 1.02-2.58, p = 0.04), lower SM-RA (HR 1.89, 95% CI: 1.30-2.79, p < 0.001), and advanced T stage (HR 1.50, 95% CI: 1.06-2.12, p = 0.02) were prognostic for LRC. OS predictors included advanced T stage (HR 2.17, 95% CI: 1.64-2.87, p < 0.001), age [≥]70 years (HR 1.40, 95% CI: 1.00-1.96, p = 0.05), male sex (HR 1.64, 95% CI: 1.02-2.63, p = 0.04), and lower SM-RA (HR 2.15, 95% CI: 1.56-2.96, p < 0.001). ConclusionDeep learning-assisted SM-RA assessment at C3 outperforms SMA for LRC and OS in HNC, supporting its use as a routine biomarker and L3 alternative.

CT Segmentation Retrospective Clinical In Silico Academic Lab Benchmark SOTA

From Detection to Diagnosis: An Advanced Transfer Learning Pipeline Using YOLO11 with Morphological Post-Processing for Brain Tumor Analysis for MRI Images.

Chourib I

•papers•Aug 21 2025

Accurate and timely detection of brain tumors from magnetic resonance imaging (MRI) scans is critical for improving patient outcomes and informing therapeutic decision-making. However, the complex heterogeneity of tumor morphology, scarcity of annotated medical data, and computational demands of deep learning models present substantial challenges for developing reliable automated diagnostic systems. In this study, we propose a robust and scalable deep learning framework for brain tumor detection and classification, built upon an enhanced YOLO-v11 architecture combined with a two-stage transfer learning strategy. The first stage involves training a base model on a large, diverse MRI dataset. Upon achieving a mean Average Precision (mAP) exceeding 90%, this model is designated as the Brain Tumor Detection Model (BTDM). In the second stage, the BTDM is fine-tuned on a structurally similar but smaller dataset to form Brain Tumor Detection and Segmentation (BTDS), effectively leveraging domain transfer to maintain performance despite limited data. The model is further optimized through domain-specific data augmentation-including geometric transformations-to improve generalization and robustness. Experimental evaluations on publicly available datasets show that the framework achieves high [email protected] scores (up to 93.5% for the BTDM and 91% for BTDS) and consistently outperforms existing state-of-the-art methods across multiple tumor types, including glioma, meningioma, and pituitary tumors. In addition, a post-processing module enhances interpretability by generating segmentation masks and extracting clinically relevant metrics such as tumor size and severity level. These results underscore the potential of our approach as a high-performance, interpretable, and deployable clinical decision-support tool, contributing to the advancement of intelligent real-time neuro-oncological diagnostics.

MRI Detection Neurological Methodology In Silico Benchmark SOTA

TPA: Temporal Prompt Alignment for Fetal Congenital Heart Defect Classification

Darya Taratynova, Alya Almsouti, Beknur Kalmakhanbet, Numan Saeed, Mohammad Yaqub

•preprint•Aug 21 2025

Congenital heart defect (CHD) detection in ultrasound videos is hindered by image noise and probe positioning variability. While automated methods can reduce operator dependence, current machine learning approaches often neglect temporal information, limit themselves to binary classification, and do not account for prediction calibration. We propose Temporal Prompt Alignment (TPA), a method leveraging foundation image-text model and prompt-aware contrastive learning to classify fetal CHD on cardiac ultrasound videos. TPA extracts features from each frame of video subclips using an image encoder, aggregates them with a trainable temporal extractor to capture heart motion, and aligns the video representation with class-specific text prompts via a margin-hinge contrastive loss. To enhance calibration for clinical reliability, we introduce a Conditional Variational Autoencoder Style Modulation (CVAESM) module, which learns a latent style vector to modulate embeddings and quantifies classification uncertainty. Evaluated on a private dataset for CHD detection and on a large public dataset, EchoNet-Dynamic, for systolic dysfunction, TPA achieves state-of-the-art macro F1 scores of 85.40% for CHD diagnosis, while also reducing expected calibration error by 5.38% and adaptive ECE by 6.8%. On EchoNet-Dynamic's three-class task, it boosts macro F1 by 4.73% (from 53.89% to 58.62%). Temporal Prompt Alignment (TPA) is a framework for fetal congenital heart defect (CHD) classification in ultrasound videos that integrates temporal modeling, prompt-aware contrastive learning, and uncertainty quantification.

Ultrasound Classification Cardiac Methodology In Silico Academic Lab Benchmark SOTA

Clinical and Economic Evaluation of a Real-Time Chest X-Ray Computer-Aided Detection System for Misplaced Endotracheal and Nasogastric Tubes and Pneumothorax in Emergency and Critical Care Settings: Protocol for a Cluster Randomized Controlled Trial.

Tsai CL, Chu TC, Wang CH, Chang WT, Tsai MS, Ku SC, Lin YH, Tai HC, Kuo SW, Wang KC, Chao A, Tang SC, Liu WL, Tsai MH, Wang TA, Chuang SL, Lee YC, Kuo LC, Chen CJ, Kao JH, Wang W, Huang CH

•papers•Aug 20 2025

Advancements in artificial intelligence (AI) have driven substantial breakthroughs in computer-aided detection (CAD) for chest x-ray (CXR) imaging. The National Taiwan University Hospital research team previously developed an AI-based emergency CXR system (Capstone project), which led to the creation of a CXR module. This CXR module has an established model supported by extensive research and is ready for application in clinical trials without requiring additional model training. This study will use 3 submodules of the system: detection of misplaced endotracheal tubes, detection of misplaced nasogastric tubes, and identification of pneumothorax. This study aims to apply a real-time CXR CAD system in emergency and critical care settings to evaluate its clinical and economic benefits without requiring additional CXR examinations or altering standard care and procedures. The study will evaluate the impact of CAD system on mortality reduction, postintubation complications, hospital stay duration, workload, and interpretation time, as wells as conduct a cost-effectiveness comparison with standard care. This study adopts a pilot trial and cluster randomized controlled trial design, with random assignment conducted at the ward level. In the intervention group, units are granted access to AI diagnostic results, while the control group continues standard care practices. Consent will be obtained from attending physicians, residents, and advanced practice nurses in each participating ward. Once consent is secured, these health care providers in the intervention group will be authorized to use the CAD system. Intervention units will have access to AI-generated interpretations, whereas control units will maintain routine medical procedures without access to the AI diagnostic outputs. The study was funded in September 2024. Data collection is expected to last from January 2026 to December 2027. This study anticipates that the real-time CXR CAD system will automate the identification and detection of misplaced endotracheal and nasogastric tubes on CXRs, as well as assist clinicians in diagnosing pneumothorax. By reducing the workload of physicians, the system is expected to shorten the time required to detect tube misplacement and pneumothorax, decrease patient mortality and hospital stays, and ultimately lower health care costs. PRR1-10.2196/72928.

X-Ray Detection Chest RCT Clinical Pilot Academic Lab Benchmark SOTA

Deep Learning Model for Breast Shear Wave Elastography to Improve Breast Cancer Diagnosis (INSPiRED 006): An International, Multicenter Analysis.

Cai L, Pfob A, Barr RG, Duda V, Alwafai Z, Balleyguier C, Clevert DA, Fastner S, Gomez C, Goncalo M, Gruber I, Hahn M, Kapetas P, Nees J, Ohlinger R, Riedel F, Rutten M, Stieber A, Togawa R, Sidey-Gibbons C, Tozaki M, Wojcinski S, Heil J, Golatta M

•papers•Aug 20 2025

Shear wave elastography (SWE) has been investigated as a complement to B-mode ultrasound for breast cancer diagnosis. Although multicenter trials suggest benefits for patients with Breast Imaging Reporting and Data System (BI-RADS) 4(a) breast masses, widespread adoption remains limited because of the absence of validated velocity thresholds. This study aims to develop and validate a deep learning (DL) model using SWE images (artificial intelligence [AI]-SWE) for BI-RADS 3 and 4 breast masses and compare its performance with human experts using B-mode ultrasound. We used data from an international, multicenter trial (ClinicalTrials.gov identifier: NCT02638935) evaluating SWE in women with BI-RADS 3 or 4 breast masses across 12 institutions in seven countries. Images from 11 sites were used to develop an EfficientNetB1-based DL model. An external validation was conducted using data from the 12th site. Another validation was performed using the latest SWE software from a separate institutional cohort. Performance metrics included sensitivity, specificity, false-positive reduction, and area under the receiver operator curve (AUROC). The development set included 924 patients (4,026 images); the external validation sets included 194 patients (562 images) and 176 patients (188 images, latest SWE software). AI-SWE achieved an AUROC of 0.94 (95% CI, 0.91 to 0.96) and 0.93 (95% CI, 0.88 to 0.98) in the two external validation sets. Compared with B-mode ultrasound, AI-SWE significantly reduced false-positive rates by 62.1% (20.4% [30/147] v 53.8% [431/801]; P < .001) and 38.1% (33.3% [14/42] v 53.8% [431/801]; P < .001), with comparable sensitivity (97.9% [46/47] and 97.8% [131/134] v 98.1% [311/317]; P = .912 and P = .810). AI-SWE demonstrated accuracy comparable with human experts in malignancy detection while significantly reducing false-positive imaging findings (ie, unnecessary biopsies). Future studies should explore its integration into multimodal breast cancer diagnostics.

Ultrasound Classification Breast Retrospective Clinical In Silico Consortium Benchmark SOTA

Unexpected early pulmonary thrombi in war injured patients.

Sasson I, Sorin V, Ziv-Baran T, Marom EM, Czerniawski E, Adam SZ, Aviram G

•papers•Aug 20 2025

Pulmonary embolism is commonly associated with deep vein thrombosis and the components of Virchow's triad: hypercoagulability, stasis, and endothelial injury. High-risk patients are traditionally those with prolonged immobility and hypercoagulability. Recent findings of pulmonary thrombosis (PT) in healthy combat soldiers, found on CT performed for initial trauma assessment, challenge this assumption. The aim of this study was to investigate the prevalence and characteristics of PT detected in acute traumatic war injuries, and evaluate the effectiveness of an artificial intelligence (AI) algorithm in these settings. This retrospective study analyzed immediate post-trauma CT scans of war-injured patients aged 18-45, from two tertiary hospitals between October 7, 2023, and January 7, 2024. Thrombi were retrospectively detected using AI software and confirmed by two senior radiologists. Findings were compared to the original reports. Clinical and injury-related data were analyzed. Of 190 patients (median age 24, IQR (21.0-30.0), 183 males), AI identified 10 confirmed PT patients (5.6%), six (60%) of whom were not originally diagnosed. The only statistically significant difference between PT and non-PT patients was increased complexity and severity of injuries (higher Injury Severity Score, median (IQR) 21.0 (20.0-21.0) vs 9.0 (4.0-14.5), p = 0.01, accordingly). Despite the presence of thrombi, significant right ventricular dilatation was absent in all patients. This report of early PT in war-injured patients provides a unique opportunity to characterize these findings. PT occurs more frequently than anticipated, without clinical suspicion, highlighting the need for improved radiologists' awareness and the crucial role of AI systems as diagnostic support tools. Question What is the prevalence, and what are the radiological characteristics of arterial clotting within the pulmonary arteries in young acute trauma patients? Findings A surprisingly high occurrence of PT with a high rate of missed diagnoses by radiologists. All cases did not presented right ventricular dysfunction. Clinical relevance PT is a distinct clinical entity separate from traditional venous thromboembolism, which raises the need for further investigation of the appropriate treatment paradigm.

CT Detection Chest Retrospective Clinical In Silico Benchmark SOTA

Characterizing the Impact of Training Data on Generalizability: Application in Deep Learning to Estimate Lung Nodule Malignancy Risk.

Obreja B, Bosma J, Venkadesh KV, Saghir Z, Prokop M, Jacobs C

•papers•Aug 20 2025

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To investigate the relationship between training data volume and performance of a deep learning AI algorithm developed to assess the malignancy risk of pulmonary nodules detected on low-dose CT scans in lung cancer screening. Materials and Methods This retrospective study used a dataset of 16077 annotated nodules (1249 malignant, 14828 benign) from the National Lung Screening Trial (NLST) to systematically train an AI algorithm for pulmonary nodule malignancy risk prediction across various stratified subsets ranging from 1.25% to the full dataset. External testing was conducted using data from the Danish Lung Cancer Screening Trial (DLCST) to determine the amount of training data at which the performance of the AI was statistically non-inferior to the AI trained on the full NLST cohort. A size-matched cancer-enriched subset of DLCST, where each malignant nodule had been paired in diameter with the closest two benign nodules, was used to investigate the amount of training data at which the performance of the AI algorithm was statistically non-inferior to the average performance of 11 clinicians. Results The external testing set included 599 participants (mean age 57.65 (SD 4.84) for females and mean age 59.03 (SD 4.94) for males) with 883 nodules (65 malignant, 818 benign). The AI achieved a mean AUC of 0.92 [95% CI: 0.88, 0.96] on the DLCST cohort when trained on the full NLST dataset. Training with 80% of NLST data resulted in non-inferior performance (mean AUC 0.92 [95%CI: 0.89, 0.96], P = .005). On the size-matched DLCST subset (59 malignant, 118 benign), the AI reached non-inferior clinician-level performance (mean AUC 0.82 [95% CI: 0.77, 0.86]) with 20% of the training data (P = .02). Conclusion The deep learning AI algorithm demonstrated excellent performance in assessing pulmonary nodule malignancy risk, achieving clinical level performance with a fraction of the training data and reaching peak performance before utilizing the full dataset. ©RSNA, 2025.

CT Classification Chest Retrospective Clinical In Silico Benchmark SOTA

Filter Papers

Tags

When Age Is More Than a Number: Acceleration of Brain Aging in Neurodegenerative Diseases.

Multimodal Integration in Health Care: Development With Applications in Disease Management.

Vision Transformer Autoencoders for Unsupervised Representation Learning: Revealing Novel Genetic Associations through Learned Sparse Attention Patterns

Deep Learning-Assisted Skeletal Muscle Radiation Attenuation at C3 Predicts Survival in Head and Neck Cancer

From Detection to Diagnosis: An Advanced Transfer Learning Pipeline Using YOLO11 with Morphological Post-Processing for Brain Tumor Analysis for MRI Images.

TPA: Temporal Prompt Alignment for Fetal Congenital Heart Defect Classification

Clinical and Economic Evaluation of a Real-Time Chest X-Ray Computer-Aided Detection System for Misplaced Endotracheal and Nasogastric Tubes and Pneumothorax in Emergency and Critical Care Settings: Protocol for a Cluster Randomized Controlled Trial.

Deep Learning Model for Breast Shear Wave Elastography to Improve Breast Cancer Diagnosis (INSPiRED 006): An International, Multicenter Analysis.

Unexpected early pulmonary thrombi in war injured patients.

Characterizing the Impact of Training Data on Generalizability: Application in Deep Learning to Estimate Lung Nodule Malignancy Risk.

Ready to Sharpen Your Edge?