Latest Papers on Radiology AI. Tags: None, Order: Best Match, Limit: 10.

Diagnostic accuracy of machine learning-based magnetic resonance imaging models in breast cancer classification: a systematic review and meta-analysis.

Zhang J, Wu Q, Lei P, Zhu X, Li B

•papers•Jun 11 2025

This meta-analysis evaluates the diagnostic accuracy of machine learning (ML)-based magnetic resonance imaging (MRI) models in distinguishing benign from malignant breast lesions and explores factors influencing their performance. A systematic search of PubMed, Embase, Cochrane Library, Scopus, and Web of Science identified 12 eligible studies (from 3,739 records) up to August 2024. Data were extracted to calculate sensitivity, specificity, and area under the curve (AUC) using bivariate models in R 4.4.1. Study quality was assessed via QUADAS-2. Pooled sensitivity and specificity were 0.86 (95% CI: 0.82-0.90) and 0.82 (95% CI: 0.78-0.86), respectively, with an overall AUC of 0.90 (95% CI: 0.85-0.90). Diagnostic odds ratio (DOR) was 39.11 (95% CI: 25.04-53.17). Support vector machine (SVM) classifiers outperformed Naive Bayes, with higher sensitivity (0.88 vs. 0.86) and specificity (0.82 vs. 0.78). Heterogeneity was primarily attributed to MRI equipment (P = 0.037). ML-based MRI models demonstrate high diagnostic accuracy for breast cancer classification, with pooled sensitivity of 0.86 (95% CI: 0.82-0.90), specificity of 0.82 (95% CI: 0.78-0.86), and AUC of 0.90 (95% CI: 0.85-0.90). These results support their clinical utility as screening and diagnostic adjuncts, while highlighting the need for standardized protocols to improve generalizability.
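
As a rough illustration of how a diagnostic odds ratio relates to sensitivity and specificity, the sketch below computes a DOR from the pooled point estimates quoted in the abstract (0.86 and 0.82). The function name is hypothetical. The result is roughly 28, not the reported pooled DOR of 39.11. The two need not agree, because a pooled DOR is estimated across the individual studies and is not derived from the pooled sensitivity and specificity; the abstract does not give the exact pooling procedure.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Return the diagnostic odds ratio, DOR = LR+ / LR-.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive / lr_negative


if __name__ == "__main__":
    # Pooled point estimates quoted in the abstract above.
    dor = diagnostic_odds_ratio(0.86, 0.82)
    print(f"DOR from pooled point estimates: {dor:.2f}")
```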

MRI Classification Breast Meta Analysis In Silico None Academic Lab Benchmark SOTA

Non-enhanced CT deep learning model for differentiating lung adenocarcinoma from tuberculoma: a multicenter diagnostic study.

Zhang G, Shang L, Li S, Zhang J, Zhang Z, Zhang X, Qian R, Yang K, Li X, Liu Y, Wu Y, Pu H, Cao Y, Man Q, Kong W

•papers•Jun 11 2025

To develop and validate a deep learning model based on three-dimensional features (DL_3D) for distinguishing lung adenocarcinoma (LUAD) from tuberculoma (TBM). A total of 1160 patients were collected from three hospitals. A vision transformer network-based DL_3D model was trained, and its performance in differentiating LUAD from TBM was evaluated using validation and external test sets. The performance of the DL_3D model was compared with that of two-dimensional features (DL_2D), radiomics, and six radiologists. Diagnostic performance was assessed using area under the receiver operating characteristic curve (AUC) analysis. The study included 840 patients in the training set (mean age, 54.8 years [range, 19-86 years]; 514 men), 210 patients in the validation set (mean age, 54.3 years [range, 18-86 years]; 128 men), and 110 patients in the external test set (mean age, 54.7 years [range, 22-88 years]; 51 men). In both the validation and external test sets, DL_3D exhibited excellent diagnostic performance (AUCs, 0.895 and 0.913, respectively). In the test set, the DL_3D model showed better performance (AUC, 0.913; 95% CI: 0.854, 0.973) than the DL_2D (AUC, 0.804, 95% CI: 0.722, 0.886; p < 0.001), radiomics (AUC, 0.676, 95% CI: 0.574, 0.777; p < 0.001), and six radiologists (AUCs, 0.692 to 0.810; p value range < 0.001-0.035). The DL_3D model outperforms expert radiologists in distinguishing LUAD from TBM. Question: Can a deep learning model differentiate LUAD from TBM on non-enhanced CT images? Findings: The DL_3D model demonstrated higher diagnostic performance than the DL_2D model, radiomics model, and six radiologists in differentiating LUAD and TBM. Clinical relevance: The DL_3D model could accurately differentiate between LUAD and TBM, which can help clinicians make personalized treatment plans.
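
The AUC values compared in this abstract can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the function name and the example scores are hypothetical and are not data from the study.

```python
def auc_from_scores(positive_scores, negative_scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Counts the fraction of positive/negative pairs in which the positive
    case scores higher; ties count as half a win.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Hypothetical model outputs, for illustration only.
    luad_scores = [0.9, 0.3]  # cases labeled positive
    tbm_scores = [0.5, 0.1]   # cases labeled negative
    print(auc_from_scores(luad_scores, tbm_scores))
```

This quadratic loop is fine for small samples; rank-based implementations are usually preferred for large test sets.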

CT Classification Chest Retrospective Clinical In Silico None Academic Lab Benchmark SOTA

A fully open AI foundation model applied to chest radiography.

Ma D, Pang J, Gotway MB, Liang J

•papers•Jun 11 2025

Chest radiography frequently serves as baseline imaging for most lung diseases [1]. Deep learning has great potential for automating the interpretation of chest radiography [2]. However, existing chest radiographic deep learning models are limited in diagnostic scope, generalizability, adaptability, robustness and extensibility. To overcome these limitations, we have developed Ark+, a foundation model applied to chest radiography and pretrained by cyclically accruing and reusing the knowledge from heterogeneous expert labels in numerous datasets. Ark+ excels in diagnosing thoracic diseases. It expands the diagnostic scope and addresses potential misdiagnosis. It can adapt to evolving diagnostic needs and respond to novel diseases. It can learn rare conditions from a few samples and transfer to new diagnostic settings without training. It tolerates data biases and long-tailed distributions, and it supports federated learning to preserve privacy. All codes and pretrained models have been released, so that Ark+ is open for fine-tuning, local adaptation and improvement. It is extensible to several modalities. Thus, it is a foundation model for medical imaging. The exceptional capabilities of Ark+ stem from our insight: aggregating various datasets diversifies the patient populations and accrues knowledge from many experts to yield unprecedented performance while reducing annotation costs [3]. The development of Ark+ reveals that open models trained by accruing and reusing knowledge from heterogeneous expert annotations with a multitude of public (big or small) datasets can surpass the performance of proprietary models trained on large data. We hope that our findings will inspire more researchers to share code and datasets or federate privacy-preserving data to create open foundation models with diverse, global expertise and patient populations, thus accelerating open science and democratizing AI for medicine.

X-Ray Classification Chest Methodology In Silico None Academic Lab Open Code Open Dataset Benchmark SOTA

Real-World Diagnostic Performance and Clinical Utility of Artificial-Intelligence-Assisted Interpretation for Detection of Lung Metastasis on CT in Patients With Colorectal Cancer.

Jang S, Kim J, Lee JS, Jeong Y, Nam JG, Kim J, Lee KW

•papers•Jun 11 2025

Background: Studies of artificial intelligence (AI) for lung nodule detection on CT have primarily been conducted in investigational settings and/or focused on lung cancer screening. Objective: To evaluate the impact of AI assistance on radiologists' diagnostic performance for detecting lung metastases on chest CT in patients with colorectal cancer (CRC) in real-world clinical practice and to assess the clinical utility of AI assistance in this setting. Methods: This retrospective study included patients with CRC who underwent chest CT as surveillance for lung metastasis from May 2020 to December 2020 (conventional interpretation) or May 2022 to December 2022 (AI-assisted interpretation). Between periods, the institution implemented a commercial AI lung nodule detection system. During the second period, radiologists interpreted examinations concurrently with AI-generated reports, using clinical judgment regarding whether to report AI-detected nodules. The reference standard for metastasis incorporated pathologic and clinical follow-up criteria. Diagnostic performance (sensitivity, specificity, accuracy), and clinical utility (diagnostic yield, false-referral rate, management changes after positive reports) were compared between groups based on clinical radiology reports. Net benefit was estimated using decision curve analysis equation. Standalone AI interpretation was evaluated. Results: The conventional interpretation group included 647 patients (mean age, 64±11 years; 394 men, 253 women; metastasis prevalence, 4.3%); AI-assisted interpretation group included 663 patients (mean age, 63±12 years; 381 men, 282 women; metastasis prevalence, 4.4%). 
The AI-assisted interpretation group compared with the conventional interpretation group showed higher sensitivity (72.4% vs 32.1%; p=.008), accuracy (98.5% vs 96.0%; p=.005), and frequency of management changes (55.2% vs 25.0%, p=.02), without significant difference in specificity (99.7% vs 98.9%; p=.11), diagnostic yield (3.2% vs 1.4%, p=.30) or false-referral rate (0.3% vs 1.1%, p=.10). AI-assisted interpretation had positive estimated net benefit across outcome ratios. Standalone AI correctly detected metastasis in 24 of 29 patients but had 381 false-positive detections in 634 patients without metastasis; only one AI false-positive was reported as positive by interpreting radiologists. Conclusion: AI assistance yielded increased sensitivity, accuracy, and frequency of management changes, without significantly changed specificity. False-positive AI results minimally impacted radiologists' interpretations. Clinical Impact: The findings support clinical utility of AI assistance for CRC metastasis surveillance.
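
To make the reported percentages and the decision curve net benefit concrete, the sketch below recomputes them from a 2x2 table. The counts (21 true positives, 8 false negatives, 2 false positives, 632 true negatives) are an assumption: they are back-calculated from the percentages and group sizes in the abstract, not stated in it. The threshold probability passed to `net_benefit` is likewise hypothetical, since the abstract does not list the outcome ratios the authors used.

```python
def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
    }


def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Standard decision curve net benefit at a threshold probability.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


if __name__ == "__main__":
    # Assumed counts for the AI-assisted group (663 patients), back-calculated
    # from the reported sensitivity, specificity and prevalence.
    tp, fn, fp, tn = 21, 8, 2, 632
    for name, value in diagnostic_metrics(tp, fn, fp, tn).items():
        print(f"{name}: {100 * value:.1f}%")
    # Hypothetical threshold probability of 5%.
    print(f"net benefit: {net_benefit(tp, fp, tp + fn + fp + tn, 0.05):.4f}")
```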

CT Detection Chest Retrospective Clinical Clinical Pilot None Academic Lab

Predicting pragmatic language abilities from brain structural MRI in preschool children with ASD by NBS-Predict.

Qian L, Ding N, Fang H, Xiao T, Sun B, Gao H, Ke X

•papers•Jun 11 2025

Pragmatics plays a crucial role in effectively conveying messages across various social communication contexts. This aspect is frequently highlighted in the challenges experienced by children diagnosed with autism spectrum disorder (ASD). Notably, there remains a paucity of research investigating how the structural connectome (SC) predicts pragmatic language abilities within this population. Using diffusion tensor imaging (DTI) and deterministic tractography, we constructed the whole-brain white matter structural network (WMSN) in a cohort comprising 92 children with ASD and 52 typically developing (TD) preschoolers, matched for age and gender. We employed network-based statistic (NBS)-Predict, a novel methodology that integrates machine learning (ML) with NBS, to identify dysconnected subnetworks associated with ASD, and then to predict pragmatic language abilities based on the SC derived from the whole-brain WMSN in the ASD group. Initially, NBS-Predict identified a subnetwork characterized by 42 reduced connections across 37 brain regions (p = 0.01), achieving a highest classification accuracy of 79.4% (95% CI: 0.791 ~ 0.796). The dysconnected regions were predominantly localized within the brain's frontotemporal and subcortical areas, with the right superior medial frontal gyrus (SFGmed.R) emerging as the region exhibiting the most extensive disconnection. Moreover, NBS-Predict demonstrated that the optimal correlation coefficient between the predicted pragmatic language scores and the actual measured scores was 0.220 (95% CI: 0.174 ~ 0.265). This analysis revealed a significant association between the pragmatic language abilities of the ASD cohort and the white matter connections linking the SFGmed.R with the bilateral anterior cingulate gyrus (ACG). In summary, our findings suggest that the subnetworks displaying the most significant abnormal connections were concentrated in the frontotemporal and subcortical regions among the ASD group. 
Furthermore, the observed abnormalities in the white matter connection pathways between the SFGmed.R and ACG may underlie the neurobiological basis for pragmatic language deficits in preschool children with ASD.

MRI Classification Neurological Retrospective Clinical In Silico None Academic Lab

Using a Large Language Model for Breast Imaging Reporting and Data System Classification and Malignancy Prediction to Enhance Breast Ultrasound Diagnosis: Retrospective Study.

Miaojiao S, Xia L, Xian Tao Z, Zhi Liang H, Sheng C, Songsong W

•papers•Jun 11 2025

Breast ultrasound is essential for evaluating breast nodules, with Breast Imaging Reporting and Data System (BI-RADS) providing standardized classification. However, interobserver variability among radiologists can affect diagnostic accuracy. Large language models (LLMs) like ChatGPT-4 have shown potential in medical imaging interpretation. This study explores its feasibility in improving BI-RADS classification consistency and malignancy prediction compared to radiologists. This study aims to evaluate the feasibility of using LLMs, particularly ChatGPT-4, to assess the consistency and diagnostic accuracy of standardized breast ultrasound imaging reports, using pathology as the reference standard. This retrospective study analyzed breast nodule ultrasound data from 671 female patients (mean 45.82, SD 9.20 years; range 26-75 years) who underwent biopsy or surgical excision at our hospital between June 2019 and June 2024. ChatGPT-4 was used to interpret BI-RADS classifications and predict benign versus malignant nodules. The study compared the model's performance to that of two senior radiologists (≥15 years of experience) and two junior radiologists (<5 years of experience) using key diagnostic metrics, including accuracy, sensitivity, specificity, area under the receiver operating characteristic curve, P values, and odds ratios with 95% CIs. Two diagnostic models were evaluated: (1) image interpretation model, where ChatGPT-4 classified nodules based on BI-RADS features, and (2) image-to-text-LLM model, where radiologists provided textual descriptions, and ChatGPT-4 determined malignancy probability based on keywords. Radiologists were blinded to pathological outcomes, and BI-RADS classifications were finalized through consensus. ChatGPT-4 achieved an overall BI-RADS classification accuracy of 96.87%, outperforming junior radiologists (617/671, 91.95% and 604/671, 90.01%, P<.01). 
For malignancy prediction, ChatGPT-4 achieved an area under the receiver operating characteristic curve of 0.82 (95% CI 0.79-0.85), an accuracy of 80.63% (541/671 cases), a sensitivity of 90.56% (259/286 cases), and a specificity of 73.51% (283/385 cases). The image interpretation model demonstrated performance comparable to senior radiologists, while the image-to-text-LLM model further improved diagnostic accuracy for all radiologists, increasing their sensitivity and specificity significantly (P<.001). Statistical analyses, including the McNemar test and DeLong test, confirmed that ChatGPT-4 outperformed junior radiologists (P<.01) and showed noninferiority compared to senior radiologists (P>.05). Pathological diagnoses served as the reference standard, ensuring robust evaluation reliability. Integrating ChatGPT-4 into an image-to-text-LLM workflow improves BI-RADS classification accuracy and supports radiologists in breast ultrasound diagnostics. These results demonstrate its potential as a decision-support tool to enhance diagnostic consistency and reduce variability.
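
The McNemar test mentioned above compares two readers on the same cases by looking only at discordant pairs, that is, cases that one reader classified correctly and the other did not. A minimal sketch of the exact (binomial) version follows. The function name and the example counts are hypothetical; the abstract does not report the discordant counts or say which variant of the test was used.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: cases reader A got right and reader B got wrong
    c: cases reader B got right and reader A got wrong
    Under the null hypothesis, min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Hypothetical discordant counts, for illustration only.
    print(mcnemar_exact_p(b=30, c=12))
```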

Ultrasound Classification Breast Retrospective Clinical In Silico None Academic Lab GenAI

A Multi-Resolution Hybrid CNN-Transformer Network With Scale-Guided Attention for Medical Image Segmentation.

Zhu S, Li Y, Dai X, Mao T, Wei L, Yan Y

•papers•Jun 11 2025

Medical image segmentation remains a challenging task due to the intricate nature of anatomical structures and the wide range of target sizes. In this paper, we propose a novel U-shaped segmentation network that integrates CNN and Transformer architectures to address these challenges. Specifically, our network architecture consists of three main components. In the encoder, we integrate an attention-guided multi-scale feature extraction module with a dual-path downsampling block to learn hierarchical features. The decoder employs an advanced feature aggregation and fusion module that effectively models inter-dependencies across different hierarchical levels. For the bottleneck, we explore multi-scale feature activation and multi-layer context Transformer modules to facilitate high-level semantic feature learning and global context modeling. Additionally, we implement a multi-resolution input-output strategy throughout the network to enrich feature representations and ensure fine-grained segmentation outputs across different scales. The experimental results on diverse multi-modal medical image datasets (ultrasound, gastrointestinal polyp, MR, and CT images) demonstrate that our approach can achieve superior performance over state-of-the-art methods in both quantitative measurements and qualitative assessments. The code is available at https://github.com/zsj0577/MSAGHNet.

Mixed Modality Segmentation Other Methodology In Silico None Academic Lab Open Code

Towards Practical Alzheimer's Disease Diagnosis: A Lightweight and Interpretable Spiking Neural Model

Changwei Wu, Yifei Chen, Yuxin Du, Jinying Zong, Jie Dong, Mingxuan Liu, Yong Peng, Jin Fan, Feiwei Qin, Changmiao Wang

•preprint•Jun 11 2025

Early diagnosis of Alzheimer's Disease (AD), especially at the mild cognitive impairment (MCI) stage, is vital yet hindered by subjective assessments and the high cost of multimodal imaging modalities. Although deep learning methods offer automated alternatives, their energy inefficiency and computational demands limit real-world deployment, particularly in resource-constrained settings. As a brain-inspired paradigm, spiking neural networks (SNNs) are inherently well-suited for modeling the sparse, event-driven patterns of neural degeneration in AD, offering a promising foundation for interpretable and low-power medical diagnostics. However, existing SNNs often suffer from weak expressiveness and unstable training, which restrict their effectiveness in complex medical tasks. To address these limitations, we propose FasterSNN, a hybrid neural architecture that integrates biologically inspired LIF neurons with region-adaptive convolution and multi-scale spiking attention. This design enables sparse, efficient processing of 3D MRI while preserving diagnostic accuracy. Experiments on benchmark datasets demonstrate that FasterSNN achieves competitive performance with substantially improved efficiency and stability, supporting its potential for practical AD screening. Our source code is available at https://github.com/wuchangw/FasterSNN.
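
For readers unfamiliar with the LIF (leaky integrate-and-fire) neurons mentioned above, the sketch below shows a generic textbook discrete-time LIF update: the membrane potential leaks toward the input, and the neuron emits a spike and resets once a threshold is reached. It is not FasterSNN's implementation; the function name, update rule, and parameter values are illustrative assumptions.

```python
def lif_simulate(input_current, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    At each step the membrane potential v moves toward the input by a
    fraction 1/tau. When v reaches v_threshold the neuron emits a spike (1)
    and v is set back to v_reset; otherwise the output is 0.
    """
    v = v_reset
    spikes = []
    for current in input_current:
        v = v + (current - v) / tau  # leaky integration
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset              # hard reset after a spike
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    # A constant input of 1.5 makes the neuron fire on every second step.
    print(lif_simulate([1.5, 1.5, 1.5, 1.5]))
```

The output is a sparse binary spike train, which is the property the abstract ties to efficient, low-power processing.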

MRI Classification Neurological Methodology In Silico None Academic Lab Open Code

CINeMA: Conditional Implicit Neural Multi-Modal Atlas for a Spatio-Temporal Representation of the Perinatal Brain

Maik Dannecker, Vasiliki Sideri-Lampretsa, Sophie Starck, Angeline Mihailov, Mathieu Milh, Nadine Girard, Guillaume Auzias, Daniel Rueckert

•preprint•Jun 11 2025

Magnetic resonance imaging of fetal and neonatal brains reveals rapid neurodevelopment marked by substantial anatomical changes unfolding within days. Studying this critical stage of the developing human brain, therefore, requires accurate brain models, referred to as atlases, of high spatial and temporal resolution. To meet these demands, established traditional atlases and recently proposed deep learning-based methods rely on large and comprehensive datasets. This poses a major challenge for studying brains in the presence of pathologies for which data remains scarce. We address this limitation with CINeMA (Conditional Implicit Neural Multi-Modal Atlas), a novel framework for creating high-resolution, spatio-temporal, multimodal brain atlases, suitable for low-data settings. Unlike established methods, CINeMA operates in latent space, avoiding compute-intensive image registration and reducing atlas construction times from days to minutes. Furthermore, it enables flexible conditioning on anatomical features including gestational age (GA), birth age, and pathologies like ventriculomegaly (VM) and agenesis of the corpus callosum (ACC). CINeMA supports downstream tasks such as tissue segmentation and age prediction, whereas its generative properties enable synthetic data creation and anatomically informed data augmentation. Surpassing state-of-the-art methods in accuracy, efficiency, and versatility, CINeMA represents a powerful tool for advancing brain research. We release the code and atlases at https://github.com/m-dannecker/CINeMA.

MRI Image Synthesis Neurological Methodology In Silico None Academic Lab Open Code GenAI

HSENet: Hybrid Spatial Encoding Network for 3D Medical Vision-Language Understanding

Yanzhao Shi, Xiaodan Zhang, Junzhong Ji, Haoning Jiang, Chengxin Zheng, Yinong Wang, Liangqiong Qu

•preprint•Jun 11 2025

Automated 3D CT diagnosis empowers clinicians to make timely, evidence-based decisions by enhancing diagnostic accuracy and workflow efficiency. While multimodal large language models (MLLMs) exhibit promising performance in visual-language understanding, existing methods mainly focus on 2D medical images, which fundamentally limits their ability to capture complex 3D anatomical structures. This limitation often leads to misinterpretation of subtle pathologies and causes diagnostic hallucinations. In this paper, we present Hybrid Spatial Encoding Network (HSENet), a framework that exploits enriched 3D medical visual cues by effective visual perception and projection for accurate and robust vision-language understanding. Specifically, HSENet employs dual-3D vision encoders to perceive both global volumetric contexts and fine-grained anatomical details, which are pre-trained by dual-stage alignment with diagnostic reports. Furthermore, we propose Spatial Packer, an efficient multimodal projector that condenses high-resolution 3D spatial regions into a compact set of informative visual tokens via centroid-based compression. By assigning spatial packers with dual-3D vision encoders, HSENet can seamlessly perceive and transfer hybrid visual representations to LLM's semantic space, facilitating accurate diagnostic text generation. Experimental results demonstrate that our method achieves state-of-the-art performance in 3D language-visual retrieval (39.85% of R@100, +5.96% gain), 3D medical report generation (24.01% of BLEU-4, +8.01% gain), and 3D visual question answering (73.60% of Major Class Accuracy, +1.99% gain), confirming its effectiveness. Our code is available at https://github.com/YanzhaoShi/HSENet.
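
The retrieval result above is reported as R@100 (recall at K). A minimal sketch of the common definition follows, assuming each query has exactly one correct item; the function name and toy data are hypothetical, and the abstract does not describe HSENet's exact evaluation protocol.

```python
def recall_at_k(ranked_ids_per_query, correct_id_per_query, k):
    """Fraction of queries whose correct item appears in the top-k results.

    ranked_ids_per_query: one ranked list of candidate ids per query
    correct_id_per_query: the single correct id for each query
    """
    hits = sum(
        1
        for ranked, correct in zip(ranked_ids_per_query, correct_id_per_query)
        if correct in ranked[:k]
    )
    return hits / len(correct_id_per_query)


if __name__ == "__main__":
    # Toy example: two report queries, each ranking three CT volumes.
    rankings = [["ct_a", "ct_b", "ct_c"], ["ct_c", "ct_a", "ct_b"]]
    correct = ["ct_b", "ct_b"]
    print(recall_at_k(rankings, correct, k=2))
```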

CT LLM Radiology Report Whole Body Methodology In Silico None Academic Lab GenAI Open Code
