Latest Papers on Radiology AI. Tags: Classification

Does Bigger Mean Better? Comparative Analysis of CNNs and Biomedical Vision Language Models in Medical Diagnosis

Ran Tong, Jiaqi Liu, Su Liu, Jiexi Xu, Lanruo Wang, Tong Wang

•preprint•Oct 1 2025

The accurate interpretation of chest radiographs using automated methods is a critical task in medical imaging. This paper presents a comparative analysis between a supervised lightweight Convolutional Neural Network (CNN) and a state-of-the-art, zero-shot medical Vision-Language Model (VLM), BiomedCLIP, across two distinct diagnostic tasks: pneumonia detection on the PneumoniaMNIST benchmark and tuberculosis detection on the Shenzhen TB dataset. Our experiments show that supervised CNNs serve as highly competitive baselines in both cases. While the default zero-shot performance of the VLM is lower, we demonstrate that its potential can be unlocked via a simple yet crucial remedy: decision threshold calibration. By optimizing the classification threshold on a validation set, the performance of BiomedCLIP is significantly boosted across both datasets. For pneumonia detection, calibration enables the zero-shot VLM to achieve a superior F1-score of 0.8841, surpassing the supervised CNN's 0.8803. For tuberculosis detection, calibration dramatically improves the F1-score from 0.4812 to 0.7684, bringing it close to the supervised baseline's 0.7834. This work highlights a key insight: proper calibration is essential for leveraging the full diagnostic power of zero-shot VLMs, enabling them to match or even outperform efficient, task-specific supervised models.
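
The calibration step described in this abstract can be illustrated with a minimal sketch. It assumes the zero-shot model yields one positive-class score per image and that the threshold is chosen to maximize F1 on the validation set; the abstract does not state which objective the authors optimized, and the function names and toy data below are hypothetical.

```python
def f1_score(y_true, y_pred):
    """Binary F1 from 0/1 labels and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def calibrate_threshold(val_scores, val_labels):
    """Return (threshold, F1) that maximizes F1 on the validation set.

    Assumption: F1 is the calibration objective; every observed score
    is tried as a candidate cut-off (predict positive when score >= t).
    """
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(val_scores)):
        preds = [1 if s >= t else 0 for s in val_scores]
        f1 = f1_score(val_labels, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical validation scores that are skewed low, so the default
# cut-off of 0.5 misses most positives.
val_scores = [0.10, 0.15, 0.22, 0.30, 0.35, 0.41, 0.45, 0.62]
val_labels = [0, 0, 0, 1, 1, 1, 1, 1]

default_f1 = f1_score(val_labels, [1 if s >= 0.5 else 0 for s in val_scores])
threshold, calibrated_f1 = calibrate_threshold(val_scores, val_labels)
print(f"default F1={default_f1:.3f}, threshold={threshold}, calibrated F1={calibrated_f1:.3f}")
```

The chosen threshold would then be frozen and applied unchanged to the test set, so that the reported test F1 is not tuned on test data.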

X-Ray Classification Chest Methodology In Silico Benchmark SOTA

Dissecting real-world memory clinical cohort heterogeneity: analysis of neuroanatomical subtypes using HYDRA.

An GZ, Xie Y, Benzinger TLS, Gordon BA, Sotiras A

•papers•Oct 1 2025

There is significant evidence for neuroanatomical heterogeneity in neurodegenerative disorders, which has been demonstrated predominantly through analyses of well-characterized research cohorts. Despite the known diversity in clinical presentations among patients attending memory clinics, studies exploring neuroanatomical heterogeneity in such clinically diverse groups remain sparse. To address this gap, we applied the semi-supervised Heterogeneity through Discriminative Analysis (HYDRA) (Neuroimage 145:346-364 2017) machine learning method to magnetic resonance imaging (MRI) data from the Open Access Series of Imaging Studies (OASIS) (NeuroImage 26:102248 2020) to uncover patterns of neurostructural heterogeneity in memory clinic attendees. Cross-validation was used to assess clustering stability via the Adjusted Rand Index (ARI), Silhouette Score, and Calinski-Harabasz Index (CHI). We performed survival analyses using Kaplan-Meier curves and mixed-effects models for longitudinal cognitive data (e.g., memory, executive function, and language assessments) to examine differences in disease progression. Cross-validation analyses indicated two highly stable subtypes of cognitively impaired individuals (ARI = 0.552), exhibiting significant neuroanatomical differences. Subtype 1, termed the Temporal-Sparing Atrophy (TSA) Subtype, was defined by relatively mild atrophy, especially in temporal areas, with slower cognitive decline and preserved function across most domains. Subtype 2, termed the Temporal-Parietal Predominated Atrophy (TPPA) Subtype, was marked by notable alterations in areas critically affected in neurodegenerative disorders. These included key areas critical for executive function and memory, such as the frontal, temporal, and parietal cortices including the precuneus. Longitudinal analysis of neuroimaging and cognitive data revealed contrasting trajectories. The TSA Subtype demonstrated a gradual decline in cognitive functions over time, particularly on memory-focused assessments. Conversely, the TPPA Subtype exhibited a more severe decline in these functions. This research illustrates that neurodegenerative diseases present a spectrum of structural brain changes rather than uniform pathology, suggesting that future research may benefit from stratified therapeutic approaches and targeted recruitment strategies for clinical trials. By leveraging detailed clinical assessments and longitudinal data, including uncertain diagnoses and Clinical Dementia Rating (CDR) scores, this study contributes to a better characterization of memory clinic populations, which could help optimize interventions.
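
For readers unfamiliar with the Adjusted Rand Index used above as a stability measure, the following is a minimal pure-Python sketch of the standard pair-counting formula. It is not the authors' pipeline; how cluster assignments were paired across cross-validation folds is not described in the abstract, and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two clusterings of the same items.

    1.0 means identical partitions (up to relabeling); values near 0
    mean agreement no better than chance.
    """
    n = len(labels_a)
    if n < 2:
        return 1.0  # a single item admits only one partition

    # Pairs of items co-clustered in both, in A only, and in B only.
    sum_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)  # chance-level agreement
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both clusterings use one cluster
    return (sum_both - expected) / (max_index - expected)


# Hypothetical subtype assignments from two resampling runs.
run_1 = [1, 1, 1, 2, 2, 2, 2, 1]
run_2 = [2, 2, 2, 1, 1, 1, 2, 2]  # same grouping except one subject, labels swapped
print(round(adjusted_rand_index(run_1, run_2), 3))
```

Because the index is invariant to label permutation, it suits clustering stability checks where "Subtype 1" in one run may be numbered "Subtype 2" in another.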

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab

Deep Learning-Based CAD System for Enhanced Breast Lesion Classification and Grading Using RFTSDP Approach.

Ghehi EN, Fallah A, Rashidi S, Dastjerdi MM

•papers•Oct 1 2025

Accurate detection of breast lesion type is crucial for optimizing treatment; however, due to the limited precision of current diagnostic methods, biopsies are often required. To address this limitation, we proposed radio frequency time series dynamic processing (RFTSDP) in 2020, which analyzes the dynamic response of tissue and the impact of scatterer displacement on RF echoes during controlled stimulations to enhance diagnostic information. We developed a vibration-generating device and collected ultrafast ultrasound data from 11 ex vivo breast tissue samples under different stimulations. Deep learning (DL) was used for automated feature extraction and lesion classification into 2, 3, and 5 categories. The performance of the convolutional neural network (CNN)-based RFTSDP method was compared with traditional machine learning techniques, which involved spectral and nonlinear feature extraction from RF time series, followed by a support vector machine (SVM). With 65 Hz vibration, the DL-based RFTSDP method achieved 99.53 ± 0.47% accuracy in classifying and grading breast lesions. CNN consistently outperformed SVM, particularly under vibratory stimulation. In 5-class classification, CNN reached 98.01% versus 95.64% for SVM, with the difference being statistically significant (P < .05). Furthermore, the CNN-based RFTSDP method showed a 28.67% improvement in classification accuracy compared to the non-stimulation condition and the analysis of focused raw data. We developed a DL-based CAD system capable of classifying and grading breast lesions. This study demonstrates that the proposed system not only enhances classification but also ensures increased stability and robustness compared to traditional methods.

Ultrasound Classification Breast Methodology In Silico Academic Lab

Automated machine learning for prostate cancer detection and Gleason score prediction using T2WI: a diagnostic multi-center study.

Jin L, Ma Z, Gao F, Li M, Li H, Geng D

•papers•Oct 1 2025

Prostate cancer (PCa) is one of the most common malignancies in men, and accurate assessment of tumor aggressiveness is crucial for treatment planning. The Gleason score (GS) remains the gold standard for risk stratification, yet it relies on invasive biopsy, which has inherent risks and sampling errors. The aim of this study was to detect PCa and non-invasively predict the GS for the early detection and stratification of clinically significant cases. We used single-modality T2-weighted imaging (T2WI) with an automatic machine-learning (ML) approach, MLJAR. The internal dataset comprised PCa patients who underwent magnetic resonance imaging (MRI) examinations at our hospital from September 2015 to June 2022 prior to prostate biopsy, surgery, radiotherapy, and endocrine therapy and whose examinations resulted in pathological findings. An external dataset from another medical center and a public challenge dataset were used for external validation. The Kolmogorov-Smirnov curve was used to evaluate the risk-differentiation ability of the PCa detection model. The area under the receiver operating characteristic curve (AUC) was calculated with confidence intervals to compare the model performance. The internal MRI dataset included 198 non-PCa and 291 PCa patients with histopathological results obtained through biopsy or surgery. External and public challenge datasets included 45 and 68 PCa patients, respectively. AUC for PCa detection in the internal-testing cohort (n = 147, PCa = 78) was 0.99. 
For GS prediction, AUCs were GS = 3 + 3 (0.97), GS = 3 + 4 (0.97), GS = 3 + 5 (1.0), GS = 4 + 3 (0.87), GS = 4 + 4 (0.91), GS = 4 + 5 (0.95), GS = 5 + 4 (1.0), and GS = 5 + 5 (0.99) in the internal-testing cohort (PCa = 88); GS = 3 + 3 (0.95), GS = 3 + 4 (0.76), GS = 3 + 5 (0.77), GS = 4 + 3 (0.88), GS = 4 + 4 (0.82), GS = 4 + 5 (0.87), GS = 5 + 4 (0.95), and GS = 5 + 5 (0.85) in the external-testing cohort (PCa = 45); and GS = 3 + 4 (0.89), GS = 4 + 3 (0.75), GS = 4 + 4 (0.65), and GS = 4 + 5 (0.91) in the public challenge cohort (PCa = 68). This multi-center study shows that an auto-ML model using only T2WI can accurately detect PCa and predict Gleason scores non-invasively, offering potential to reduce biopsy reliance and improve early risk stratification. These results warrant further validation and exploration for integration into clinical workflows.
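
The Kolmogorov-Smirnov evaluation mentioned in this abstract is commonly summarized by a single statistic: the largest gap between the cumulative score distributions of the two classes. The sketch below assumes that usual definition (maximum absolute difference between true-positive and false-positive rates over all thresholds); the abstract does not specify how the curve was constructed, and the scores are hypothetical.

```python
def ks_statistic(scores, labels):
    """Maximum |TPR - FPR| over all score thresholds (binary labels 0/1)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    best = 0.0
    for t in sorted(set(scores)):
        # Fraction of each class scored at or above the threshold t.
        tpr = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t) / n_pos
        fpr = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t) / n_neg
        best = max(best, abs(tpr - fpr))
    return best


# Hypothetical model scores for non-PCa (0) and PCa (1) cases.
scores = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]
print(ks_statistic(scores, labels))
```

A value of 1.0 means some threshold separates the classes perfectly, while a value near 0 means the score distributions overlap almost completely.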

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab

Machine learning combined with CT-based radiomics predicts the prognosis of oesophageal squamous cell carcinoma.

Liu M, Lu R, Wang B, Fan J, Wang Y, Zhu J, Luo J

•papers•Oct 1 2025

This retrospective study aims to develop a machine learning model integrating preoperative CT radiomics and clinicopathological data to predict 3-year recurrence and recurrence patterns in postoperative oesophageal squamous cell carcinoma. Tumour regions were segmented using 3D-Slicer, and radiomic features were extracted via Python. LASSO regression selected prognostic features for model integration. Clinicopathological data included tumour length, lymph node positivity, differentiation grade, and neurovascular infiltration. A machine learning model was then established by combining the selected imaging features with the clinicopathological data, and its performance was validated. A nomogram was constructed for survival prediction, and risk stratification was carried out using the predictions of the machine learning model and the nomogram. Survival analysis was performed for stage-based patient subgroups across risk stratifications to identify cohorts that benefit from adjuvant therapy. Patients were randomly divided in a 7:3 ratio into a training cohort of 368 patients and a validation cohort of 158 patients. LASSO regression selected 6 features for recurrence prediction and 9 for recurrence pattern prediction. Among 526 patients (mean age 63; 427 males), the model achieved high accuracy in predicting recurrence (training cohort AUC: 0.826 [logistic regression]/0.820 [SVM]; validation cohort: 0.830/0.825) and recurrence patterns (training: 0.801/0.799; validation: 0.806/0.798). Risk stratification based on the machine learning model and nomogram predictions revealed that adjuvant therapy significantly improved disease-free survival in stage II-III patients with predicted recurrence and low survival (HR 0.372, 95% CI: 0.206-0.669; p < 0.001). Machine learning models exhibit excellent performance in predicting recurrence after surgery for squamous oesophageal cancer.
Radiomic features of contrast-enhanced CT imaging can predict the prognosis of patients with oesophageal squamous cell carcinoma, which in turn can help clinicians stratify risk and screen out patient populations that could benefit from adjuvant therapy, thereby aiding medical decision-making. There is a lack of prognostic models for oesophageal squamous cell carcinoma in current research. The prognostic prediction model that we have developed has high accuracy by combining radiomics features and clinicopathologic data. This model aids in risk stratification of patients and aids clinical decision-making through predictive outcomes.

CT Classification Abdominal Retrospective Clinical In Silico Academic Lab

Current and novel approaches for critical care management of aneurysmal subarachnoid hemorrhage in critical care.

Zoumprouli A, Carden R, Bilotta F

•papers•Oct 1 2025

This review highlights recent advancements and evidence-based approaches in the critical care management of aneurysmal subarachnoid hemorrhage (aSAH), focusing on developments from the past 18 months. It addresses key challenges [rebleeding prevention, delayed cerebral ischemia (DCI), hydrocephalus, transfusion strategies, and temperature management], emphasizing multidisciplinary care and personalized treatment. Recent studies underscore the importance of systolic blood pressure control (<160 mmHg) to reduce rebleeding risk before aneurysm securing. Novel prognostic tools, including the modified 5-item frailty index and quantitative imaging software, show promise in improving outcome prediction. Prophylactic lumbar drainage may reduce DCI and improve neurological outcomes, while milrinone and computed tomography perfusion-guided therapies are being explored for vasospasm management. Transfusion strategies suggest a hemoglobin threshold of 9 g/dl may optimize outcomes. Temperature management remains contentious, but consensus recommends maintaining normothermia (36.0-37.5 °C) with continuous monitoring. Advances in aSAH care emphasize precision medicine, leveraging technology [e.g. Artificial intelligence (AI), quantitative imaging], and multidisciplinary collaboration. Key unresolved questions warrant multicenter trials to validate optimal blood pressure, transfusion, and temperature targets alongside emerging therapies for DCI.

CT Classification Neurological Review In Silico GenAI

Designing a web-based application for computer-aided diagnosis of intraosseous jaw lesions and assessment of its diagnostic accuracy.

Mohammadnezhad M, Dalili Kajan Z, Hami Razavi A

•papers•Oct 1 2025

This study aimed to design a web-based application for computer-aided diagnosis (CADx) of intraosseous jaw lesions, and assess its diagnostic accuracy. In this diagnostic test study, a web-based application was designed for CADx of 19 types of intraosseous jaw lesions. To assess its diagnostic accuracy, clinical and radiographic information of 95 cases with confirmed histopathological diagnosis of intraosseous jaw lesions were retrieved from hospital archives and published literature and imported to the application by a senior dental student. The top-N accuracy, kappa value, and Brier score were calculated, and the sensitivity, specificity, positive (PPV) and negative (NPV) predictive values, and the area under the receiver operating characteristic (ROC) curve (AUC) were calculated separately for each lesion according to DeLong et al. In assessment of top-N accuracy, the designed application gave a correct differential diagnosis in 93 cases (97.89%); the correct diagnosis was at the top of the list of differential diagnoses in 78 cases (82.10%); these values were 85 (89.47%) and 67 (70.52%) for an oral radiologist. The kappa value was 0.53. The Brier score for the prevalence match was 0.18, and the pattern match was 0.15. The results highlighted the optimally high diagnostic accuracy of the designed application, indicating that it may be reliably used for CADx of intraosseous jaw lesions, if given accurate data.
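
Top-N accuracy, as reported in this abstract, counts a case as correct when the confirmed diagnosis appears among the first N entries of the ranked differential list. A minimal sketch follows; the lesion labels are placeholders, since the abstract does not name the 19 lesion types, and the function name is hypothetical.

```python
def top_n_accuracy(ranked_lists, truths, n):
    """Fraction of cases whose true diagnosis is in the first n ranked entries."""
    if not truths:
        raise ValueError("at least one case is required")
    hits = sum(
        1 for ranked, truth in zip(ranked_lists, truths) if truth in ranked[:n]
    )
    return hits / len(truths)


# Placeholder differential lists for three hypothetical cases,
# each ordered from most to least likely.
ranked_lists = [
    ["lesion_a", "lesion_b", "lesion_c"],
    ["lesion_b", "lesion_a", "lesion_c"],
    ["lesion_c", "lesion_b", "lesion_a"],
]
truths = ["lesion_a", "lesion_a", "lesion_a"]

for n in (1, 2, 3):
    print(n, round(top_n_accuracy(ranked_lists, truths, n), 3))
```

With N = 1 this reduces to ordinary accuracy (the "top of the list" figure in the abstract), while a larger N corresponds to the broader "correct differential diagnosis" figure.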

X-Ray Classification Retrospective Clinical In Silico Academic Lab

Machine Learning-Based Detection of EGFR Mutation and HER2 Overexpression in Metastatic Brain Adenocarcinoma: Systematic Review and Meta-Analysis.

Gholami Chahkand MS, Karimi MA, Aghazadeh-Habashi K, Esmaeilpour Moallem F, Mehrabanpour R, Dadkhah PA, Esmailinia R, Esfandiari N, Azarm E, Rafiei SKS, Asadi Anar M, Shahriari A

•papers•Oct 1 2025

Brain metastases (BMs) are the most common intracranial malignancy, often arising from lung, breast, and melanoma cancers. Receptor tyrosine kinases, such as EGFR and HER2, drive tumor progression and resistance to therapy. Noninvasive detection of these biomarkers, especially in brain metastases, is crucial due to challenges with traditional biopsy methods. This systematic review and meta-analysis assess machine learning (ML)-based models for detecting EGFR mutations and HER2 overexpression in metastatic brain adenocarcinoma using MRI-derived radiomic features. A systematic review and meta-analysis were conducted following PRISMA 2020 guidelines. Studies were identified via PubMed, Scopus, and Web of Science, focusing on ML applications to MRI radiomics for detecting EGFR and HER2 in brain metastases. Data on study design, imaging modality, model type, sample size, and performance metrics were extracted. Subgroup analyses were performed by model type (deep learning vs. classical ML) and sample size (<150 vs. ≥150 participants). A random-effects model was used to pool performance metrics, and risk of bias was assessed using the RoB 2 tool. STATA version 18 and Python 3.10 were used for analyses and visualizations. Of 383 identified studies, 31 (7925 participants) met the inclusion criteria. The pooled analysis showed strong diagnostic performance: AUC = 0.84, accuracy = 0.86, and sensitivity = 0.83. Subgroup analysis revealed higher AUC and accuracy in deep learning models compared with classical ML. Sensitivity analysis also indicated improved AUC in studies with larger sample sizes (≥150), though variability remained. No evidence of heterogeneity or publication bias was detected. ML models demonstrate strong diagnostic performance for detecting EGFR and HER2 in metastatic brain adenocarcinoma, supporting their potential as noninvasive diagnostic tools. 
However, these findings should be interpreted considering methodological heterogeneity and the limited use of external validation. Further prospective, multicenter studies are warranted to confirm their clinical applicability and generalizability.
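
The abstract states that a random-effects model was used to pool performance metrics but does not name the estimator. As an illustration only, the sketch below uses the DerSimonian-Laird method, one common choice; the per-study values are hypothetical, and in practice metrics such as AUC are often transformed (for example to the logit scale) before pooling.

```python
def dersimonian_laird(effects, variances):
    """Pool study estimates with a DerSimonian-Laird random-effects model.

    Returns (pooled_estimate, standard_error, tau_squared), where
    tau_squared is the estimated between-study variance.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) estimate and Cochran's Q.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Method-of-moments estimate of between-study variance, floored at 0.
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2


# Hypothetical per-study AUCs and their variances.
pooled, se, tau2 = dersimonian_laird([0.70, 0.90], [0.01, 0.01])
print(round(pooled, 3), round(se, 3), round(tau2, 3))
```

When the studies agree closely, tau_squared falls to zero and the result coincides with the fixed-effect estimate; larger disagreement widens the pooled standard error.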

MRI Classification Neurological Meta Analysis In Silico Academic Lab

An interpretable hybrid deep learning framework for gastric cancer diagnosis using histopathological imaging.

Ren T, Govindarajan V, Bourouis S, Wang X, Ke S

•papers•Oct 1 2025

The increasing incidence of gastric cancer and the complexity of histopathological image interpretation present significant challenges for accurate and timely diagnosis. Manual assessments are often subjective and time-intensive, leading to a growing demand for reliable, automated diagnostic tools in digital pathology. This study proposes a hybrid deep learning approach combining convolutional neural networks (CNNs) and Transformer-based architectures to classify gastric histopathological images with high precision. The model is designed to enhance feature representation and spatial contextual understanding, particularly across diverse tissue subtypes and staining variations. Three publicly available datasets (GasHisSDB, TCGA-STAD, and NCT-CRC-HE-100K) were utilized to train and evaluate the model. Image patches were preprocessed through stain normalization, augmented using standard techniques, and fed into the hybrid model. The CNN backbone extracts local spatial features, while the Transformer encoder captures global context. Performance was assessed using fivefold cross-validation and evaluated through accuracy, F1-score, AUC, and Grad-CAM-based interpretability. The proposed model achieved a 99.2% accuracy on the GasHisSDB dataset, with a macro F1-score of 0.991 and AUC of 0.996. External validation on TCGA-STAD and NCT-CRC-HE-100K further confirmed the model's robustness. Grad-CAM visualizations highlighted biologically relevant regions, demonstrating interpretability and alignment with expert annotations. This hybrid deep learning framework offers a reliable, interpretable, and generalizable tool for gastric cancer diagnosis. Its superior performance and explainability highlight its clinical potential for deployment in digital pathology workflows.
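
The macro F1-score reported above averages the per-class F1 values with equal weight, so a rare class counts as much as a common one. A minimal sketch of that computation follows; the class names and predictions are hypothetical and are not drawn from the datasets in the abstract.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members contributes 0.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical patch-level labels and predictions.
y_true = ["normal", "normal", "abnormal", "abnormal"]
y_pred = ["normal", "abnormal", "abnormal", "abnormal"]
print(round(macro_f1(y_true, y_pred), 4))
```

Under fivefold cross-validation, a figure like this would typically be computed on each held-out fold and then averaged, though the abstract does not say how the folds were aggregated.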

OCT Classification Abdominal Methodology In Silico Benchmark SOTA

CardioBench: Do Echocardiography Foundation Models Generalize Beyond the Lab?

Darya Taratynova, Ahmed Aly, Numan Saeed, Mohammad Yaqub

•preprint•Oct 1 2025

Foundation models (FMs) are reshaping medical imaging, yet their application in echocardiography remains limited. While several echocardiography-specific FMs have recently been introduced, no standardized benchmark exists to evaluate them. Echocardiography poses unique challenges, including noisy acquisitions, high frame redundancy, and limited public datasets. Most existing solutions evaluate on private data, restricting comparability. To address this, we introduce CardioBench, a comprehensive benchmark for echocardiography FMs. CardioBench unifies eight publicly available datasets into a standardized suite spanning four regression and five classification tasks, covering functional, structural, diagnostic, and view recognition endpoints. We evaluate several leading FMs, including cardiac-specific, biomedical, and general-purpose encoders, under consistent zero-shot, probing, and alignment protocols. Our results highlight complementary strengths across model families: temporal modeling is critical for functional regression, retrieval provides robustness under distribution shift, and domain-specific text encoders capture physiologically meaningful axes. General-purpose encoders transfer strongly and often close the gap with probing, but struggle with fine-grained distinctions like view classification and subtle pathology recognition. By releasing preprocessing, splits, and public evaluation pipelines, CardioBench establishes a reproducible reference point and offers actionable insights to guide the design of future echocardiography foundation models.

Ultrasound Classification Cardiac Dataset Release In Silico Academic Lab Benchmark SOTA Open Dataset Reproducibility
