
Evaluating GPT-4o for emergency disposition of complex respiratory cases with pulmonology consultation: a diagnostic accuracy study.

Yıldırım C, Aykut A, Günsoy E, Öncül MV

PubMed paper · Oct 2, 2025
Large Language Models (LLMs) such as GPT-4o are increasingly investigated for clinical decision support in emergency medicine, but their real-world performance in disposition prediction remains insufficiently studied. This study evaluated the diagnostic accuracy of GPT-4o in predicting ED disposition (discharge, ward admission, or ICU admission) in complex emergency respiratory cases requiring pulmonology consultation and chest CT, a selective high-acuity subgroup of ED patients rather than the general ED respiratory population. We conducted a retrospective observational study in a tertiary ED between November 2024 and February 2025. GPT-4o was prompted to predict the most appropriate ED disposition using three progressively enriched input models: Model 1 (age, sex, oxygen saturation, home oxygen therapy, and venous blood gas parameters); Model 2 (Model 1 plus laboratory data); and Model 3 (Model 2 plus chest CT findings). Model performance was assessed using accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score. Among the 221 patients included, 69.2% were admitted to the ward, 9.0% to the intensive care unit (ICU), and 21.7% were discharged. For hospital admission prediction, Model 3 demonstrated the highest sensitivity (91.9%) and overall accuracy (76.5%) but the lowest specificity (20.8%). Conversely, for discharge prediction, Model 3 achieved the highest specificity (91.9%) but the lowest sensitivity (20.8%). Numerical improvements were observed across models, but none reached statistical significance (all p > 0.22); Model 1 therefore performed comparably to Models 2 and 3 while requiring less input. Among patients who were discharged despite GPT-4o predicting admission, the 14-day ED re-presentation rates were 23.8% (5/21) for Model 1, 30.0% (9/30) for Model 2, and 28.9% (11/38) for Model 3. GPT-4o demonstrated high sensitivity in identifying ED patients requiring hospital admission, particularly those needing intensive care, when provided with progressively enriched clinical input. However, its low sensitivity for discharge prediction resulted in frequent overtriage, limiting its utility for autonomous decision-making. This proof-of-concept study demonstrates GPT-4o's capacity to stratify disposition decisions in complex respiratory cases under varying levels of limited input data, but the findings should be interpreted in light of key limitations, including the selective high-acuity cohort and the absence of vital signs, and require prospective validation before clinical implementation.
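For readers who want to reproduce the reported metrics, a minimal sketch of the binary admission-vs-discharge evaluation follows; the disposition labels and example data are hypothetical, and this is not the authors' code.

```python
# Illustrative sketch (not the authors' code): computing the binary
# admission-vs-discharge metrics reported in the abstract from predicted
# and true ED dispositions. Labels and example data are hypothetical.

def binary_metrics(y_true, y_pred, positive="admit"):
    """Accuracy, sensitivity, specificity, PPV, NPV and F1 for one positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    sens = tp / (tp + fn) if tp + fn else 0.0      # recall for the positive class
    spec = tn / (tn + fp) if tn + fp else 0.0
    ppv  = tp / (tp + fp) if tp + fp else 0.0
    npv  = tn / (tn + fn) if tn + fn else 0.0
    f1   = 2 * ppv * sens / (ppv + sens) if ppv + sens else 0.0
    acc  = (tp + tn) / len(y_true)
    return {"accuracy": acc, "sensitivity": sens, "specificity": spec,
            "PPV": ppv, "NPV": npv, "F1": f1}

# Hypothetical example: ward and ICU admissions collapsed into "admit".
y_true = ["admit", "admit", "discharge", "admit", "discharge"]
y_pred = ["admit", "admit", "admit", "discharge", "discharge"]
print(binary_metrics(y_true, y_pred))
```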

Linearizing and forecasting: a reservoir computing route to digital twins of the brain

Di Antonio, G., Gili, T., Gabrielli, A., Mattia, M.

bioRxiv preprint · Oct 2, 2025
Exploring the dynamics of a complex system, such as the human brain, poses significant challenges due to inherent uncertainties and limited data. In this study, we enhance the capabilities of noisy linear recurrent neural networks (lRNNs) within the reservoir computing framework, demonstrating their effectiveness in creating autonomous in silico replicas (digital twins) of brain activity. Our findings reveal that the poles of the Laplace transform of high-dimensional inferred lRNNs are directly linked to the spectral properties of the observed systems and to the kernels of auto-regressive models. Applying this theoretical framework to resting-state fMRI, we successfully predict and decompose BOLD signals into spatiotemporal modes of a low-dimensional latent state space confined around a single equilibrium point. lRNNs also provide an interpretable proxy for clustering subjects and brain areas. This adaptable digital-twin framework not only enables virtual experiments but also offers the computational efficiency needed for real-time learning, highlighting its potential for personalized medicine and intervention strategies.
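A minimal sketch of the reservoir-computing idea (a noisy linear recurrent reservoir with a ridge-regression readout, run autonomously as a digital twin) appears below; the synthetic signal, reservoir size, and regularization are assumptions standing in for the study's fMRI data and inferred lRNNs.

```python
# Minimal sketch of a noisy linear reservoir with a ridge readout, run
# autonomously; synthetic data stand in for BOLD time series.
import numpy as np

rng = np.random.default_rng(0)
T, n_obs, n_res = 500, 5, 200

# Synthetic "observed" multivariate signal (placeholder for BOLD data).
t = np.arange(T)
x = np.stack([np.sin(0.05 * t + k) + 0.05 * rng.standard_normal(T)
              for k in range(n_obs)], axis=1)

# Random linear reservoir: r[t+1] = A r[t] + W_in x[t] + noise (spectral radius < 1).
A = rng.standard_normal((n_res, n_res)) / np.sqrt(n_res)
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))
W_in = rng.standard_normal((n_res, n_obs)) * 0.1

r = np.zeros((T, n_res))
for k in range(T - 1):
    r[k + 1] = A @ r[k] + W_in @ x[k] + 1e-3 * rng.standard_normal(n_res)

# Ridge-regression readout mapping the reservoir state to the next observation.
lam = 1e-2
R, Y = r[:-1], x[1:]
W_out = np.linalg.solve(R.T @ R + lam * np.eye(n_res), R.T @ Y).T

# Autonomous ("digital twin") rollout: feed predictions back as input.
r_auto, x_auto = r[-1].copy(), []
for _ in range(100):
    x_hat = W_out @ r_auto
    x_auto.append(x_hat)
    r_auto = A @ r_auto + W_in @ x_hat
print(np.array(x_auto).shape)  # (100, 5): forecast horizon x observed channels
```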

GFSR-Net: Guided Focus via Segment-Wise Relevance Network for Interpretable Deep Learning in Medical Imaging

Jhonatan Contreras, Thomas Bocklitz

arXiv preprint · Oct 2, 2025
Deep learning has achieved remarkable success in medical image analysis; however, its adoption in clinical practice is limited by a lack of interpretability. These models often make correct predictions without explaining their reasoning, and they may rely on image regions unrelated to the disease or on visual cues, such as annotations, that are not present in real-world conditions. This can reduce trust and increase the risk of misleading diagnoses. We introduce the Guided Focus via Segment-Wise Relevance Network (GFSR-Net), an approach designed to improve interpretability and reliability in medical imaging. GFSR-Net uses a small number of human annotations to approximate where a person would intuitively focus within an image, without requiring precise boundaries or exhaustive markings, making the process fast and practical. During training, the model learns to align its focus with these areas, progressively emphasizing features that carry diagnostic meaning. This guidance works across different types of natural and medical images, including chest X-rays, retinal scans, and dermatological images. Our experiments demonstrate that GFSR-Net achieves comparable or superior accuracy while producing saliency maps that better reflect human expectations, reducing reliance on irrelevant patterns and increasing confidence in automated diagnostic tools.
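The abstract does not specify the GFSR-Net architecture or loss, so the following is only a hedged sketch of the general guided-focus idea: a classification loss combined with a term that aligns a learned attention map with a coarse human annotation mask. The backbone, attention head, and loss weighting are assumptions.

```python
# Hedged sketch of guiding a model's focus with coarse human annotations;
# the attention head and loss weights here are assumptions, not GFSR-Net.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GuidedFocusNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.backbone = nn.Sequential(               # tiny CNN stand-in backbone
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.attn_head = nn.Conv2d(32, 1, 1)         # coarse spatial "focus" map
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):
        feats = self.backbone(x)
        attn = torch.sigmoid(self.attn_head(feats))  # B x 1 x H x W focus map
        pooled = (feats * attn).mean(dim=(2, 3))     # attention-weighted pooling
        return self.classifier(pooled), attn

def guided_loss(logits, attn, labels, focus_mask, alpha=0.5):
    """Cross-entropy plus alignment between the focus map and a rough human mask."""
    cls = F.cross_entropy(logits, labels)
    mask = F.interpolate(focus_mask, size=attn.shape[-2:],
                         mode="bilinear", align_corners=False)
    align = F.binary_cross_entropy(attn, mask)
    return cls + alpha * align

model = GuidedFocusNet()
x = torch.randn(4, 3, 64, 64)                        # hypothetical image batch
y = torch.randint(0, 2, (4,))
mask = torch.rand(4, 1, 64, 64)                      # coarse human focus annotations
logits, attn = model(x)
print(guided_loss(logits, attn, y, mask).item())
```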

Multimodal Foundation Models for Early Disease Detection

Md Talha Mohsin, Ismail Abdulrashid

arXiv preprint · Oct 2, 2025
Healthcare generates diverse streams of data, including electronic health records (EHR), medical imaging, genetics, and ongoing monitoring from wearable devices. Traditional diagnostic models frequently analyze these sources in isolation, which constrains their capacity to identify the cross-modal correlations essential for early disease diagnosis. We present a multimodal foundation model that consolidates diverse patient data through an attention-based transformer framework. Dedicated encoders first project each modality into a shared latent space; the resulting representations are then fused using multi-head attention and residual normalization. The architecture is designed for multi-task pretraining, allowing it to adapt to new diseases and datasets with little additional effort. We outline an experimental strategy using benchmark datasets in oncology, cardiology, and neurology to evaluate early detection tasks. Beyond technical performance, the framework includes data governance and model management tools to improve transparency, reliability, and clinical interpretability. The proposed approach works toward a single foundation model for precision diagnostics that could improve prediction accuracy and support clinical decision-making.
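A hedged sketch of the described fusion pattern (per-modality encoders into a shared latent space, multi-head attention, residual normalization) is given below; the modality names, dimensions, and classification head are illustrative assumptions, not the authors' configuration.

```python
# Hedged sketch of attention-based multimodal fusion; dimensions and
# modalities are illustrative assumptions.
import torch
import torch.nn as nn

class MultimodalFusion(nn.Module):
    def __init__(self, dims, d_model=128, n_heads=4):
        super().__init__()
        # One dedicated encoder per modality, projecting into the shared space.
        self.encoders = nn.ModuleDict({m: nn.Linear(d, d_model) for m, d in dims.items()})
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, 2)              # e.g. disease vs. no disease

    def forward(self, inputs):
        # Each modality becomes one token in the fused sequence.
        tokens = torch.stack([enc(inputs[m]) for m, enc in self.encoders.items()], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)
        fused = self.norm(tokens + fused)              # residual + layer normalization
        return self.head(fused.mean(dim=1))            # pool tokens and classify

model = MultimodalFusion({"ehr": 64, "imaging": 128, "genomics": 256})
batch = {"ehr": torch.randn(8, 64), "imaging": torch.randn(8, 128),
         "genomics": torch.randn(8, 256)}
print(model(batch).shape)  # torch.Size([8, 2])
```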

Deep Learning-Based CAD System for Enhanced Breast Lesion Classification and Grading Using RFTSDP Approach.

Ghehi EN, Fallah A, Rashidi S, Dastjerdi MM

PubMed paper · Oct 1, 2025
Accurate detection of breast lesion type is crucial for optimizing treatment; however, due to the limited precision of current diagnostic methods, biopsies are often required. To address this limitation, we proposed radio frequency time series dynamic processing (RFTSDP) in 2020, which analyzes the dynamic response of tissue and the impact of scatterer displacement on RF echoes during controlled stimulations to enhance diagnostic information. We developed a vibration-generating device and collected ultrafast ultrasound data from 11 ex vivo breast tissue samples under different stimulations. Deep learning (DL) was used for automated feature extraction and lesion classification into 2, 3, and 5 categories. The performance of the convolutional neural network (CNN)-based RFTSDP method was compared with traditional machine learning techniques, which involved spectral and nonlinear feature extraction from RF time series, followed by a support vector machine (SVM). With 65 Hz vibration, the DL-based RFTSDP method achieved 99.53 ± 0.47% accuracy in classifying and grading breast lesions. CNN consistently outperformed SVM, particularly under vibratory stimulation. In 5-class classification, CNN reached 98.01% versus 95.64% for SVM, with the difference being statistically significant (P < .05). Furthermore, the CNN-based RFTSDP method showed a 28.67% improvement in classification accuracy compared to the non-stimulation condition and the analysis of focused raw data. We developed a DL-based CAD system capable of classifying and grading breast lesions. This study demonstrates that the proposed system not only enhances classification but also ensures increased stability and robustness compared to traditional methods.
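As a rough illustration of the traditional baseline described here (spectral features from RF time series followed by an SVM), the sketch below uses synthetic signals in place of the ex vivo ultrasound data; all sizes and frequencies are assumptions.

```python
# Illustrative sketch (assumptions throughout) of the SVM baseline: spectral
# features from RF time series followed by an SVM classifier. The synthetic
# signals stand in for the ex vivo data.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_samples, n_timesteps = 200, 512

# Synthetic RF time series: two "lesion classes" differing in dominant frequency.
labels = rng.integers(0, 2, n_samples)
t = np.arange(n_timesteps)
signals = np.array([np.sin(2 * np.pi * (0.02 + 0.01 * y) * t)
                    + 0.3 * rng.standard_normal(n_timesteps) for y in labels])

# Spectral features: magnitude of the first 64 FFT bins per series.
features = np.abs(np.fft.rfft(signals, axis=1))[:, :64]

svm = SVC(kernel="rbf", C=1.0)
print("5-fold CV accuracy:", cross_val_score(svm, features, labels, cv=5).mean())
```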

An interpretable hybrid deep learning framework for gastric cancer diagnosis using histopathological imaging.

Ren T, Govindarajan V, Bourouis S, Wang X, Ke S

PubMed paper · Oct 1, 2025
The increasing incidence of gastric cancer and the complexity of histopathological image interpretation present significant challenges for accurate and timely diagnosis. Manual assessments are often subjective and time-intensive, leading to a growing demand for reliable, automated diagnostic tools in digital pathology. This study proposes a hybrid deep learning approach combining convolutional neural networks (CNNs) and Transformer-based architectures to classify gastric histopathological images with high precision. The model is designed to enhance feature representation and spatial contextual understanding, particularly across diverse tissue subtypes and staining variations. Three publicly available datasets (GasHisSDB, TCGA-STAD, and NCT-CRC-HE-100K) were used to train and evaluate the model. Image patches were preprocessed through stain normalization, augmented using standard techniques, and fed into the hybrid model: the CNN backbone extracts local spatial features, while the Transformer encoder captures global context. Performance was assessed using fivefold cross-validation and evaluated through accuracy, F1-score, AUC, and Grad-CAM-based interpretability. The proposed model achieved 99.2% accuracy on the GasHisSDB dataset, with a macro F1-score of 0.991 and an AUC of 0.996. External validation on TCGA-STAD and NCT-CRC-HE-100K further confirmed the model's robustness, and Grad-CAM visualizations highlighted biologically relevant regions, demonstrating interpretability and alignment with expert annotations. This hybrid deep learning framework offers a reliable, interpretable, and generalizable tool for gastric cancer diagnosis; its performance and explainability highlight its clinical potential for deployment in digital pathology workflows.
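A hedged sketch of a CNN-plus-Transformer patch classifier in the spirit of this framework follows; layer sizes, patch size, and class count are assumptions rather than the authors' configuration.

```python
# Hedged sketch of a hybrid CNN + Transformer patch classifier; sizes and
# class count are assumptions, not the published architecture.
import torch
import torch.nn as nn

class HybridCNNTransformer(nn.Module):
    def __init__(self, n_classes=2, d_model=64):
        super().__init__()
        self.cnn = nn.Sequential(                       # local spatial features
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, d_model, 3, stride=2, padding=1), nn.ReLU())
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)  # global context
        self.fc = nn.Linear(d_model, n_classes)

    def forward(self, x):
        feats = self.cnn(x)                             # B x C x H x W
        tokens = feats.flatten(2).transpose(1, 2)       # B x (H*W) x C token sequence
        ctx = self.transformer(tokens)
        return self.fc(ctx.mean(dim=1))                 # mean-pool tokens, classify

model = HybridCNNTransformer()
patches = torch.randn(4, 3, 64, 64)                     # hypothetical stain-normalized patches
print(model(patches).shape)                              # torch.Size([4, 2])
```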

Automated machine learning for prostate cancer detection and Gleason score prediction using T2WI: a diagnostic multi-center study.

Jin L, Ma Z, Gao F, Li M, Li H, Geng D

PubMed paper · Oct 1, 2025
Prostate cancer (PCa) is one of the most common malignancies in men, and accurate assessment of tumor aggressiveness is crucial for treatment planning. The Gleason score (GS) remains the gold standard for risk stratification, yet it relies on invasive biopsy, which carries inherent risks and sampling errors. The aim of this study was to detect PCa and non-invasively predict the GS for early detection and stratification of clinically significant cases. We used single-modality T2-weighted imaging (T2WI) with an automated machine-learning (ML) approach, MLJAR. The internal dataset comprised PCa patients who underwent magnetic resonance imaging (MRI) at our hospital from September 2015 to June 2022, prior to prostate biopsy, surgery, radiotherapy, or endocrine therapy, and whose examinations yielded pathological findings. An external dataset from another medical center and a public challenge dataset were used for external validation. The Kolmogorov-Smirnov curve was used to evaluate the risk-differentiation ability of the PCa detection model, and the area under the receiver operating characteristic curve (AUC) was calculated with confidence intervals to compare model performance. The internal MRI dataset included 198 non-PCa and 291 PCa patients with histopathological results obtained through biopsy or surgery; the external and public challenge datasets included 45 and 68 PCa patients, respectively. The AUC for PCa detection in the internal-testing cohort (n = 147, PCa = 78) was 0.99. For GS prediction, AUCs were GS = 3 + 3 (0.97), GS = 3 + 4 (0.97), GS = 3 + 5 (1.0), GS = 4 + 3 (0.87), GS = 4 + 4 (0.91), GS = 4 + 5 (0.95), GS = 5 + 4 (1.0), and GS = 5 + 5 (0.99) in the internal-testing cohort (PCa = 88); GS = 3 + 3 (0.95), GS = 3 + 4 (0.76), GS = 3 + 5 (0.77), GS = 4 + 3 (0.88), GS = 4 + 4 (0.82), GS = 4 + 5 (0.87), GS = 5 + 4 (0.95), and GS = 5 + 5 (0.85) in the external-testing cohort (PCa = 45); and GS = 3 + 4 (0.89), GS = 4 + 3 (0.75), GS = 4 + 4 (0.65), and GS = 4 + 5 (0.91) in the public challenge cohort (PCa = 68). This multi-center study shows that an auto-ML model using only T2WI can accurately detect PCa and predict Gleason scores non-invasively, offering potential to reduce reliance on biopsy and improve early risk stratification. These results warrant further validation and exploration for integration into clinical workflows.
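The sketch below illustrates the kind of AutoML workflow the abstract describes, using the open-source mljar-supervised package; the feature matrix is a random placeholder for T2WI-derived inputs, and the AutoML settings are assumptions, not the study's actual configuration.

```python
# Hedged sketch of an mljar-supervised AutoML classification run; the data
# are placeholders for T2WI-derived features, and the settings are assumed.
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from supervised.automl import AutoML  # pip install mljar-supervised

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 40))                                # placeholder image-derived features
y = (X[:, 0] + 0.5 * rng.standard_normal(300) > 0).astype(int)    # placeholder PCa labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

automl = AutoML(mode="Compete", eval_metric="auc", total_time_limit=600)
automl.fit(X_tr, y_tr)

proba = automl.predict_proba(X_te)[:, 1]                          # probability of the positive class
print("test AUC:", roc_auc_score(y_te, proba))
```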

Designing a web-based application for computer-aided diagnosis of intraosseous jaw lesions and assessment of its diagnostic accuracy.

Mohammadnezhad M, Dalili Kajan Z, Hami Razavi A

PubMed paper · Oct 1, 2025
This study aimed to design a web-based application for computer-aided diagnosis (CADx) of intraosseous jaw lesions and to assess its diagnostic accuracy. In this diagnostic test study, a web-based application was designed for CADx of 19 types of intraosseous jaw lesions. To assess its diagnostic accuracy, clinical and radiographic information for 95 cases with a confirmed histopathological diagnosis of intraosseous jaw lesions was retrieved from hospital archives and published literature and entered into the application by a senior dental student. The top-N accuracy, kappa value, and Brier score were calculated, and the sensitivity, specificity, positive (PPV) and negative (NPV) predictive values, and the area under the receiver operating characteristic (ROC) curve (AUC) were calculated separately for each lesion according to DeLong et al. In the assessment of top-N accuracy, the application gave a correct differential diagnosis in 93 cases (97.89%), and the correct diagnosis was at the top of the differential list in 78 cases (82.10%); the corresponding values were 85 (89.47%) and 67 (70.52%) for an oral radiologist. The kappa value was 0.53. The Brier score was 0.18 for the prevalence match and 0.15 for the pattern match. The results highlight the high diagnostic accuracy of the designed application, indicating that it may be reliably used for CADx of intraosseous jaw lesions, provided it is given accurate data.
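A minimal sketch of the three headline metrics (top-N accuracy over a ranked differential, Cohen's kappa for the top-ranked diagnosis, and a Brier score comparing confidence with correctness) follows; the lesion names, rankings, and probabilities are hypothetical, and this is not the application's code.

```python
# Minimal sketch (not the application's code) of top-N accuracy, Cohen's
# kappa, and a Brier score; all example diagnoses below are hypothetical.
import numpy as np
from sklearn.metrics import cohen_kappa_score

truth  = ["odontogenic keratocyst", "ameloblastoma", "radicular cyst", "odontogenic keratocyst"]
ranked = [["odontogenic keratocyst", "dentigerous cyst"],          # ranked differential lists
          ["radicular cyst", "ameloblastoma"],
          ["radicular cyst"],
          ["odontogenic keratocyst", "ameloblastoma"]]
p_top1 = np.array([0.70, 0.55, 0.80, 0.65])                        # confidence in the top diagnosis

top1 = [r[0] for r in ranked]
top_n_acc = np.mean([t in r for t, r in zip(truth, ranked)])       # correct anywhere in the list
top_1_acc = np.mean([t == p for t, p in zip(truth, top1)])         # correct at rank 1
kappa = cohen_kappa_score(truth, top1)                             # agreement beyond chance
correct = np.array([t == p for t, p in zip(truth, top1)], dtype=float)
brier = np.mean((p_top1 - correct) ** 2)                           # calibration of the confidence

print(f"top-N {top_n_acc:.2f}, top-1 {top_1_acc:.2f}, kappa {kappa:.2f}, Brier {brier:.3f}")
```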

Machine learning combined with CT-based radiomics predicts the prognosis of oesophageal squamous cell carcinoma.

Liu M, Lu R, Wang B, Fan J, Wang Y, Zhu J, Luo J

PubMed paper · Oct 1, 2025
This retrospective study aimed to develop a machine learning model integrating preoperative CT radiomics and clinicopathological data to predict 3-year recurrence and recurrence patterns in postoperative oesophageal squamous cell carcinoma. Tumour regions were segmented using 3D-Slicer, and radiomic features were extracted via Python; LASSO regression selected prognostic features for model integration. Clinicopathological data included tumour length, lymph node positivity, differentiation grade, and neurovascular infiltration. Machine learning models were then built by combining the selected imaging features with the clinicopathological data, and model performance was validated. A nomogram was constructed for survival prediction, and risk stratification was carried out using the predictions of the machine learning model and the nomogram. Survival analysis was performed for stage-based patient subgroups across risk strata to identify cohorts that benefit from adjuvant therapy. Patients were randomly split 7:3 into training (n = 368) and validation (n = 158) cohorts. LASSO regression selected 6 features for recurrence prediction and 9 for recurrence-pattern prediction. Among 526 patients (mean age 63; 427 males), the model achieved high accuracy in predicting recurrence (training cohort AUC: 0.826 [logistic regression]/0.820 [SVM]; validation cohort: 0.830/0.825) and recurrence patterns (training: 0.801/0.799; validation: 0.806/0.798). Risk stratification based on the machine learning model and nomogram predictions revealed that adjuvant therapy significantly improved disease-free survival in stage II-III patients with predicted recurrence and low predicted survival (HR 0.372, 95% CI: 0.206-0.669; p < 0.001). Machine learning models thus exhibit excellent performance in predicting recurrence after surgery for oesophageal squamous cell carcinoma. Radiomic features from contrast-enhanced CT can predict the prognosis of patients with oesophageal squamous cell carcinoma, helping clinicians stratify risk and identify patients who could benefit from adjuvant therapy, thereby aiding medical decision-making. Prognostic models for oesophageal squamous cell carcinoma are lacking in current research; the model developed here achieves high accuracy by combining radiomic features with clinicopathological data, supporting risk stratification and clinical decision-making through its predictions.
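A hedged sketch of the feature-selection and modelling steps described above (LASSO screening of radiomic features, then logistic regression and SVM classifiers on the retained features plus clinical variables) is shown below on synthetic placeholder data.

```python
# Hedged sketch: LASSO (L1) screening of radiomic features, then logistic
# regression / SVM on the retained features plus clinical variables.
# Data are synthetic placeholders, not the study's cohort.
import numpy as np
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X_radiomics = rng.standard_normal((526, 100))          # placeholder radiomic features
X_clinical = rng.standard_normal((526, 4))             # e.g. length, N+, grade, invasion
y = rng.integers(0, 2, 526)                            # placeholder 3-year recurrence labels

# LASSO-style screening: keep radiomic features with non-zero L1 coefficients.
lasso = make_pipeline(StandardScaler(),
                      LogisticRegressionCV(penalty="l1", solver="liblinear", Cs=10, cv=5))
lasso.fit(X_radiomics, y)
keep = np.flatnonzero(lasso[-1].coef_.ravel() != 0)
X = np.hstack([X_radiomics[:, keep], X_clinical])      # selected radiomics + clinical data

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
for name, clf in [("logistic", LogisticRegression(max_iter=1000)),
                  ("SVM", SVC(probability=True))]:
    model = make_pipeline(StandardScaler(), clf).fit(X_tr, y_tr)
    print(name, "AUC:", round(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]), 3))
```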

Joint prediction of glioma molecular marker status based on GDI-PMNet.

Zhu H, Liang F, Zhao T, Cao Y, Chen Y, Yan H, Xiao X

PubMed paper · Oct 1, 2025
Determining the status of glioma molecular markers is a clinically important problem. Current medical-imaging-based approaches suffer from various limitations, such as incomplete fine-grained feature extraction from glioma imaging data and low prediction accuracy for molecular marker status. To address these issues, a deep learning method is presented for the simultaneous joint prediction of the multi-label statuses of glioma molecular markers. Firstly, a Gradient-aware Spatially Partitioned Enhancement algorithm (GASPE) is proposed to optimize glioma MR image preprocessing and enhance the expression of local detail; secondly, a Dual Attention module with Depthwise Convolution (DADC) is constructed to improve fine-grained feature extraction by combining channel attention and spatial attention; thirdly, a hybrid model, PMNet, is proposed, which combines a Pyramid-based Multi-Scale Feature Extraction module (PMSFEM) and a Mamba-based Projection Convolution module (MPCM) to achieve effective fusion of local and global information; finally, an Iterative Truth Calibration algorithm (ITC) is used to calibrate the joint-state truth vector output by the model and optimize the accuracy of the predictions. Based on GASPE, DADC, ITC, and PMNet, the proposed Gradient-Aware Dual Attention Iteration Truth Calibration-PMNet (GDI-PMNet) simultaneously predicts the status of the glioma molecular markers IDH1, Ki67, MGMT, and P53 with accuracies of 98.31%, 99.24%, 97.96%, and 98.54%, respectively, achieving non-invasive preoperative prediction that can assist clinicians in diagnosis and treatment. GDI-PMNet demonstrates high accuracy in predicting glioma molecular markers, addressing the limitations of current approaches by enhancing fine-grained feature extraction and prediction accuracy. This non-invasive preoperative prediction tool holds significant potential to assist clinicians in glioma diagnosis and treatment, ultimately improving patient outcomes.
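A hedged sketch of the core task (joint multi-label prediction of IDH1, Ki67, MGMT, and P53 status from image features) follows; the backbone is a generic CNN stand-in, not PMNet, and all shapes are assumptions.

```python
# Hedged sketch of joint multi-label prediction of four marker statuses from
# MR image features; the backbone is a generic CNN stand-in, not PMNet.
import torch
import torch.nn as nn

MARKERS = ["IDH1", "Ki67", "MGMT", "P53"]

class MultiMarkerNet(nn.Module):
    def __init__(self, n_markers=len(MARKERS)):
        super().__init__()
        self.features = nn.Sequential(                 # generic feature extractor
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.heads = nn.Linear(32, n_markers)          # one logit per molecular marker

    def forward(self, x):
        return self.heads(self.features(x))            # B x 4 joint logits

model = MultiMarkerNet()
mri = torch.randn(2, 1, 128, 128)                      # hypothetical preprocessed MR slices
targets = torch.tensor([[1., 0., 1., 0.], [0., 1., 1., 1.]])   # marker statuses per patient
loss = nn.BCEWithLogitsLoss()(model(mri), targets)     # joint multi-label objective
probs = torch.sigmoid(model(mri))                      # per-marker predicted probability
print(dict(zip(MARKERS, probs[0].tolist())))
```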