Latest Papers on Radiology AI. Sources: medrxiv, Tags: In Silico.

Machine Learning-Based Meningioma Location Classification Using Vision Transformers and Transfer Learning

Bande, J. K., Johnson, E. T., Banderudrappagari, R.

•preprint•Sep 24 2025

PurposeIn this study, we aimed to use advanced machine learning (ML) techniques, specifically transfer learning and Vision Transformers (ViTs), to accurately classify meningioma in brain MRI scans. ViTs process images similarly to how humans visually perceive details and are useful for analyzing complex medical images. Transfer learning is a technique that uses models previously trained on large datasets and adapts them to specific use cases. Using transfer learning, this study aimed to enhance the diagnostic accuracy of meningioma location and demonstrate the capabilities the new technology. ApproachWe used a Google ViT model pre-trained on ImageNet-21k (a dataset with 14 million images and 21,843 classes) and fine-tuned on ImageNet 2012 (a dataset with 1 million images and 1,000 classes). Using this model, which was pre-trained and fine-tuned on large datasets of images, allowed us to leverage the predictive capabilities of the model trained on those large datasets without needing to train an entirely new model specific to only meningioma MRI scans. Transfer learning was used to adapt the pre-trained ViT to our specific use case, being meningioma location classification, using a dataset of 1,094 images of T1, contrast-enhanced, and T2-weighted MRI scans of meningiomas sorted according to location in the brain, with 11 different classes. ResultsThe final model trained and adapted on the meningioma MRI dataset achieved an average validation accuracy of 98.17% and a test accuracy of 89.95%. ConclusionsThis study demonstrates the potential of ViTs in meningioma location classification, leveraging their ability to analyze spatial relationships in medical images. While transfer learning enabled effective adaptation with limited data, class imbalance affected classification performance. Future work should focus on expanding datasets and incorporating ensemble learning to improve diagnostic reliability.

MRI Classification Neurological Methodology In Silico

Feature-Based Machine Learning for Brain Metastasis Detection Using Clinical MRI

Rahi, A., Shafiabadi, M. H.

•preprint•Sep 22 2025

Brain metastases represent one of the most common intracranial malignancies, yet early and accurate detection remains challenging, particularly in clinical datasets with limited availability of healthy controls. In this study, we developed a feature-based machine learning framework to classify patients with and without brain metastases using multi-modal clinical MRI scans. A dataset of 50 subjects from the UCSF Brain Metastases collection was analyzed, including pre- and post-contrast T1-weighted images and corresponding segmentation masks. We designed advanced feature extraction strategies capturing intensity, enhancement patterns, texture gradients, and histogram-based metrics, resulting in 44 quantitative descriptors per subject. To address the severe class imbalance (46 metastasis vs. 4 non-metastasis cases), we applied minority oversampling and noise-based augmentation, combined with stratified cross-validation. Among multiple classifiers, Random Forest consistently achieved the highest performance with an average accuracy of 96.7% and an area under the ROC curve (AUC) of 0.99 across five folds. The proposed approach highlights the potential of handcrafted radiomic-like features coupled with machine learning to improve metastasis detection in heterogeneous clinical MRI cohorts. These findings underscore the importance of methodological strategies for handling imbalanced data and support the integration of feature-based models as complementary tools for brain metastasis screening and research.

MRI Classification Neurological Methodology In Silico Academic Lab

Multimodal AI-driven Biomarker for Early Detection of Cancer Cachexia

Ahmed, S., Parker, N., Park, M., Davis, E. W., Jeong, D., Permuth, J. B., Schabath, M. B., Yilmaz, Y., Rasool, G.

•preprint•Sep 19 2025

Cancer cachexia, a multifactorial metabolic syndrome characterized by severe muscle wasting and weight loss, contributes to poor outcomes across various cancer types but lacks a standardized, generalizable biomarker for early detection. We present a multimodal AI-based biomarker trained on real-world clinical, radiologic, laboratory, and unstructured clinical note data, leveraging foundation models and large language models (LLMs) to identify cachexia at the time of cancer diagnosis. Prediction accuracy improved with each added modality: 77% using clinical variables alone, 81% with added laboratory data, and 85% with structured symptom features extracted from clinical notes. Incorporating embeddings from clinical text and CT images further improved accuracy to 92%. The framework also demonstrated prognostic utility, improving survival prediction as data modalities were integrated. Designed for real-world clinical deployment, the framework accommodates missing modalities without requiring imputation or case exclusion, supporting scalability across diverse oncology settings. Unlike prior models trained on curated datasets, our approach utilizes standard-of-care clinical data, facilitating integration into oncology workflows. In contrast to fixed-threshold composite indices such as the cachexia index (CXI), the model generates patient-specific predictions, enabling adaptable, cancer-agnostic performance. To enhance clinical reliability and safety, the framework incorporates uncertainty estimation to flag low-confidence cases for expert review. This work advances a clinically applicable, scalable, and trustworthy AI-driven decision support tool for early cachexia detection and personalized oncology care.

CT Classification Whole Body Retrospective Clinical In Silico Academic Lab GenAI

Deep learning-based prediction of cardiopulmonary disease in retinal images of premature infants

Singh, P., Kumar, S., Tyagi, R., Young, B. K., Jordan, B. K., Scottoline, B., Evers, P. D., Ostmo, S., Coyner, A. S., Lin, W.-C., Gupta, A., Erdogmus, D., Chan, R. V. P., McCourt, E. A., Barry, J. S., McEvoy, C. T., Chiang, M. F., Campbell, J. P., Kalpathy-Cramer, J.

•preprint•Sep 19 2025

ImportanceBronchopulmonary dysplasia (BPD) and pulmonary hypertension (PH) are leading causes of morbidity and mortality in premature infants. ObjectiveTo determine whether images obtained as part of retinopathy of prematurity (ROP) screening might contain features associated with BPD and PH in infants, and whether a multi-modal model integrating imaging features with demographic risk factors might outperform a model based on demographic risk alone. DesignA deep learning model was used to study retinal images collected from patients enrolled in the multi-institutional Imaging and Informatics in Retinopathy of Prematurity (i-ROP) study. SettingSeven neonatal intensive care units. Participants493 infants at risk for ROP undergoing routine ROP screening examinations from 2012 to 2020. Images were limited to <=34 weeks post-menstrual age (PMA) so as to precede the clinical diagnosis of BPD or PH. ExposureBPD was diagnosed by the presence of an oxygen requirement at 36 weeks PMA, and PH was diagnosed by echocardiogram at 34 weeks. A support vector machine model was trained to predict BPD, or PH, diagnosis using: A) image features alone (extracted using Resnet18), B) demographics alone, C) image features concatenated with demographics. To reduce the possibility of confounding with ROP, secondary models were trained using only images without clinical signs of ROP. Main Outcome MeasureFor both BPD and PH, we report performance on a held-out testset (99 patients from the BPD cohort and 37 patients from the PH cohort), assessed by the area under receiver operating characteristic curve. ResultsFor BPD, the diagnostic accuracy of a multimodal model was 0.82 (95% CI: 0.72-0.90), compared to demographics 0.72 (0.60-0.82; P=0.07) or imaging 0.72 (0.61-0.82; P=0.002) alone. For PH, it was 0.91 (0.71-1.0) combined compared to 0.68 (0.43-0.9; P=0.04) for demographics and 0.91 (0.78-1.0; P=0.4) for imaging alone. These associations remained even when models were trained on the subset of images without any clinical signs of ROP. Conclusions and RelevanceRetinal images obtained during ROP screening can be used to predict the diagnosis of BPD and PH in preterm infants, which may lead to earlier diagnosis and avoid the need for invasive diagnostic testing in the future. KEY POINTSO_ST_ABSQuestionC_ST_ABSCan an artificial intelligence (AI) algorithm diagnose bronchopulmonary dysplasia (BPD) or pulmonary hypertension (PH) in retinal images in preterm infants obtained during retinopathy of prematurity (ROP) screening examinations? FindingsAI was able to predict the presence of both BPD and PH in retinal images with higher accuracy than what could be predicted based on baseline demographic risk alone. MeaningDeploying AI models using images obtained during retinopathy of prematurity screening could lead to earlier diagnosis and avoid the need for more invasive diagnostic testing.

OCT Classification Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Bayesian machine learning enables discovery of risk factors for hepatosplenic multimorbidity related to schistosomiasis

Zhi, Y.-C., Anguajibi, V., Oryema, J. B., Nabatte, B., Opio, C. K., Kabatereine, N. B., Chami, G. F.

•preprint•Sep 19 2025

One in 25 deaths worldwide is related to liver disease, and often with multiple hepatosplenic conditions. Yet, little is understood of the risk factors for hepatosplenic multimorbidity, especially in the context of chronic infections. We present a novel Bayesian multitask learning framework to jointly model 45 hepatosplenic conditions assessed using point-of-care B-mode ultrasound for 3155 individuals aged 5-91 years within the SchistoTrack cohort across rural Uganda where chronic intestinal schistosomiasis is endemic. We identified distinct and shared biomedical, socioeconomic, and spatial risk factors for individual conditions and hepatosplenic multimorbidity, and introduced methods for measuring condition dependencies as risk factors. Notably, for gastro-oesophageal varices, we discovered key risk factors of older age, lower hemoglobin concentration, and severe schistosomal liver fibrosis. Our findings provide a compendium of risk factors to inform surveillance, triage, and follow-up, while our model enables improved prediction of hepatosplenic multimorbidity, and if validated on other systems, general multimorbidity.

Ultrasound Classification Abdominal Retrospective Clinical In Silico Academic Lab GenAI

Leveraging transfer learning from Acute Lymphoblastic Leukemia (ALL) pretraining to enhance Acute Myeloid Leukemia (AML) prediction

Duraiswamy, A., Harris-Birtill, D.

•preprint•Sep 19 2025

We overcome current limitations in Acute Myeloid Leukemia (AML) diagnosis by leveraging a transfer learning approach from Acute Lymphoblastic Leukemia (ALL) classification models, thus addressing the urgent need for more accurate and accessible AML diagnostic tools. AML has poorer prognosis than ALL, with a 5-year relative survival rate of only 17-19% compared to ALL survival rates of up to 75%, making early and accurate detection of AML paramount. Current diagnostic methods, rely heavily on manual microscopic examination, and are often subjective, time-consuming, and can suffer from inter-observer variability. While machine learning has shown promise in cancer classification, its application to AML detection, particularly leveraging the potential of transfer learning from related cancers like Acute Lymphoblastic Leukemia (ALL), remains underexplored. A comprehensive review of state-of-the-art advancements in acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML) classification using deep learning algorithms is undertaken and key approaches are evaluated. The insights gained from this review inform the development of two novel machine learning pipelines designed to benchmark effectiveness of proposed transfer learning approaches. Five pre-trained models are fine-tuned using ALL training data (a novel approach in this context) to optimize their potential for AML classification. The result was the development of a best-in-class (BIC) model that surpasses current state-of-the-art (SOTA) performance in AML classification, advancing the accuracy of machine learning (ML)-driven cancer diagnostics. Author summaryAcute Myeloid Leukemia (AML) is an aggressive cancer with a poor prognosis. Early and accurate diagnosis is critical, but current methods are often subjective and time-consuming. We wanted to create a more accurate diagnostic tool by applying a technique called transfer learning from a similar cancer, Acute Lymphoblastic Leukemia (ALL). Two machine learning pipelines were developed. The first trained five different models on a large AML dataset to establish a baseline. The second pipeline first trained these models on an ALL dataset to "learn" from it before fine-tuning them on the AML data. Our experiments showed that the models that underwent transfer learning process consistently performed better than the models trained on AML data alone. The MobileNetV2 model, in particular, was the best-in-class, outperforming all other models and surpassing the best-reported metrics for AML classification in current literature. Our research demonstrates that transfer learning can enable highly accurate AML diagnostic models. The best-in-class model could potentially be used as a AML diagnostic tool, helping clinicians make faster and more accurate diagnoses, improving patient outcomes.

Mixed Modality Classification Methodology In Silico Benchmark SOTA

Compartment-specific Fat Distribution Profiles have Distinct Relationships with Cardiovascular Ageing and Future Cardiovascular Events

Maldonado-Garcia, C., Salih, A., Neubauer, S., Petersen, S. E., Raisi-Estabragh, Z.

•preprint•Sep 18 2025

Obesity is a global public health priority and a major risk factor for cardiovascular disease (CVD). Emerging evidence indicates variation in pathologic consequences of obesity deposition across different body compartments. Biological heart age may be estimated from imaging measures of cardiac structure and function and captures risk beyond traditional measures. Using cardiac and abdominal magnetic resonance imaging (MRI) from 34,496 UK Biobank participants and linked health record data, we investigated how compartment-specific obesity phenotypes relate to cardiac ageing and incident CVD risk. Biological heart age was estimated using machine learning from 56 cardiac MRI phenotypes. K-means clustering of abdominal visceral (VAT), abdominal subcutaneous (ASAT), and pericardial (PAT) adiposity identified a high-risk cluster (characterised by greater adiposity across all three depots) associated with accelerated cardiac ageing - and a lower-risk cluster linked to decelerated ageing. These clusters provided more precise stratification of cardiovascular ageing trajectories than established body mass index categories. Mediation analysis showed that VAT and PAT explained 13.7% and 11.9% of obesity-associated CVD risk, respectively, whereas ASAT contributed minimally, with effects more pronounced in males. Thus, cardiovascular risk appears to be driven primarily by visceral and pericardial rather than subcutaneous fat. Our findings reveal a distinct risk profile of compartment-specific fat distributions and show the importance of pericardial and visceral fat as drivers of greater cardiovascular ageing. Advanced image-defined adiposity profiling may enhance CVD risk prediction beyond anthropometric measures and enhance mechanistic understanding.

MRI Classification Cardiac Retrospective Clinical In Silico Academic Lab

Artificial Intelligence in Cardiac Amyloidosis: A Systematic Review and Meta-Analysis of Diagnostic Accuracy Across Imaging and Non-Imaging Modalities

Kumbalath, R. M., Challa, D., Patel, M. K., Prajapati, S. D., Kumari, K., mehan, A., Chopra, R., Somegowda, Y. M., Khan, R., Ramteke, H. D., juneja, M.

•preprint•Sep 18 2025

IntroductionCardiac amyloidosis (CA) is an underdiagnosed infiltrative cardiomyopathy associated with poor outcomes if not detected early. Artificial intelligence (AI) has emerged as a promising adjunct to conventional diagnostics, leveraging imaging and non-imaging data to improve recognition of CA. However, evidence on the comparative diagnostic performance of AI across modalities remains fragmented. This meta-analysis aimed to synthesize and quantify the diagnostic performance of AI models in CA across multiple modalities. MethodsA systematic literature search was conducted in PubMed, Embase, Web of Science, and Cochrane Library from inception to August 2025. Only published observational studies applying AI to the diagnosis of CA were included. Data were extracted on patient demographics, AI algorithms, modalities, and diagnostic performance metrics. Risk of bias was assessed using QUADAS-2, and certainty of evidence was graded using GRADE. Random-effects meta-analysis (REML) was performed to pool accuracy, precision, recall, F1-score, and area under the curve (AUC). ResultsFrom 115 screened studies, 25 observational studies met the inclusion criteria, encompassing a total of 589,877 patients with a male predominance (372,458 males, 63.2%; 221,818 females, 36.6%). A wide range of AI algorithms were applied, most notably convolutional neural networks (CNNs), which accounted for 526,879 patients, followed by 3D-ResNet architectures (56,872 patients), hybrid segmentation-classification networks (3,747), and smaller studies employing random forests (636), Res-CRNN (89), and traditional machine learning approaches (769). Data modalities included ECG (341,989 patients), echocardiography (>70,000 patients across multiple cohorts), scintigraphy ([~]24,000 patients), cardiac MRI ([~]900 patients), CT (299 patients), and blood tests (261 patients). Pooled diagnostic performance across all modalities demonstrated an overall accuracy of 84.0% (95% CI: 74.6-93.5), precision of 85.8% (95% CI: 79.6-92.0), recall (sensitivity) of 89.6% (95% CI: 85.7-93.4), and an F1-score of 87.2% (95% CI: 81.8-92.6). Area under the curve (AUC) analysis revealed modality-specific variation, with scintigraphy achieving the highest pooled AUC (99.7%), followed by MRI (96.8%), echocardiography (94.3%), blood tests (95.0%), CT (98.0%), and ECG (88.5%). Subgroup analysis confirmed significant differences between modalities (p < 0.001), with MRI and scintigraphy showing consistent high performance and low-to-moderate heterogeneity, while echocardiography displayed moderate accuracy but marked variability, and ECG demonstrated the lowest and most heterogeneous results. ConclusionAI demonstrates strong potential for improving CA diagnosis, with MRI and scintigraphy providing the most reliable performance, echocardiography offering an accessible but heterogeneous option, and ECG models remaining least consistent. While promising, future prospective multicenter studies are needed to validate AI models, improve subtype discrimination, and optimize multimodal integration for real-world clinical use.

Mixed Modality Classification Cardiac Meta Analysis In Silico Benchmark SOTA

Accuracy of Foundation AI Models for Hepatic Macrovesicular Steatosis Quantification in Frozen Sections

Koga, S., Guda, A., Wang, Y., Sahni, A., Wu, J., Rosen, A., Nield, J., Nandish, N., Patel, K., Goldman, H., Rajapakse, C., Walle, S., Kristen, S., Tondon, R., Alipour, Z.

•preprint•Sep 17 2025

IntroductionAccurate intraoperative assessment of macrovesicular steatosis in donor liver biopsies is critical for transplantation decisions but is often limited by inter-observer variability and freezing artifacts that can obscure histological details. Artificial intelligence (AI) offers a potential solution for standardized and reproducible evaluation. To evaluate the diagnostic performance of two self-supervised learning (SSL)-based foundation models, Prov-GigaPath and UNI, for classifying macrovesicular steatosis in frozen liver biopsy sections, compared with assessments by surgical pathologists. MethodsWe retrospectively analyzed 131 frozen liver biopsy specimens from 68 donors collected between November 2022 and September 2024. Slides were digitized into whole-slide images, tiled into patches, and used to extract embeddings with Prov-GigaPath and UNI; slide-level classifiers were then trained and tested. Intraoperative diagnoses by on-call surgical pathologists were compared with ground truth determined from independent reviews of permanent sections by two liver pathologists. Accuracy was evaluated for both five-category classification and a clinically significant binary threshold (<30% vs. [≥]30%). ResultsFor binary classification, Prov-GigaPath achieved 96.4% accuracy, UNI 85.7%, and surgical pathologists 84.0% (P = .22). In five-category classification, accuracies were lower: Prov-GigaPath 57.1%, UNI 50.0%, and pathologists 58.7% (P = .70). Misclassification primarily occurred in intermediate categories (5%-<30% steatosis). ConclusionsSSL-based foundation models performed comparably to surgical pathologists in classifying macrovesicular steatosis, at the clinically relevant <30% vs. [≥]30% threshold. These findings support the potential role of AI in standardizing intraoperative evaluation of donor liver biopsies; however, the small sample size limits generalizability and requires validation in larger, balanced cohorts.

Mixed Modality Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Normative Modelling of Brain Volume for Diagnostic and Prognostic Stratification in Multiple Sclerosis

Korbmacher, M., Lie, I. A., Wesnes, K., Westman, E., Espeseth, T., Andreassen, O., Westlye, L., Wergeland, S., Harbo, H. F., Nygaard, G. O., Myhr, K.-M., Hogestol, E. A., Torkildsen, O.

•preprint•Sep 15 2025

BackgroundBrain atrophy is a hallmark of multiple sclerosis (MS). For clinical translatability and individual-level predictions, brain atrophy needs to be put into context of the broader population, using reference or normative models. MethodsReference models of MRI-derived brain volumes were established from a large healthy control (HC) multi-cohort dataset (N=63 115, 51% females). The reference models were applied to two independent MS cohorts (N=362, T1w-scans=953, follow-up time up to 12 years) to assess deviations from the reference, defined as Z-values. We assessed the overlap of deviation profiles and their stability over time using individual-level transitions towards or out of significant reference deviation states (|Z|>1{middle dot}96). A negative binomial model was used for case-control comparisons of the number of extreme deviations. Linear models were used to assess differences in Z-score deviations between MS and propensity-matched HCs, and associations with clinical scores at baseline and over time. The utilized normative BrainReference models, scripts and usage instructions are freely available. FindingsWe identified a temporally stable, brain morphometric phenotype of MS. The right and left thalami most consistently showed significantly lower-than-reference volumes in MS (25% and 26% overlap across the sample). The number of such extreme smaller-than-reference values was 2{middle dot}70 in MS compared to HC (4{middle dot}51 versus 1{middle dot}67). Additional deviations indicated stronger disability (Expanded Disability Status Scale: {beta}=0{middle dot}22, 95% CI 0{middle dot}12 to 0{middle dot}32), Paced Auditory Serial Addition Test score ({beta}=-0{middle dot}27, 95% CI -0{middle dot}52 to -0{middle dot}02), and Fatigue Severity Score ({beta}=0{middle dot}29, 95% CI 0{middle dot}05 to 0{middle dot}53) at baseline, and over time with EDSS ({beta}=0{middle dot}07, 95% CI 0{middle dot}02 to 0{middle dot}13). We additionally provide detailed maps of reference-deviations and their associations with clinical assessments. InterpretationWe present a heterogenous brain phenotype of MS which is associated with clinical manifestations, and particularly implicating the thalamus. The findings offer potential to aid diagnosis and prognosis of MS. FundingNorwegian MS-union, Research Council of Norway (#223273; #324252); the South-Eastern Norway Regional Health Authority (#2022080); and the European Unions Horizon2020 Research and Innovation Programme (#847776, #802998). Research in contextO_ST_ABSEvidence before this studyC_ST_ABSReference values and normative models have yet to be widely applied to neuroimaging assessments of neurological disorders such as multiple sclerosis (MS). We conducted a literature search in PubMed and Embase (Jan 1, 2000-September 12, 2025) using the terms "MRI" AND "multiple sclerosis", with and without the keywords "normative model*" and "atrophy", without language restrictions. While normative models have been applied in psychiatric and developmental disorders, few studies have addressed their use in neurological conditions. Existing MS research has largely focused on global atrophy and has not provided regional reference charts or established links to clinical and cognitive outcomes. Added value of this studyWe provide regionally detailed brain morphometry maps derived from a heterogeneous MS cohort spanning wide ranges of age, sex, clinical phenotype, disease duration, disability, and scanner characteristics. By leveraging normative modelling, our approach enables individualised brain phenotyping of MS in relation to a population based normative sample. The analyses reveal clinically meaningful and spatially consistent patterns of smaller brain volumes, particularly in the thalamus and frontal cortical regions, which are linked to disability, cognitive impairment, and fatigue. Robustness across scanners, centres, and longitudinal follow-up supports the stability and generalisability of these findings to real-world MS populations. Implications of all the available evidenceNormative modelling offers an individualised, sensitive, and interpretable approach to quantifying brain structure in MS by providing individual-specific reference values, supporting earlier detection of neurodegeneration and improved patient stratification. A consistent pattern of thalamic and fronto-parietal deviations defines a distinct morphometric profile of MS, with potential utility for early and personalised diagnosis and disease monitoring in clinical practice and clinical trials.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab Open Code Open Dataset

Filter Papers

Tags

Machine Learning-Based Meningioma Location Classification Using Vision Transformers and Transfer Learning

Feature-Based Machine Learning for Brain Metastasis Detection Using Clinical MRI

Multimodal AI-driven Biomarker for Early Detection of Cancer Cachexia

Deep learning-based prediction of cardiopulmonary disease in retinal images of premature infants

Bayesian machine learning enables discovery of risk factors for hepatosplenic multimorbidity related to schistosomiasis

Leveraging transfer learning from Acute Lymphoblastic Leukemia (ALL) pretraining to enhance Acute Myeloid Leukemia (AML) prediction

Compartment-specific Fat Distribution Profiles have Distinct Relationships with Cardiovascular Ageing and Future Cardiovascular Events

Artificial Intelligence in Cardiac Amyloidosis: A Systematic Review and Meta-Analysis of Diagnostic Accuracy Across Imaging and Non-Imaging Modalities

Accuracy of Foundation AI Models for Hepatic Macrovesicular Steatosis Quantification in Frozen Sections

Normative Modelling of Brain Volume for Diagnostic and Prognostic Stratification in Multiple Sclerosis

Ready to Sharpen Your Edge?