Latest Papers on Radiology AI. Sources: medrxiv, Tags: Mixed Modality.

Leveraging transfer learning from Acute Lymphoblastic Leukemia (ALL) pretraining to enhance Acute Myeloid Leukemia (AML) prediction

Duraiswamy, A., Harris-Birtill, D.

•preprint•Sep 19 2025

We overcome current limitations in Acute Myeloid Leukemia (AML) diagnosis by leveraging a transfer learning approach from Acute Lymphoblastic Leukemia (ALL) classification models, thus addressing the urgent need for more accurate and accessible AML diagnostic tools. AML has a poorer prognosis than ALL, with a 5-year relative survival rate of only 17-19% compared to ALL survival rates of up to 75%, making early and accurate detection of AML paramount. Current diagnostic methods rely heavily on manual microscopic examination and are often subjective, time-consuming, and prone to inter-observer variability. While machine learning has shown promise in cancer classification, its application to AML detection, particularly leveraging the potential of transfer learning from related cancers like ALL, remains underexplored. A comprehensive review of state-of-the-art advancements in ALL and AML classification using deep learning algorithms is undertaken and key approaches are evaluated. The insights gained from this review inform the development of two novel machine learning pipelines designed to benchmark the effectiveness of the proposed transfer learning approaches. Five pre-trained models are fine-tuned using ALL training data (a novel approach in this context) to optimize their potential for AML classification. The result was the development of a best-in-class (BIC) model that surpasses current state-of-the-art (SOTA) performance in AML classification, advancing the accuracy of machine learning (ML)-driven cancer diagnostics.
Author summary: Acute Myeloid Leukemia (AML) is an aggressive cancer with a poor prognosis. Early and accurate diagnosis is critical, but current methods are often subjective and time-consuming. We wanted to create a more accurate diagnostic tool by applying a technique called transfer learning from a similar cancer, Acute Lymphoblastic Leukemia (ALL). Two machine learning pipelines were developed. The first trained five different models on a large AML dataset to establish a baseline. The second pipeline first trained these models on an ALL dataset to "learn" from it before fine-tuning them on the AML data. Our experiments showed that the models that underwent the transfer learning process consistently performed better than the models trained on AML data alone. The MobileNetV2 model, in particular, was the best-in-class, outperforming all other models and surpassing the best-reported metrics for AML classification in the current literature. Our research demonstrates that transfer learning can enable highly accurate AML diagnostic models. The best-in-class model could potentially be used as an AML diagnostic tool, helping clinicians make faster and more accurate diagnoses and improving patient outcomes.

Mixed Modality Classification Methodology In Silico Benchmark SOTA

Artificial Intelligence in Cardiac Amyloidosis: A Systematic Review and Meta-Analysis of Diagnostic Accuracy Across Imaging and Non-Imaging Modalities

Kumbalath, R. M., Challa, D., Patel, M. K., Prajapati, S. D., Kumari, K., Mehan, A., Chopra, R., Somegowda, Y. M., Khan, R., Ramteke, H. D., Juneja, M.

•preprint•Sep 18 2025

Introduction: Cardiac amyloidosis (CA) is an underdiagnosed infiltrative cardiomyopathy associated with poor outcomes if not detected early. Artificial intelligence (AI) has emerged as a promising adjunct to conventional diagnostics, leveraging imaging and non-imaging data to improve recognition of CA. However, evidence on the comparative diagnostic performance of AI across modalities remains fragmented. This meta-analysis aimed to synthesize and quantify the diagnostic performance of AI models in CA across multiple modalities. Methods: A systematic literature search was conducted in PubMed, Embase, Web of Science, and Cochrane Library from inception to August 2025. Only published observational studies applying AI to the diagnosis of CA were included. Data were extracted on patient demographics, AI algorithms, modalities, and diagnostic performance metrics. Risk of bias was assessed using QUADAS-2, and certainty of evidence was graded using GRADE. Random-effects meta-analysis (REML) was performed to pool accuracy, precision, recall, F1-score, and area under the curve (AUC). Results: From 115 screened studies, 25 observational studies met the inclusion criteria, encompassing a total of 589,877 patients with a male predominance (372,458 males, 63.2%; 221,818 females, 36.6%). A wide range of AI algorithms were applied, most notably convolutional neural networks (CNNs), which accounted for 526,879 patients, followed by 3D-ResNet architectures (56,872 patients), hybrid segmentation-classification networks (3,747), and smaller studies employing random forests (636), Res-CRNN (89), and traditional machine learning approaches (769). Data modalities included ECG (341,989 patients), echocardiography (>70,000 patients across multiple cohorts), scintigraphy (~24,000 patients), cardiac MRI (~900 patients), CT (299 patients), and blood tests (261 patients).
Pooled diagnostic performance across all modalities demonstrated an overall accuracy of 84.0% (95% CI: 74.6-93.5), precision of 85.8% (95% CI: 79.6-92.0), recall (sensitivity) of 89.6% (95% CI: 85.7-93.4), and an F1-score of 87.2% (95% CI: 81.8-92.6). Area under the curve (AUC) analysis revealed modality-specific variation, with scintigraphy achieving the highest pooled AUC (99.7%); the pooled AUCs for the other modalities were CT (98.0%), MRI (96.8%), blood tests (95.0%), echocardiography (94.3%), and ECG (88.5%). Subgroup analysis confirmed significant differences between modalities (p < 0.001), with MRI and scintigraphy showing consistently high performance and low-to-moderate heterogeneity, while echocardiography displayed moderate accuracy but marked variability, and ECG demonstrated the lowest and most heterogeneous results. Conclusion: AI demonstrates strong potential for improving CA diagnosis, with MRI and scintigraphy providing the most reliable performance, echocardiography offering an accessible but heterogeneous option, and ECG models remaining the least consistent. While these results are promising, prospective multicenter studies are needed to validate AI models, improve subtype discrimination, and optimize multimodal integration for real-world clinical use.
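The random-effects pooling step described in this abstract can be sketched as follows. This is a minimal illustration only: it uses the closed-form DerSimonian-Laird estimator of between-study variance instead of the REML estimator named in the abstract (REML needs iterative fitting), and the function name `pool_random_effects` and the example inputs are hypothetical, not taken from the review.

```python
import math


def pool_random_effects(effects, variances):
    """Pool per-study effects with a DerSimonian-Laird random-effects model.

    effects   -- per-study estimates (e.g. accuracy as a proportion)
    variances -- per-study sampling variances of those estimates
    Returns (pooled_estimate, ci_low, ci_high, tau2).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and weighted mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: dispersion of study effects around the fixed-effect mean.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau2 to each study's own variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    # Normal-approximation 95% confidence interval.
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2


# Hypothetical inputs, for illustration only (not data from the review).
example_effects = [0.80, 0.90, 0.85]
example_variances = [0.0010, 0.0020, 0.0015]
pooled, low, high, tau2 = pool_random_effects(example_effects, example_variances)
print(f"pooled={pooled:.3f} CI=({low:.3f}, {high:.3f}) tau2={tau2:.5f}")
```

When the studies disagree by more than sampling error alone would explain, `tau2` becomes positive and the interval widens, which is the behaviour one would expect for the more heterogeneous modalities reported above.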

Mixed Modality Classification Cardiac Meta Analysis In Silico Benchmark SOTA

Accuracy of Foundation AI Models for Hepatic Macrovesicular Steatosis Quantification in Frozen Sections

Koga, S., Guda, A., Wang, Y., Sahni, A., Wu, J., Rosen, A., Nield, J., Nandish, N., Patel, K., Goldman, H., Rajapakse, C., Walle, S., Kristen, S., Tondon, R., Alipour, Z.

•preprint•Sep 17 2025

Introduction: Accurate intraoperative assessment of macrovesicular steatosis in donor liver biopsies is critical for transplantation decisions but is often limited by inter-observer variability and freezing artifacts that can obscure histological details. Artificial intelligence (AI) offers a potential solution for standardized and reproducible evaluation. We evaluated the diagnostic performance of two self-supervised learning (SSL)-based foundation models, Prov-GigaPath and UNI, for classifying macrovesicular steatosis in frozen liver biopsy sections, compared with assessments by surgical pathologists. Methods: We retrospectively analyzed 131 frozen liver biopsy specimens from 68 donors collected between November 2022 and September 2024. Slides were digitized into whole-slide images, tiled into patches, and used to extract embeddings with Prov-GigaPath and UNI; slide-level classifiers were then trained and tested. Intraoperative diagnoses by on-call surgical pathologists were compared with ground truth determined from independent reviews of permanent sections by two liver pathologists. Accuracy was evaluated for both five-category classification and a clinically significant binary threshold (<30% vs. ≥30%). Results: For binary classification, Prov-GigaPath achieved 96.4% accuracy, UNI 85.7%, and surgical pathologists 84.0% (P = .22). In five-category classification, accuracies were lower: Prov-GigaPath 57.1%, UNI 50.0%, and pathologists 58.7% (P = .70). Misclassification primarily occurred in intermediate categories (5%-<30% steatosis). Conclusions: SSL-based foundation models performed comparably to surgical pathologists in classifying macrovesicular steatosis at the clinically relevant <30% vs. ≥30% threshold. These findings support the potential role of AI in standardizing intraoperative evaluation of donor liver biopsies; however, the small sample size limits generalizability, and the findings require validation in larger, balanced cohorts.

Mixed Modality Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Predicting Rejection Risk in Heart Transplantation: An Integrated Clinical-Histopathologic Framework for Personalized Post-Transplant Care

Kim, D. D., Madabhushi, A., Margulies, K. B., Peyster, E. G.

•preprint•Sep 8 2025

Background: Cardiac allograft rejection (CAR) remains the leading cause of early graft failure after heart transplantation (HT). Current diagnostics, including histologic grading of endomyocardial biopsy (EMB) and blood-based assays, lack accurate predictive power for future CAR risk. We developed a predictive model integrating routine clinical data with quantitative morphologic features extracted from routine EMBs to demonstrate the precision-medicine potential of mining existing data sources in post-HT care. Methods: In a retrospective cohort of 484 HT recipients with 1,188 EMB encounters within 6 months post-transplant, we extracted 370 quantitative pathology features describing lymphocyte infiltration and stromal architecture from digitized H&E-stained slides. Longitudinal clinical data comprising 268 variables (including lab values, immunosuppression records, and prior rejection history) were aggregated per patient. Using the XGBoost algorithm with rigorous cross-validation, we compared models based on four different data sources: clinical-only, morphology-only, cross-sectional-only, and fully integrated longitudinal data. The top predictors informed the derivation of a simplified Integrated Rejection Risk Index (IRRI), which relies on just 4 clinical and 4 morphology risk factors. Model performance was evaluated by AUROC, AUPRC, and time-to-event hazard ratios. Results: The fully integrated longitudinal model achieved superior predictive accuracy (AUROC 0.86, AUPRC 0.74). IRRI stratified patients into risk categories with distinct future CAR hazards: high-risk patients showed a markedly increased CAR risk (HR=6.15, 95% CI: 4.17-9.09), while low-risk patients had significantly reduced risk (HR=0.52, 95% CI: 0.33-0.84). This performance exceeded that of models based on just cross-sectional or single-domain data, demonstrating the value of multi-modal, temporal data integration.
Conclusions: By integrating longitudinal clinical and biopsy morphologic features, IRRI provides a scalable, interpretable tool for proactive CAR risk assessment. This precision-based approach could support risk-adaptive surveillance and immunosuppression management strategies, offering a promising pathway toward safer, more personalized post-HT care with the potential to reduce unnecessary procedures and improve outcomes.
Clinical Perspective
What is new?
- Current tools for cardiac allograft monitoring detect rejection only after it occurs and are not designed to forecast future risk. This leads to missed opportunities for early intervention, avoidable patient injury, unnecessary testing, and inefficiencies in care.
- We developed a machine learning-based risk index that integrates clinical features, quantitative biopsy morphology, and longitudinal temporal trends to create a robust predictive framework.
- The Integrated Rejection Risk Index (IRRI) provides highly accurate prediction of future allograft rejection, identifying both high- and low-risk patients up to 90 days in advance, a capability entirely absent from current transplant management.
What are the clinical implications?
- Integrating quantitative histopathology with clinical data provides a more precise, individualized estimate of rejection risk in heart transplant recipients.
- This framework has the potential to guide post-transplant surveillance intensity, immunosuppressive management, and patient counseling.
- Automated biopsy analysis could be incorporated into digital pathology workflows, enabling scalable, multicenter application in real-world transplant care.

Mixed Modality Classification Cardiac Retrospective Clinical In Silico Academic Lab GenAI

Decoding Fibrosis: Transcriptomic and Clinical Insights via AI-Derived Collagen Deposition Phenotypes in MASLD

Wojciechowska, M. K., Thing, M., Hu, Y., Mazzoni, G., Harder, L. M., Werge, M. P., Kimer, N., Das, V., Moreno Martinez, J., Prada-Medina, C. A., Vyberg, M., Goldin, R., Serizawa, R., Tomlinson, J., Douglas Gaalsgard, E., Woodcock, D. J., Hvid, H., Pfister, D. R., Jurtz, V. I., Gluud, L.-L., Rittscher, J.

•preprint•Sep 2 2025

Histological assessment is foundational to multi-omics studies of liver disease, yet conventional fibrosis staging lacks resolution, and quantitative metrics like collagen proportionate area (CPA) fail to capture tissue architecture. While recent AI-driven approaches offer improved precision, they are proprietary and not accessible to academic research. Here, we present a novel, interpretable AI-based framework for characterising liver fibrosis from picrosirius red (PSR)-stained slides. By identifying data-driven collagen deposition phenotypes (CDPs) that capture distinct morphologies, our method substantially improves the sensitivity and specificity of downstream transcriptomic and proteomic analyses compared to CPA and traditional fibrosis scores. Pathway analysis reveals that CDPs 4 and 5 are associated with active extracellular matrix remodelling, while phenotype correlates highlight links to liver functional status. Importantly, we demonstrate that selected CDPs can predict clinical outcomes with similar accuracy to established fibrosis metrics. All models and tools are made freely available to support transparent and reproducible multi-omics pathology research.
Highlights
- We present a set of data-driven collagen deposition phenotypes for analysing PSR-stained liver biopsies, offering a spatially informed alternative to conventional fibrosis staging and CPA, available as open-source code.
- The identified collagen deposition phenotypes enhance transcriptomic and proteomic signal detection, revealing active ECM remodelling and distinct functional tissue states.
- Selected phenotypes predict clinical outcomes with performance comparable to fibrosis stage and CPA, highlighting their potential as candidate quantitative indicators of fibrosis severity.

Mixed Modality Segmentation Abdominal Methodology In Silico Academic Lab Open Code Open Dataset

The African Breast Imaging Dataset for Equitable Cancer Care: Protocol for an Open Mammogram and Ultrasound Breast Cancer Detection Dataset

Musinguzi, D., Katumba, A., Kawooya, M. G., Malumba, R., Nakatumba-Nabende, J., Achuka, S. A., Adewole, M., Anazodo, U.

•preprint•Aug 28 2025

Introduction: Breast cancer is one of the most common cancers globally. Its incidence in Africa has increased sharply, surpassing that in high-income countries. Mortality remains high due to late-stage diagnosis, when treatment is less effective. We propose the first open, longitudinal breast imaging dataset from Africa comprising point-of-care ultrasound scans, mammograms, biopsy pathology, and clinical profiles to support early detection using machine learning. Methods and Analysis: We will engage women through community outreach and train them in self-examination. Those with suspected lesions, particularly with a family history of breast cancer, will be invited to participate. A total of 100 women will undergo baseline assessment at medical centers, including clinical exams, blood tests, and mammograms. Follow-up point-of-care ultrasound scans and clinical data will be collected at 3 and 6 months, with final assessments at 9 months including mammograms. Ethics and Dissemination: The study has been approved by the Institutional Review Boards at ECUREI and the MAI Lab. Findings will be disseminated through peer-reviewed journals and scientific conferences.

Mixed Modality Detection Breast Dataset Release Concept Academic Lab Open Dataset

HONeYBEE: Enabling Scalable Multimodal AI in Oncology Through Foundation Model-Driven Embeddings

Tripathi, A. G., Waqas, A., Schabath, M. B., Yilmaz, Y., Rasool, G.

•preprint•Aug 27 2025

HONeYBEE (Harmonized ONcologY Biomedical Embedding Encoder) is an open-source framework that integrates multimodal biomedical data for oncology applications. It processes clinical data (structured and unstructured), whole-slide images, radiology scans, and molecular profiles to generate unified patient-level embeddings using domain-specific foundation models and fusion strategies. These embeddings enable survival prediction, cancer-type classification, patient similarity retrieval, and cohort clustering. Evaluated on 11,400+ patients across 33 cancer types from The Cancer Genome Atlas (TCGA), clinical embeddings showed the strongest single-modality performance with 98.5% classification accuracy and 96.4% precision@10 in patient retrieval. They also achieved the highest survival prediction concordance indices across most cancer types. Multimodal fusion provided complementary benefits for specific cancers, improving overall survival prediction beyond clinical features alone. Comparative evaluation of four large language models revealed that general-purpose models like Qwen3 outperformed specialized medical models for clinical text representation, though task-specific fine-tuning improved performance on heterogeneous data such as pathology reports.
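The precision@10 retrieval metric mentioned in this abstract can be illustrated with a short sketch. The abstract does not state how similarity between patient embeddings is measured, so cosine similarity is an assumption here; the function names, the toy two-dimensional embeddings, and the labels are all hypothetical and are not part of the HONeYBEE framework.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def precision_at_k(embeddings, labels, query_index, k):
    """Fraction of the k most similar items that share the query's label."""
    if k < 1:
        raise ValueError("k must be at least 1")
    query = embeddings[query_index]
    # Score every other item; the query itself is excluded from retrieval.
    scored = [
        (cosine_similarity(query, emb), i)
        for i, emb in enumerate(embeddings)
        if i != query_index
    ]
    if not scored:
        raise ValueError("need at least one candidate besides the query")
    # Most similar first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    top = scored[:k]
    hits = sum(1 for _, i in top if labels[i] == labels[query_index])
    return hits / len(top)


def mean_precision_at_k(embeddings, labels, k):
    """Average precision@k over every item used as a query."""
    scores = [precision_at_k(embeddings, labels, q, k) for q in range(len(labels))]
    return sum(scores) / len(scores)


# Toy data, for illustration only: three "type A" and two "type B" patients.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
toy_labels = ["A", "A", "A", "B", "B"]
print(precision_at_k(toy_embeddings, toy_labels, query_index=0, k=2))
print(mean_precision_at_k(toy_embeddings, toy_labels, k=2))
```

Note that when a class has fewer than k other members, precision@k for queries from that class cannot reach 1.0, which matters when comparing across cancer types of very different cohort sizes.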

Mixed Modality Classification Methodology In Silico Open Source Open Code GenAI

Deep learning-based identification of necrosis and microvascular proliferation in adult diffuse gliomas from whole-slide images

Guo, Y., Huang, H., Liu, X., Zou, W., Qiu, F., Liu, Y., Chai, R., Jiang, T., Wang, J.

•preprint•Aug 16 2025

For adult diffuse gliomas (ADGs), most grading can be achieved through molecular subtyping, retaining only two key histopathological features for high-grade glioma (HGG): necrosis (NEC) and microvascular proliferation (MVP). We developed a deep learning (DL) framework to automatically identify and characterize these features. We trained patch-level models to detect and quantify NEC and MVP using a dataset that employed active learning, incorporating patches from 621 whole-slide images (WSIs) from the Chinese Glioma Genome Atlas (CGGA). Using the trained patch-level models, we integrated the predicted outcomes and positions of individual patches within WSIs from The Cancer Genome Atlas (TCGA) cohort to form datasets. Subsequently, we introduced a patient-level model, named PLNet (Probability Localization Network), which was trained on these datasets to facilitate patient diagnosis. We also explored the subtypes of NEC and MVP by applying a clustering process to the features that the patch-level models extracted from all positive patches. The patient-level models demonstrated exceptional performance, achieving AUCs of 0.9968 and 0.9995 and AUPRCs of 0.9788 and 0.9860 for NEC and MVP, respectively. Compared to pathological reports, our patient-level models achieved an accuracy of 88.05% for NEC and 90.20% for MVP, along with sensitivities of 73.68% and 77%, respectively. When sensitivity was set at 80%, the accuracy for NEC reached 79.28% and for MVP reached 77.55%. DL models enabled more efficient and accurate histopathological image analysis, which will aid traditional glioma diagnosis. Clustering-based analyses utilizing features extracted from patch-level models could further investigate the subtypes of NEC and MVP.
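Reporting accuracy "when sensitivity was set at 80%" implies choosing a decision threshold that meets a sensitivity target and then measuring accuracy at that threshold. The sketch below shows one simple way to do this; it is an assumption about the general procedure, not the authors' implementation, and the function names and toy scores are hypothetical.

```python
import math


def threshold_for_sensitivity(scores, labels, target):
    """Return the highest threshold whose sensitivity is at least `target`.

    A case is called positive when its score is >= the threshold.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("no positive cases")
    # Number of positives that must fall at or above the threshold.
    # The small epsilon guards against floating-point noise in target * n.
    needed = max(1, math.ceil(target * len(positives) - 1e-9))
    return positives[needed - 1]


def sensitivity_and_accuracy(scores, labels, threshold):
    """Compute sensitivity and accuracy when calling score >= threshold positive."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    true_pos = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    n_pos = sum(labels)
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return true_pos / n_pos, correct / len(labels)


# Toy scores, for illustration only: five positive and five negative cases.
toy_scores = [0.95, 0.90, 0.70, 0.60, 0.40, 0.85, 0.30, 0.20, 0.10, 0.65]
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
thr = threshold_for_sensitivity(toy_scores, toy_labels, target=0.80)
sens, acc = sensitivity_and_accuracy(toy_scores, toy_labels, thr)
print(f"threshold={thr} sensitivity={sens} accuracy={acc}")
```

Lowering the threshold to reach a sensitivity target generally admits more false positives, which is consistent with the lower accuracies the abstract reports at the 80% sensitivity operating point.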

Mixed Modality Detection Neurological Methodology In Silico Academic Lab Benchmark SOTA

A Case Study on Colposcopy-Based Cervical Cancer Staging Reveals an Alarming Lack of Data Sharing Hindering the Adoption of Machine Learning in Clinical Practice

Schulz, M., Leha, A.

•preprint•Aug 15 2025

Background: The inbuilt ability to adapt existing models to new applications has been one of the key drivers of the success of deep learning models. Sharing trained models is therefore crucial for their adaptation to different populations and domains. Not sharing models prevents validation and potential subsequent translation into clinical practice, and hinders scientific progress. In this paper we examine the current state of data and model sharing in the medical field using cervical cancer staging on colposcopy images as a case example. Methods: We conducted a comprehensive literature search in PubMed to identify studies employing machine learning techniques in the analysis of colposcopy images. For studies where raw data was not directly accessible, we systematically inquired about accessing the pre-trained model weights and/or raw colposcopy image data by contacting the authors using various channels. Results: We included 46 studies and one publicly available dataset in our study. We retrieved the data of the latter and inquired about data access for the 46 studies by contacting a total of 92 authors. We received 15 responses related to 14 studies (30%). The authors of the remaining 32 studies did not respond (70%). Of the 15 responses received, two redirected our inquiry to other authors, two were initially pending, and 11 declined data sharing. Despite our follow-up efforts on all responses received, none of the inquiries led to actual data sharing (0%). The only available data source remained the publicly available dataset. Conclusions: Despite the long-standing demands for reproducible research and efforts to incentivize data sharing, such as the requirement of data availability statements, our case study reveals a persistent lack of a data sharing culture. Reasons identified in this case study include a lack of resources to provide the data, data privacy concerns, ongoing trial registrations, and low response rates to inquiries.
Potential routes for improvement could include comprehensive data availability statements required by journals, data preparation and deposition in a repository as part of the publication process, an automatic maximal embargo time after which data will become openly accessible and data sharing rules set by funders.
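The response figures reported in this abstract can be checked with a few lines of arithmetic. The sketch below only restates numbers given in the abstract; the variable names are illustrative.

```python
# Counts as reported in the abstract.
studies_contacted = 46
studies_with_response = 14
studies_without_response = studies_contacted - studies_with_response
responses = {"redirected": 2, "initially pending": 2, "declined": 11}
studies_sharing_data = 0

# Percentages rounded to whole numbers, as in the abstract.
response_rate = round(100 * studies_with_response / studies_contacted)
non_response_rate = round(100 * studies_without_response / studies_contacted)
sharing_rate = round(100 * studies_sharing_data / studies_contacted)

print(f"studies without a response: {studies_without_response}")
print(f"response rate: {response_rate}%")
print(f"non-response rate: {non_response_rate}%")
print(f"total responses: {sum(responses.values())}")
print(f"data sharing rate: {sharing_rate}%")
```

The 15 individual responses map to 14 studies because the abstract counts responses per author, while the percentages are computed per study.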

Mixed Modality Classification Abdominal Review In Silico Academic Lab Reproducibility

Multi-organ AI Endophenotypes Chart the Heterogeneity of Pan-disease in the Brain, Eye, and Heart

Consortium, T. M., Boquet-Pujadas, A., Anagnostakis, F., Yang, Z., Tian, Y. E., Duggan, M., Erus, G., Srinivasan, D., Joynes, C., Bai, W., Patel, P., Walker, K. A., Zalesky, A., Davatzikos, C., Wen, J.

•preprint•Aug 13 2025

Disease heterogeneity and commonality pose significant challenges to precision medicine, as traditional approaches frequently focus on single disease entities and overlook shared mechanisms across conditions [1]. Inspired by pan-cancer [2] and multi-organ research [3], we introduce the concept of "pan-disease" to investigate the heterogeneity and shared etiology in brain, eye, and heart diseases. Leveraging individual-level data from 129,340 participants, as well as summary-level data from the MULTI consortium, we applied a weakly-supervised deep learning model (Surreal-GAN [4,5]) to multi-organ imaging, genetic, proteomic, and RNA-seq data, identifying 11 AI-derived biomarkers, called Multi-organ AI Endophenotypes (MAEs), for the brain (Brain 1-6), eye (Eye 1-3), and heart (Heart 1-2). We found Brain 3 to be a risk factor for Alzheimer's disease (AD) progression and mortality, whereas Brain 5 was protective against AD progression. Crucially, in data from a trial of an anti-amyloid AD drug (solanezumab [6]), heterogeneity in cognitive decline trajectories was observed across treatment groups. At week 240, patients with lower Brain 1-3 expression had slower cognitive decline, whereas patients with higher expression had faster cognitive decline. A multi-layer causal pathway pinpointed Brain 1 as a mediational endophenotype [7] linking the FLRT2 protein to migraine, exemplifying novel therapeutic targets and pathways. Additionally, genes associated with Eye 1 and Eye 3 were enriched in cancer drug-related gene sets with causal links to specific cancer types and proteins. Finally, Heart 1 and Heart 2 had the highest mortality risk and unique medication history profiles, with Heart 1 showing favorable responses to antihypertensive medications and Heart 2 to digoxin treatment. The 11 MAEs provide novel AI dimensional representations for precision medicine and highlight the potential of AI-driven patient stratification for disease risk monitoring, clinical trials, and drug discovery.

Mixed Modality Classification Whole Body Retrospective Clinical In Silico Consortium GenAI
