Latest Papers on Radiology AI. Sources: medRxiv, Tags: Whole Body.

Multimodal AI-driven Biomarker for Early Detection of Cancer Cachexia

Ahmed, S., Parker, N., Park, M., Davis, E. W., Jeong, D., Permuth, J. B., Schabath, M. B., Yilmaz, Y., Rasool, G.

•preprint•Sep 19 2025

Cancer cachexia, a multifactorial metabolic syndrome characterized by severe muscle wasting and weight loss, contributes to poor outcomes across various cancer types but lacks a standardized, generalizable biomarker for early detection. We present a multimodal AI-based biomarker trained on real-world clinical, radiologic, laboratory, and unstructured clinical note data, leveraging foundation models and large language models (LLMs) to identify cachexia at the time of cancer diagnosis. Prediction accuracy improved with each added modality: 77% using clinical variables alone, 81% with added laboratory data, and 85% with structured symptom features extracted from clinical notes. Incorporating embeddings from clinical text and CT images further improved accuracy to 92%. The framework also demonstrated prognostic utility, improving survival prediction as data modalities were integrated. Designed for real-world clinical deployment, the framework accommodates missing modalities without requiring imputation or case exclusion, supporting scalability across diverse oncology settings. Unlike prior models trained on curated datasets, our approach utilizes standard-of-care clinical data, facilitating integration into oncology workflows. In contrast to fixed-threshold composite indices such as the cachexia index (CXI), the model generates patient-specific predictions, enabling adaptable, cancer-agnostic performance. To enhance clinical reliability and safety, the framework incorporates uncertainty estimation to flag low-confidence cases for expert review. This work advances a clinically applicable, scalable, and trustworthy AI-driven decision support tool for early cachexia detection and personalized oncology care.

CT Classification Whole Body Retrospective Clinical In Silico Academic Lab GenAI

Multi-organ AI Endophenotypes Chart the Heterogeneity of Pan-disease in the Brain, Eye, and Heart

Consortium, T. M., Boquet-Pujadas, A., Anagnostakis, F., Yang, Z., Tian, Y. E., Duggan, M., Erus, G., Srinivasan, D., Joynes, C., Bai, W., Patel, P., Walker, K. A., Zalesky, A., Davatzikos, C., Wen, J.

•preprint•Aug 13 2025

Disease heterogeneity and commonality pose significant challenges to precision medicine, as traditional approaches frequently focus on single disease entities and overlook shared mechanisms across conditions [1]. Inspired by pan-cancer [2] and multi-organ research [3], we introduce the concept of "pan-disease" to investigate the heterogeneity and shared etiology in brain, eye, and heart diseases. Leveraging individual-level data from 129,340 participants, as well as summary-level data from the MULTI consortium, we applied a weakly-supervised deep learning model (Surreal-GAN [4,5]) to multi-organ imaging, genetic, proteomic, and RNA-seq data, identifying 11 AI-derived biomarkers - called Multi-organ AI Endophenotypes (MAEs) - for the brain (Brain 1-6), eye (Eye 1-3), and heart (Heart 1-2), respectively. We found Brain 3 to be a risk factor for Alzheimer's disease (AD) progression and mortality, whereas Brain 5 was protective against AD progression. Crucially, in data from an anti-amyloid AD drug (solanezumab [6]), heterogeneity in cognitive decline trajectories was observed across treatment groups. At week 240, patients with lower Brain 1-3 expression had slower cognitive decline, whereas patients with higher expression had faster cognitive decline. A multi-layer causal pathway pinpointed Brain 1 as a mediational endophenotype [7] linking the FLRT2 protein to migraine, exemplifying novel therapeutic targets and pathways. Additionally, genes associated with Eye 1 and Eye 3 were enriched in cancer drug-related gene sets with causal links to specific cancer types and proteins. Finally, Heart 1 and Heart 2 had the highest mortality risk and unique medication history profiles, with Heart 1 showing favorable responses to antihypertensive medications and Heart 2 to digoxin treatment. The 11 MAEs provide novel AI dimensional representations for precision medicine and highlight the potential of AI-driven patient stratification for disease risk monitoring, clinical trials, and drug discovery.

Mixed Modality Classification Whole Body Retrospective Clinical In Silico Consortium GenAI

Assessing accuracy and legitimacy of multimodal large language models on Japan Diagnostic Radiology Board Examination

Hirano, Y., Miki, S., Yamagishi, Y., Hanaoka, S., Nakao, T., Kikuchi, T., Nakamura, Y., Nomura, Y., Yoshikawa, T., Abe, O.

•preprint•Jun 23 2025

Purpose: To assess and compare the accuracy and legitimacy of multimodal large language models (LLMs) on the Japan Diagnostic Radiology Board Examination (JDRBE). Materials and methods: The dataset comprised questions from JDRBE 2021, 2023, and 2024, with ground-truth answers established through consensus among multiple board-certified diagnostic radiologists. Questions without associated images and those lacking unanimous agreement on answers were excluded. Eight LLMs were evaluated: GPT-4 Turbo, GPT-4o, GPT-4.5, GPT-4.1, o3, o4-mini, Claude 3.7 Sonnet, and Gemini 2.5 Pro. Each model was evaluated under two conditions: with image input (vision) and without (text-only). Performance differences between the conditions were assessed using McNemar's exact test. Two diagnostic radiologists (with 2 and 18 years of experience) independently rated the legitimacy of responses from four models (GPT-4 Turbo, Claude 3.7 Sonnet, o3, and Gemini 2.5 Pro) using a five-point Likert scale, blinded to model identity. Legitimacy scores were analyzed using Friedman's test, followed by pairwise Wilcoxon signed-rank tests with Holm correction. Results: The dataset included 233 questions. Under the vision condition, o3 achieved the highest accuracy at 72%, followed by o4-mini (70%) and Gemini 2.5 Pro (70%). Under the text-only condition, o3 topped the list with an accuracy of 67%. Addition of image input significantly improved the accuracy of two models (Gemini 2.5 Pro and GPT-4.5), but not the others. Both o3 and Gemini 2.5 Pro received significantly higher legitimacy scores than GPT-4 Turbo and Claude 3.7 Sonnet from both raters. Conclusion: Recent multimodal LLMs, particularly o3 and Gemini 2.5 Pro, have demonstrated remarkable progress on JDRBE questions, reflecting their rapid evolution in diagnostic radiology. Secondary abstract: Eight multimodal large language models were evaluated on the Japan Diagnostic Radiology Board Examination. OpenAI's o3 and Google DeepMind's Gemini 2.5 Pro achieved high accuracy rates (72% and 70%) and received good legitimacy scores from human raters, demonstrating steady progress.
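
The paired comparison described in the abstract (the same questions answered with and without images) can be illustrated with a minimal sketch of McNemar's exact test, plus a Holm step-down adjustment of the kind the abstract applies to its pairwise Wilcoxon tests. This is not the authors' code: the function names `mcnemar_exact` and `holm_adjust` and the example data are hypothetical, and only the Python standard library is assumed.

```python
from math import comb


def mcnemar_exact(vision_correct, text_correct):
    """Two-sided exact McNemar test on paired per-question outcomes.

    Both arguments are equal-length sequences of booleans (hypothetical
    inputs): whether the model answered each question correctly with
    images and without. Returns (b, c, p_value).
    """
    if len(vision_correct) != len(text_correct):
        raise ValueError("paired outcomes must have the same length")
    pairs = list(zip(vision_correct, text_correct))
    # Discordant pairs: correct under exactly one of the two conditions.
    b = sum(1 for v, t in pairs if v and not t)
    c = sum(1 for v, t in pairs if t and not v)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, no evidence of a difference
    # Under H0 each discordant pair falls either way with probability 0.5,
    # so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the k-th smallest p-value by (m - k) and keep the
        # sequence monotone by carrying the running maximum forward.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted


if __name__ == "__main__":
    # Made-up outcomes for 15 questions, not data from the study.
    vision = [True] * 8 + [False] * 2 + [True] * 5
    text = [False] * 8 + [True] * 2 + [True] * 5
    print(mcnemar_exact(vision, text))
    print(holm_adjust([0.01, 0.04, 0.03]))
```

Only the discordant questions enter the test, which is why a model can gain a few points of accuracy from image input without the difference reaching significance.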

Mixed Modality LLM Radiology Report Whole Body Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Deep learning-enabled MRI phenotyping uncovers regional body composition heterogeneity and disease associations in two European population cohorts

Mertens, C. J., Haentze, H., Ziegelmayer, S., Kather, J. N., Truhn, D., Kim, S. H., Busch, F., Weller, D., Wiestler, B., Graf, M., Bamberg, F., Schlett, C. L., Weiss, J. B., Ringhof, S., Can, E., Schulz-Menger, J., Niendorf, T., Lammert, J., Molwitz, I., Kader, A., Hering, A., Meddeb, A., Nawabi, J., Schulze, M. B., Keil, T., Willich, S. N., Krist, L., Hadamitzky, M., Hannemann, A., Bassermann, F., Rueckert, D., Pischon, T., Hapfelmeier, A., Makowski, M. R., Bressem, K. K., Adams, L. C.

•preprint•Jun 6 2025

Body mass index (BMI) does not account for substantial inter-individual differences in regional fat and muscle compartments, which are relevant for the prevalence of cardiometabolic and cancer conditions. We applied a validated deep learning pipeline for automated segmentation of whole-body MRI scans in 45,851 adults from the UK Biobank and German National Cohort, enabling harmonized quantification of visceral (VAT), gluteofemoral (GFAT), and abdominal subcutaneous adipose tissue (ASAT), liver fat fraction (LFF), and trunk muscle volume. Associations with clinical conditions were evaluated using compartment measures adjusted for age, sex, height, and BMI. Our analysis demonstrates that regional adiposity and muscle volume show distinct associations with cardiometabolic and cancer prevalence, and that substantial disease heterogeneity exists within BMI strata. The analytic framework and reference data presented here will support future risk stratification efforts and facilitate the integration of automated MRI phenotyping into large-scale population and clinical research.

MRI Segmentation Whole Body Retrospective Clinical In Silico Academic Lab Open Dataset
