
Performance of Machine Learning in Diagnosing KRAS (Kirsten Rat Sarcoma) Mutations in Colorectal Cancer: Systematic Review and Meta-Analysis.

Chen K, Qu Y, Han Y, Li Y, Gao H, Zheng D

PubMed · Jul 18 2025
With the widespread application of machine learning (ML) in the diagnosis and treatment of colorectal cancer (CRC), some studies have investigated the use of ML techniques to diagnose KRAS (Kirsten rat sarcoma) mutations. Nevertheless, there is scarce evidence from evidence-based medicine to substantiate their efficacy. Our study systematically reviewed the performance of ML models, developed using different modeling approaches, in diagnosing KRAS mutations in CRC, aiming to offer evidence-based foundations for the development and enhancement of future intelligent diagnostic tools. PubMed, Cochrane Library, Embase, and Web of Science were systematically searched, with the search cutoff date set to December 22, 2024. Included studies were publicly published research papers using ML to diagnose KRAS gene mutations in CRC. The risk of bias in the included models was evaluated with the PROBAST (Prediction Model Risk of Bias Assessment Tool). A meta-analysis of the models' concordance index (c-index) was performed, and a bivariate mixed-effects model was used to summarize sensitivity and specificity based on diagnostic contingency tables. A total of 43 studies involving 10,888 patients were included. The modeling variables were derived from clinical characteristics, computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography/computed tomography, and pathological histology. In the validation cohort, for the ML model developed from CT radiomic features, the c-index, sensitivity, and specificity were 0.87 (95% CI 0.84-0.90), 0.85 (95% CI 0.80-0.89), and 0.83 (95% CI 0.73-0.89), respectively. For the model developed from MRI radiomic features, the c-index, sensitivity, and specificity were 0.77 (95% CI 0.71-0.83), 0.78 (95% CI 0.72-0.83), and 0.73 (95% CI 0.63-0.81), respectively.
For the ML model developed from positron emission tomography/computed tomography radiomic features, the c-index, sensitivity, and specificity were 0.84 (95% CI 0.77-0.90), 0.73, and 0.83, respectively. Notably, the deep learning (DL) model based on pathological images demonstrated a c-index, sensitivity, and specificity of 0.96 (95% CI 0.94-0.98), 0.83 (95% CI 0.72-0.91), and 0.87 (95% CI 0.77-0.92), respectively. The MRI-based DL model showed a c-index of 0.93 (95% CI 0.90-0.96), sensitivity of 0.85 (95% CI 0.75-0.91), and specificity of 0.83 (95% CI 0.77-0.88). ML is highly accurate in diagnosing KRAS mutations in CRC, and DL models based on MRI and pathological images exhibit particularly strong diagnostic accuracy. More broadly applicable DL-based diagnostic tools may be developed in the future. However, the clinical application of DL models remains relatively limited at present. Therefore, future research should focus on increasing sample sizes, improving model architectures, and developing more advanced DL models to facilitate the creation of highly efficient intelligent diagnostic tools for KRAS mutation diagnosis in CRC.
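The pooling step behind summary estimates like these can be sketched in a simplified form. The snippet below is a univariate DerSimonian-Laird random-effects pooling of logit-transformed sensitivities with made-up per-study counts; the review itself used a bivariate mixed-effects model that pools sensitivity and specificity jointly, which this sketch does not reproduce.

```python
import numpy as np

def pool_logit_proportions(events, totals):
    """DerSimonian-Laird random-effects pooling of proportions
    (e.g. per-study sensitivities) on the logit scale."""
    events = np.asarray(events, dtype=float)
    totals = np.asarray(totals, dtype=float)
    p = (events + 0.5) / (totals + 1.0)       # continuity-corrected proportions
    y = np.log(p / (1.0 - p))                 # logit transform
    v = 1.0 / (events + 0.5) + 1.0 / (totals - events + 0.5)  # within-study variance
    w = 1.0 / v
    y_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - y_fixed) ** 2)        # Cochran's Q heterogeneity statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)   # between-study variance estimate
    w_star = 1.0 / (v + tau2)
    y_pooled = np.sum(w_star * y) / np.sum(w_star)
    return float(1.0 / (1.0 + np.exp(-y_pooled)))  # back to proportion scale

# Hypothetical true-positive counts / diseased totals for three studies
pooled_sens = pool_logit_proportions([45, 30, 80], [50, 40, 95])
```

The logit transform keeps the pooled value inside (0, 1) and makes the normal approximation behind the weights more reasonable than pooling raw proportions.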

Imaging biomarkers of ageing: a review of artificial intelligence-based approaches for age estimation.

Haugg F, Lee G, He J, Johnson J, Zapaishchykova A, Bitterman DS, Kann BH, Aerts HJWL, Mak RH

PubMed · Jul 18 2025
Chronological age, although commonly used in clinical practice, fails to capture individual variations in rates of ageing and physiological decline. Recent advances in artificial intelligence (AI) have transformed the estimation of biological age using various imaging techniques. This Review consolidates AI developments in age prediction across brain, chest, abdominal, bone, and facial imaging using diverse methods, including MRI, CT, x-ray, and photographs. The difference between predicted and chronological age, often referred to as age deviation, is a promising biomarker for assessing health status and predicting disease risk. In this Review, we highlight consistent associations between age deviation and various health outcomes, including mortality risk, cognitive decline, and cardiovascular prognosis. We also discuss the technical challenges in developing unbiased models and ethical considerations for clinical application. This Review highlights the potential of AI-based age estimation in personalised medicine as it offers a non-invasive, interpretable biomarker that could transform health risk assessment and guide preventive interventions.

AI Prognostication in Nonsmall Cell Lung Cancer: A Systematic Review.

Augustin M, Lyons K, Kim H, Kim DG, Kim Y

PubMed · Jul 18 2025
A systematic literature review was performed on the use of artificial intelligence (AI) algorithms for nonsmall cell lung cancer (NSCLC) prognostication. Studies were evaluated for the type of input data (histology and whether CT, PET, or MRI was used), cancer therapy intervention, prognostic performance, and comparisons to clinical prognosis systems such as TNM staging. Further comparisons were drawn between different types of AI, such as machine learning (ML) and deep learning (DL). Syntheses of therapeutic interventions and algorithm input modalities were performed for comparison purposes. The review adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The initial database search identified 3880 results, which were reduced to 513 after automatic screening and to 309 after applying the exclusion criteria. The prognostic performance of AI for NSCLC has been investigated using histology and genetic data, as well as CT, PET, and MR imaging, for surgery, immunotherapy, and radiation therapy patients with and without chemotherapy. Studies per therapy intervention numbered 13 for immunotherapy, 10 for radiotherapy, 14 for surgery, and 34 for other, multiple, or no specific therapy. The results of this systematic review demonstrate that AI-based prognostication methods consistently achieve higher prognostic performance for NSCLC, especially when directly compared with traditional techniques such as TNM staging. DL-based approaches outperform ML-based prognostication techniques. DL-based prognostication demonstrates potential for personalized precision cancer therapy as a supplementary decision-making tool; before it is fully utilized in clinical practice, it should be thoroughly validated through well-designed clinical trials.
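Prognostic performance in studies like these is commonly summarized with Harrell's concordance index. A minimal stdlib implementation on toy values (not data from the review) shows what the metric measures:

```python
import itertools

def concordance_index(times, events, scores):
    """Harrell's c-index: among comparable pairs (the earlier time is an
    observed event, not a censoring), the fraction where the higher risk
    score belongs to the subject with the earlier event; ties score 0.5."""
    concordant, comparable = 0.0, 0.0
    for (t_i, e_i, s_i), (t_j, e_j, s_j) in itertools.combinations(
            zip(times, events, scores), 2):
        if t_i == t_j:
            continue  # tied times are skipped in this minimal version
        if t_i < t_j and e_i == 1:
            comparable += 1
            concordant += 1.0 if s_i > s_j else (0.5 if s_i == s_j else 0.0)
        elif t_j < t_i and e_j == 1:
            comparable += 1
            concordant += 1.0 if s_j > s_i else (0.5 if s_j == s_i else 0.0)
    return concordant / comparable

# Toy data: higher score = higher predicted risk; the last subject is censored
cindex = concordance_index([2, 4, 6, 8], [1, 1, 1, 0], [0.9, 0.7, 0.4, 0.1])
```

A value of 0.5 indicates no discrimination and 1.0 perfect risk ordering, which is why the c-index supports direct comparisons between AI models and TNM staging.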

Clinical Translation of Integrated PET-MRI for Neurodegenerative Disease.

Shepherd TM, Dogra S

PubMed · Jul 18 2025
The prevalence of Alzheimer's disease and other dementias is increasing as populations live longer. Imaging is becoming a key component of the workup for patients with cognitive impairment or dementia. Integrated PET-MRI provides a unique opportunity for same-session multimodal characterization with many practical benefits to patients, referring physicians, radiologists, and researchers. The impact of integrated PET-MRI on clinical practice for early adopters of this technology can be profound. Classic imaging findings with integrated PET-MRI are illustrated for common neurodegenerative diseases or clinical-radiological syndromes. This review summarizes recent technical innovations that are being introduced into PET-MRI clinical practice and research for neurodegenerative disease. More recent MRI-based attenuation correction now performs comparably to PET-CT (e.g., whole-brain bias < 0.5%), such that early concerns about accurate PET tracer quantification with integrated PET-MRI appear resolved. Head motion is common in this patient population. MRI- and PET data-driven motion correction appear ready for routine use and should substantially improve PET-MRI image quality. PET-MRI, by definition, eliminates the radiation dose contributed by CT, roughly half of a PET-CT examination. Multiple hardware and software techniques for improving image quality with lower counts are reviewed (including motion correction). These methods can lower radiation to patients (and staff), increase scanner throughput, and generate better temporal resolution for dynamic PET. Deep learning has been broadly applied to PET-MRI. Deep learning analysis of PET and MRI data may provide accurate classification of different stages of Alzheimer's disease or predict progression to dementia.
Over the past 5 years, clinical imaging of neurodegenerative disease has changed due to imaging research and the introduction of anti-amyloid immunotherapy. Integrated PET-MRI is best suited for imaging these patients, and its use appears poised for rapid growth outside academic medical centers. Evidence level: 5. Technical efficacy: Stage 3.

Large Language Model-Based Entity Extraction Reliably Classifies Pancreatic Cysts and Reveals Predictors of Malignancy: A Cross-Sectional and Retrospective Cohort Study

Papale, A. J., Flattau, R., Vithlani, N., Mahajan, D., Ziemba, Y., Zavadsky, T., Carvino, A., King, D., Nadella, S.

medRxiv preprint · Jul 17 2025
Pancreatic cystic lesions (PCLs) are often discovered incidentally on imaging and may progress to pancreatic ductal adenocarcinoma (PDAC). PCLs have a high incidence in the general population, and adherence to screening guidelines can be variable. With the advent of technologies that enable automated text classification, we sought to evaluate various natural language processing (NLP) tools, including large language models (LLMs), for identifying and classifying PCLs from radiology reports. We correlated our classification of PCLs with clinical features to identify risk factors for a positive PDAC biopsy. We contrasted a previously described NLP classifier with LLMs for prospective identification of PCLs in radiology reports. We evaluated various LLMs for PCL classification into low-risk or high-risk categories based on published guidelines, and compared prompt-based PCL classification to specific entity-guided PCL classification. To this end, we developed tools to deidentify radiology reports and track patients longitudinally based on those reports. Additionally, we used our newly developed tools to evaluate a retrospective database of patients who underwent pancreas biopsy to determine associated factors, including those in their radiology reports and clinical features, using multivariable logistic regression modeling. Of 14,574 prospective radiology reports, 665 (4.6%) described a pancreatic cyst, including 175 (1.2%) high-risk lesions. Our Entity-Extraction Large Language Model tool achieved recall 0.992 (95% confidence interval [CI], 0.985-0.998), precision 0.988 (0.979-0.996), and F1-score 0.990 (0.985-0.995) for detecting cysts; F1-scores were 0.993 (0.987-0.998) for low-risk and 0.977 (0.952-0.995) for high-risk classification. Among 4,285 biopsy patients, 330 had pancreatic cysts documented ≥6 months before biopsy.
In the final multivariable model (AUC = 0.877), independent predictors of adenocarcinoma were change in duct caliber with upstream atrophy (adjusted odds ratio [AOR], 4.94; 95% CI, 1.30-18.79), mural nodules (AOR, 11.02; 1.81-67.26), older age (AOR, 1.10; 1.05-1.16), lower body mass index (AOR, 0.86; 0.76-0.96), and total bilirubin (AOR, 1.81; 1.18-2.77). Automated NLP-based analysis of radiology reports using LLM-driven entity extraction can accurately identify and risk-stratify PCLs and, when retrospectively applied, reveal factors predicting malignant progression. Widespread implementation may improve surveillance and enable earlier intervention.
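Adjusted odds ratios like those above come from exponentiating the coefficients of a multivariable logistic regression. A self-contained sketch on simulated data follows; the predictor, effect size, and hand-rolled fitting loop are illustrative stand-ins, since a real analysis would use a standard statistics package and report confidence intervals.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_logistic(x, y, lr=1.0, n_iter=5000):
    """Plain gradient-ascent logistic regression on [intercept, x]."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta += lr * X.T @ (y - p) / len(y)   # mean log-likelihood gradient
    return beta

# Hypothetical binary predictor (e.g. mural nodule present vs absent)
x = rng.integers(0, 2, size=2000).astype(float)
true_logit = -2.0 + 1.5 * x                   # true coefficient = log(OR)
y = (rng.uniform(size=2000) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)
beta = fit_logistic(x, y)
odds_ratio = float(np.exp(beta[1]))           # roughly recovers exp(1.5)
```

Exponentiating a fitted coefficient converts a change in log-odds into a multiplicative change in the odds of the outcome, which is exactly how an AOR of, say, 11.02 for mural nodules is read.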

Early Vascular Aging Determined by 3-Dimensional Aortic Geometry: Genetic Determinants and Clinical Consequences.

Beeche C, Zhao B, Tavolinejad H, Pourmussa B, Kim J, Duda J, Gee J, Witschey WR, Chirinos JA

PubMed · Jul 17 2025
Vascular aging is an important phenotype characterized by structural and geometric remodeling. Some individuals exhibit supernormal vascular aging, associated with improved cardiovascular outcomes; others experience early vascular aging, linked to adverse cardiovascular outcomes. The aorta is the artery that exhibits the most prominent age-related changes; however, the biological mechanisms underlying aortic aging, its genetic architecture, and its relationship with cardiovascular structure, function, and disease states remain poorly understood. We developed sex-specific models to quantify aortic age on the basis of aortic geometric phenotypes derived from 3-dimensional tomographic imaging data in 2 large biobanks: the UK Biobank and the Penn Medicine BioBank. Convolutional neural network-assisted 3-dimensional segmentation of the aorta was performed in 56,104 magnetic resonance imaging scans in the UK Biobank and 6757 computed tomography scans in the Penn Medicine BioBank. Aortic vascular age index (AVAI) was calculated as the difference between the vascular age predicted from geometric phenotypes and the chronological age, expressed as a percent of chronological age. We assessed associations with cardiovascular structure and function using multivariate linear regression and examined the genetic architecture of AVAI through genome-wide association studies, followed by Mendelian randomization to assess causal associations. We also constructed a polygenic risk score for AVAI. AVAI displayed numerous associations with cardiac structure and function, including increased left ventricular mass (standardized β=0.144 [95% CI, 0.138, 0.149]; <i>P</i><0.0001), wall thickness (standardized β=0.061 [95% CI, 0.054, 0.068]; <i>P</i><0.0001), and left atrial volume maximum (standardized β=0.060 [95% CI, 0.050, 0.069]; <i>P</i><0.0001). AVAI exhibited high genetic heritability (<i>h</i><sup>2</sup>=40.24%).
We identified 54 independent genetic loci (<i>P</i><5×10<sup>-8</sup>) associated with AVAI, which further exhibited gene-level associations with the fibrillin-1 (<i>FBN1</i>) and elastin (<i>ELN1</i>) genes. Mendelian randomization supported causal associations between AVAI and atrial fibrillation, vascular dementia, aortic aneurysm, and aortic dissection. A polygenic risk score for AVAI was associated with an increased prevalence of atrial fibrillation, hypertension, aortic aneurysm, and aortic dissection. Early aortic aging is significantly associated with adverse cardiac remodeling and important cardiovascular disease states. AVAI exhibits a polygenic, highly heritable genetic architecture. Mendelian randomization analyses support a causal association between AVAI and cardiovascular diseases, including atrial fibrillation, vascular dementia, aortic aneurysms, and aortic dissection.
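The AVAI definition given in the abstract is simple enough to state directly in code; the ages below are illustrative values, not study data.

```python
def aortic_vascular_age_index(predicted_age: float, chronological_age: float) -> float:
    """AVAI as defined in the abstract: the difference between the vascular
    age predicted from aortic geometry and chronological age, expressed
    as a percent of chronological age."""
    return 100.0 * (predicted_age - chronological_age) / chronological_age

# A 60-year-old whose aortic geometry predicts age 66 shows early vascular
# aging (positive AVAI); a prediction of 54 suggests supernormal aging.
early = aortic_vascular_age_index(66.0, 60.0)        # +10.0
supernormal = aortic_vascular_age_index(54.0, 60.0)  # -10.0
```

Expressing the gap as a percent of chronological age keeps the index comparable between younger and older participants, since a fixed 6-year gap means more in relative terms at 40 than at 80.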

Insights into a radiology-specialised multimodal large language model with sparse autoencoders

Kenza Bouzid, Shruthi Bannur, Daniel Coelho de Castro, Anton Schwaighofer, Javier Alvarez-Valle, Stephanie L. Hyland

arXiv preprint · Jul 17 2025
Interpretability can improve the safety, transparency and trust of AI models, which is especially important in healthcare applications where decisions often carry significant consequences. Mechanistic interpretability, particularly through the use of sparse autoencoders (SAEs), offers a promising approach for uncovering human-interpretable features within large transformer-based models. In this study, we apply Matryoshka-SAE to the radiology-specialised multimodal large language model MAIRA-2 to interpret its internal representations. Using large-scale automated interpretability of the SAE features, we identify a range of clinically relevant concepts, including medical devices (e.g., line and tube placements, pacemaker presence), pathologies such as pleural effusion and cardiomegaly, longitudinal changes, and textual features. We further examine the influence of these features on model behaviour through steering, demonstrating directional control over generations with mixed success. Our results reveal practical and methodological challenges, yet they offer initial insights into the internal concepts learned by MAIRA-2, marking a step toward deeper mechanistic understanding and interpretability of a radiology-adapted multimodal large language model and paving the way for improved model transparency. We release the trained SAEs and interpretations: https://huggingface.co/microsoft/maira-2-sae.
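The core object here, a sparse autoencoder over transformer activations, can be sketched in a few lines. This is a generic ReLU SAE with an L1 sparsity penalty, not the paper's Matryoshka-SAE, which additionally nests dictionaries of several sizes; dimensions and data are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

class SparseAutoencoder:
    """Minimal ReLU sparse autoencoder: an overcomplete feature dictionary
    trained with an L1 sparsity penalty on the latent activations."""

    def __init__(self, d_model: int, d_hidden: int):
        self.W_enc = rng.normal(0.0, 0.1, (d_model, d_hidden))
        self.b_enc = np.zeros(d_hidden)
        self.W_dec = rng.normal(0.0, 0.1, (d_hidden, d_model))
        self.b_dec = np.zeros(d_model)

    def encode(self, x: np.ndarray) -> np.ndarray:
        # ReLU keeps features nonnegative, which aids interpretability
        return np.maximum(0.0, x @ self.W_enc + self.b_enc)

    def decode(self, f: np.ndarray) -> np.ndarray:
        return f @ self.W_dec + self.b_dec

    def loss(self, x: np.ndarray, l1_coeff: float = 1e-3) -> float:
        f = self.encode(x)
        # reconstruction error plus sparsity pressure on the features
        return float(np.mean((self.decode(f) - x) ** 2)
                     + l1_coeff * np.mean(np.abs(f)))

sae = SparseAutoencoder(d_model=16, d_hidden=64)
acts = rng.normal(size=(8, 16))   # stand-in for transformer activations
features = sae.encode(acts)
```

The overcomplete hidden layer (here 64 features for a 16-dimensional input) is what lets individual latent units specialise into concepts such as "pleural effusion" or "pacemaker present".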

Domain-randomized deep learning for neuroimage analysis

Malte Hoffmann

arXiv preprint · Jul 17 2025
Deep learning has revolutionized neuroimage analysis by delivering unprecedented speed and accuracy. However, the narrow scope of many training datasets constrains model robustness and generalizability. This challenge is particularly acute in magnetic resonance imaging (MRI), where image appearance varies widely across pulse sequences and scanner hardware. A recent domain-randomization strategy addresses the generalization problem by training deep neural networks on synthetic images with randomized intensities and anatomical content. By generating diverse data from anatomical segmentation maps, the approach enables models to accurately process image types unseen during training, without retraining or fine-tuning. It has demonstrated effectiveness across modalities including MRI, computed tomography, positron emission tomography, and optical coherence tomography, as well as beyond neuroimaging in ultrasound, electron and fluorescence microscopy, and X-ray microtomography. This tutorial paper reviews the principles, implementation, and potential of the synthesis-driven training paradigm. It highlights key benefits, such as improved generalization and resistance to overfitting, while discussing trade-offs such as increased computational demands. Finally, the article explores practical considerations for adopting the technique, aiming to accelerate the development of generalizable tools that make deep learning more accessible to domain experts without extensive computational resources or machine learning knowledge.
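The synthesis-driven training idea can be illustrated with a toy generator: each anatomical label gets a freshly randomized intensity on every draw, so a network trained on these images cannot overfit to any one contrast. The label map, intensity model, and noise level below are illustrative stand-ins for the far richer augmentation used in practice.

```python
import numpy as np

rng = np.random.default_rng(42)

def synthesize_from_labels(label_map: np.ndarray, n_labels: int) -> np.ndarray:
    """Domain-randomization sketch: generate a training image from an
    anatomical segmentation map by drawing a random mean intensity per
    label and adding noise, so the contrast differs on every call."""
    means = rng.uniform(0.0, 1.0, size=n_labels)       # random per-tissue intensity
    image = means[label_map].astype(float)
    image += rng.normal(0.0, 0.05, size=image.shape)   # scanner-like noise
    return np.clip(image, 0.0, 1.0)

labels = rng.integers(0, 4, size=(32, 32))  # toy 4-class segmentation map
img = synthesize_from_labels(labels, n_labels=4)
```

Because the ground-truth segmentation is known by construction, each synthetic image comes with a free training target, which is what makes the approach practical at scale.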

A conversational artificial intelligence based web application for medical conversations: a prototype for a chatbot

Pires, J. G.

medRxiv preprint · Jul 17 2025
Background: Artificial Intelligence (AI) has evolved through various trends, with different subfields gaining prominence over time. Currently, Conversational Artificial Intelligence (CAI), particularly Generative AI, is at the forefront. CAI models are primarily focused on text-based tasks and are commonly deployed as chatbots. Recent advancements by OpenAI have enabled the integration of external, independently developed models, allowing chatbots to perform specialized, task-oriented functions beyond general language processing. Objective: This study aims to develop a smart chatbot that integrates large language models (LLMs) from OpenAI with specialized domain-specific models, such as those used in medical image diagnostics. The system leverages transfer learning via Google's Teachable Machine to construct image-based classifiers and incorporates a diabetes detection model developed in TensorFlow.js. A key innovation is the chatbot's ability to extract relevant parameters from user input, trigger the appropriate diagnostic model, interpret the output, and deliver responses in natural language. The overarching goal is to demonstrate the potential of combining LLMs with external models to build multimodal, task-oriented conversational agents. Methods: Two image-based models were developed and integrated into the chatbot system. The first analyzes chest X-rays to detect viral and bacterial pneumonia. The second uses optical coherence tomography (OCT) images to identify ocular conditions such as drusen, choroidal neovascularization (CNV), and diabetic macular edema (DME). Both models were incorporated into the chatbot to enable image-based medical query handling. In addition, a text-based model was constructed to process physiological measurements for diabetes prediction using TensorFlow.js. The architecture is modular: new diagnostic models can be added without redesigning the chatbot, enabling straightforward functional expansion.
Results: The findings demonstrate effective integration between the chatbot and the diagnostic models, with only minor deviations from expected behavior. Additionally, a stub function was implemented within the chatbot to schedule medical appointments based on the severity of a patient's condition, and it was specifically tested with the OCT and X-ray models. Conclusions: This study demonstrates the feasibility of developing advanced AI systems, including image-based diagnostic models and chatbot integration, by leveraging Artificial Intelligence as a Service (AIaaS). It also underscores the potential of AI to enhance user experiences in bioinformatics, paving the way for more intuitive and accessible interfaces in the field. Looking ahead, the modular nature of the chatbot allows for the integration of additional diagnostic models as the system evolves.
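The parameter-extraction-and-dispatch pattern the abstract describes can be sketched as a registry of callable models. Everything below is a hypothetical stand-in: the model function, thresholds, and tool names are illustrative, not the paper's TensorFlow.js classifier or OpenAI function-calling schema.

```python
def diabetes_model(glucose: float, bmi: float) -> str:
    # stand-in for a real diabetes classifier; thresholds are illustrative
    return "high risk" if glucose > 140 or bmi > 35 else "low risk"

# modular registry: adding a model means adding one entry, no chatbot redesign
TOOLS = {"diabetes": (diabetes_model, ("glucose", "bmi"))}

def dispatch(tool_name: str, extracted_params: dict) -> str:
    """Route parameters (as an LLM would extract them from a user message)
    to the registered model and wrap the result in natural language."""
    model_fn, required = TOOLS[tool_name]
    missing = [p for p in required if p not in extracted_params]
    if missing:
        # ask the user for whatever the chosen model still needs
        return "Please provide: " + ", ".join(missing)
    result = model_fn(**{p: extracted_params[p] for p in required})
    return f"The {tool_name} model suggests: {result}."

reply = dispatch("diabetes", {"glucose": 155.0, "bmi": 29.0})
```

Keeping the registry separate from the conversation logic is what makes the architecture modular: the X-ray and OCT models would be further entries in `TOOLS` rather than changes to `dispatch`.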

