Latest Papers on Radiology AI. Tags: GenAI

Illuminating radiogenomic signatures in pediatric-type diffuse gliomas: insights into molecular, clinical, and imaging correlations. Part I: high-grade group.

Kurokawa R, Hagiwara A, Ueda D, Ito R, Saida T, Honda M, Nishioka K, Sakata A, Yanagawa M, Takumi K, Oda S, Ide S, Sofue K, Sugawara S, Watabe T, Hirata K, Kawamura M, Iima M, Naganawa S

•papers•Aug 25 2025

Recent advances in molecular genetics have revolutionized the classification of pediatric-type high-grade gliomas in the 2021 World Health Organization central nervous system tumor classification. This narrative review synthesizes current evidence on the following four tumor types: diffuse midline glioma, H3 K27-altered; diffuse hemispheric glioma, H3 G34-mutant; diffuse pediatric-type high-grade glioma, H3-wildtype and IDH-wildtype; and infant-type hemispheric glioma. We conducted a comprehensive literature search for articles published through January 2025. For each tumor type, we analyze characteristic clinical presentations, molecular alterations, conventional and advanced magnetic resonance imaging features, radiological-molecular correlations, and current therapeutic approaches. Emerging radiogenomic approaches utilizing artificial intelligence, including radiomics and deep learning, show promise in identifying imaging biomarkers that correlate with molecular features. This review highlights the importance of integrating radiological and molecular data for accurate diagnosis and treatment planning, while acknowledging limitations in current methodologies and the need for prospective validation in larger cohorts. Understanding these correlations is crucial for advancing personalized treatment strategies for these challenging tumors.

MRI Classification Neurological Review Concept GenAI

ControlEchoSynth: Boosting Ejection Fraction Estimation Models via Controlled Video Diffusion

Nima Kondori, Hanwen Liang, Hooman Vaseli, Bingyu Xie, Christina Luong, Purang Abolmaesumi, Teresa Tsang, Renjie Liao

•preprint•Aug 25 2025

Synthetic data generation represents a significant advancement in boosting the performance of machine learning (ML) models, particularly in fields where data acquisition is challenging, such as echocardiography. The acquisition and labeling of echocardiograms (echo) for heart assessment, crucial in point-of-care ultrasound (POCUS) settings, often encounter limitations due to the restricted number of echo views available, typically captured by operators with varying levels of experience. This study proposes a novel approach for enhancing clinical diagnosis accuracy by synthetically generating echo views. These views are conditioned on existing, real views of the heart, focusing specifically on the estimation of ejection fraction (EF), a critical parameter traditionally measured from biplane apical views. By integrating a conditional generative model, we demonstrate an improvement in EF estimation accuracy, providing a comparative analysis with traditional methods. Preliminary results indicate that our synthetic echoes, when used to augment existing datasets, not only enhance EF estimation but also show potential in advancing the development of more robust, accurate, and clinically relevant ML models. This approach is anticipated to catalyze further research in synthetic data applications, paving the way for innovative solutions in medical imaging diagnostics.

Ultrasound Image Synthesis Cardiac Methodology In Silico GenAI

A Weighted Vision Transformer-Based Multi-Task Learning Framework for Predicting ADAS-Cog Scores

Nur Amirah Abd Hamid, Mohd Ibrahim Shapiai, Daphne Teck Ching Lai

•preprint•Aug 25 2025

Prognostic modeling is essential for forecasting future clinical scores and enabling early detection of Alzheimer's disease (AD). While most existing methods focus on predicting the ADAS-Cog global score, they often overlook the predictive value of its 13 sub-scores, which reflect distinct cognitive domains. Some sub-scores may exert greater influence on determining global scores. Assigning higher loss weights to these clinically meaningful sub-scores can guide the model to focus on more relevant cognitive domains, enhancing both predictive accuracy and interpretability. In this study, we propose a weighted Vision Transformer (ViT)-based multi-task learning (MTL) framework to jointly predict the ADAS-Cog global score and its 13 sub-scores at Month 24 using baseline MRI scans. Our framework integrates ViT as a feature extractor and systematically investigates the impact of sub-score-specific loss weighting on model performance. Results show that our proposed weighting strategies are group-dependent: strong weighting improves performance for MCI subjects with more heterogeneous MRI patterns, while moderate weighting is more effective for CN subjects with lower variability. Our findings suggest that uniform weighting underutilizes key sub-scores and limits generalization. The proposed framework offers a flexible, interpretable approach to AD prognosis using end-to-end MRI-based learning. (GitHub repo link will be provided after review)

MRI Registration Neurological Methodology In Silico GenAI Open Code
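The sub-score-specific loss weighting described in the entry above can be illustrated with a minimal sketch. The abstract does not give the actual loss, so everything here is an assumption: the function name `weighted_mtl_loss`, the use of squared error, and the example weights are all hypothetical and only show how emphasizing one sub-score changes the total loss.

```python
def weighted_mtl_loss(pred_sub, true_sub, pred_global, true_global,
                      sub_weights, global_weight=1.0):
    """Weighted squared-error loss for one subject (illustrative only).

    Assumption: each of the sub-score errors is squared and scaled by its
    own weight, then added to the weighted global-score error.
    """
    if not (len(pred_sub) == len(true_sub) == len(sub_weights)):
        raise ValueError("sub-score lists and weights must have equal length")
    sub_term = sum(w * (p - t) ** 2
                   for w, p, t in zip(sub_weights, pred_sub, true_sub))
    global_term = global_weight * (pred_global - true_global) ** 2
    return sub_term + global_term


# 13 sub-scores, each predicted 1 point too high; global score 2 points too high.
pred_sub, true_sub = [1.0] * 13, [0.0] * 13
uniform = [1.0] * 13
emphasized = [2.0] + [1.0] * 12  # hypothetical: double weight on the first sub-score

print(weighted_mtl_loss(pred_sub, true_sub, 10.0, 8.0, uniform))     # 17.0
print(weighted_mtl_loss(pred_sub, true_sub, 10.0, 8.0, emphasized))  # 18.0
```

With uniform weights every sub-score error counts equally; raising one weight makes the same error on that sub-score cost more, which is the mechanism the abstract credits with steering the model toward more relevant cognitive domains.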

FCR: Investigating Generative AI models for Forensic Craniofacial Reconstruction

Ravi Shankar Prasad, Dinesh Singh

•preprint•Aug 25 2025

Craniofacial reconstruction in forensics is one of the processes used to identify victims of crime and natural disasters. Identifying an individual from their remains plays a crucial role when all other identification methods fail. Traditional methods for this task, such as clay-based craniofacial reconstruction, require expert domain knowledge and are time-consuming. At the same time, other probabilistic generative models, like the statistical shape model or the Basel face model, fail to capture the skull and face cross-domain attributes. Given these limitations, we propose a generic framework for craniofacial reconstruction from 2D X-ray images. Here, we used various generative models (e.g., CycleGANs and cGANs) and fine-tuned the generator and discriminator parts to generate more realistic images in two distinct domains, which are the skull and face of an individual. This is the first time that 2D X-rays have been used as a representation of the skull by generative models for craniofacial reconstruction. We have evaluated the quality of generated faces using FID, IS, and SSIM scores. Finally, we have proposed a retrieval framework where the query is the generated face image and the gallery is the database of real faces. Our experimental results indicate that this can be an effective tool for forensic science.

X-Ray Image Synthesis Methodology In Silico GenAI

Why Relational Graphs Will Save the Next Generation of Vision Foundation Models?

Fatemeh Ziaeetabar

•preprint•Aug 25 2025

Vision foundation models (FMs) have become the predominant architecture in computer vision, providing highly transferable representations learned from large-scale, multimodal corpora. Nonetheless, they exhibit persistent limitations on tasks that require explicit reasoning over entities, roles, and spatio-temporal relations. Such relational competence is indispensable for fine-grained human activity recognition, egocentric video understanding, and multimodal medical image analysis, where spatial, temporal, and semantic dependencies are decisive for performance. We advance the position that next-generation FMs should incorporate explicit relational interfaces, instantiated as dynamic relational graphs (graphs whose topology and edge semantics are inferred from the input and task context). We illustrate this position with cross-domain evidence from recent systems in human manipulation action recognition and brain tumor segmentation, showing that augmenting FMs with lightweight, context-adaptive graph-reasoning modules improves fine-grained semantic fidelity, out-of-distribution robustness, interpretability, and computational efficiency relative to FM-only baselines. Importantly, by reasoning sparsely over semantic nodes, such hybrids also achieve favorable memory and hardware efficiency, enabling deployment under practical resource constraints. We conclude with a targeted research agenda for FM-graph hybrids, prioritizing learned dynamic graph construction, multi-level relational reasoning (e.g., part-object-scene in activity understanding, or region-organ in medical imaging), cross-modal fusion, and evaluation protocols that directly probe relational competence in structured vision tasks.

MRI Segmentation Neurological Review Concept GenAI

Deep learning steganography for big data security using squeeze and excitation with inception architectures.

Issac BM, Kumar SN, Zafar S, Shakil KA, Wani MA

•papers•Aug 25 2025

With the exponential growth of big data in domains such as telemedicine and digital forensics, the secure transmission of sensitive medical information has become a critical concern. Conventional steganographic methods often fail to maintain diagnostic integrity or exhibit robustness against noise and transformations. In this study, we propose a novel deep learning-based steganographic framework that combines Squeeze-and-Excitation (SE) blocks, Inception modules, and residual connections to address these challenges. The encoder integrates dilated convolutions and SE attention to embed secret medical images within natural cover images, while the decoder employs residual and multi-scale Inception-based feature extraction for accurate reconstruction. Designed for deployment on NVIDIA Jetson TX2, the model ensures real-time, low-power operation suitable for edge healthcare applications. Experimental evaluation on MRI and OCT datasets demonstrates the model's efficacy, achieving Peak Signal-to-Noise Ratio (PSNR) values of 39.02 and 38.75, and Structural Similarity Index (SSIM) values of 0.9757, confirming minimal visual distortion. This research contributes to advancing secure, high-capacity steganographic systems for practical use in privacy-sensitive environments.

Mixed Modality Image Synthesis Methodology In Silico Academic Lab GenAI
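The steganography entry above reports reconstruction quality as Peak Signal-to-Noise Ratio (PSNR). As a minimal sketch of how that metric is commonly computed, the function below uses the standard definition, 10·log10(MAX² / MSE), on flat lists of pixel values. The function name `psnr`, the 8-bit default `max_value`, and the toy pixel values are assumptions for illustration and are not taken from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no distortion to measure
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel is off by one gray level, so MSE = 1.
original = [0, 10, 20, 30]
recovered = [1, 11, 21, 31]
print(round(psnr(original, recovered), 2))  # 48.13
```

Higher PSNR means the recovered image is closer to the original; identical inputs give an infinite value because the error term is zero.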

Application of artificial intelligence chatbots in interpreting magnetic resonance imaging reports: a comparative study.

Bai X, Feng M, Ma W, Liao Y

•papers•Aug 25 2025

Artificial intelligence (AI) chatbots have emerged as promising tools for enhancing medical communication, yet their efficacy in interpreting complex radiological reports remains underexplored. This study evaluates the performance of AI chatbots in translating magnetic resonance imaging (MRI) reports into patient-friendly language and providing clinical recommendations. A cross-sectional analysis was conducted on 6174 MRI reports from tumor patients across three hospitals. Two AI chatbots, GPT o1-preview (Chatbot 1) and Deepseek-R1 (Chatbot 2), were tasked with interpreting reports, classifying tumor characteristics, assessing surgical necessity, and suggesting treatments. Readability was measured using Flesch-Kincaid and Gunning Fog metrics, while accuracy was evaluated by medical reviewers. Statistical analyses included Friedman and Wilcoxon signed-rank tests. Both chatbots significantly improved readability, with Chatbot 2 achieving higher Flesch-Kincaid Reading Ease scores (median: 58.70 vs. 46.00, p < 0.001) and lower text complexity. Chatbot 2 outperformed Chatbot 1 in diagnostic accuracy (92.05% vs. 89.03% for tumor classification; 95.12% vs. 84.73% for surgical necessity, p < 0.001). Treatment recommendations from Chatbot 2 were more clinically relevant (98.10% acceptable vs. 75.41%), though both demonstrated high empathy (92.82-96.11%). Errors included misinterpretations of medical terminology and occasional hallucinations. AI chatbots, particularly Deepseek-R1, effectively enhance the readability and accuracy of MRI report interpretations for patients. However, physician oversight remains critical to mitigate errors. These tools hold potential to reduce healthcare burdens but require further refinement for clinical integration.

MRI LLM Radiology Report Retrospective Clinical In Silico Academic Lab GenAI Benchmark SOTA
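The chatbot study above measures readability with Flesch-Kincaid and Gunning Fog metrics. The sketch below shows the standard formulas for Flesch Reading Ease and the Gunning Fog index, taking pre-computed counts as input. It assumes the "Reading Ease" scores in the abstract refer to the usual Flesch Reading Ease formula. How the study counted words, sentences, syllables, and complex words is not stated, so the counts used here are purely illustrative.

```python
def flesch_reading_ease(n_words, n_sentences, n_syllables):
    """Flesch Reading Ease: higher scores indicate easier text."""
    if n_words <= 0 or n_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    return 206.835 - 1.015 * (n_words / n_sentences) - 84.6 * (n_syllables / n_words)


def gunning_fog(n_words, n_sentences, n_complex_words):
    """Gunning Fog index: approximate years of schooling needed to read the text."""
    if n_words <= 0 or n_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    return 0.4 * ((n_words / n_sentences) + 100.0 * (n_complex_words / n_words))


# Illustrative counts: 100 words, 5 sentences, 150 syllables, 10 complex words.
print(round(flesch_reading_ease(100, 5, 150), 3))  # 59.635
print(round(gunning_fog(100, 5, 10), 3))           # 12.0
```

Both formulas reward shorter sentences; Reading Ease additionally penalizes syllables per word, while Gunning Fog penalizes the share of complex (conventionally, three-or-more-syllable) words.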

Bosniak classification of renal cysts using large language models: a comparative study.

Hacibey I, Kaba E

•papers•Aug 24 2025

The Bosniak classification system is widely used to assess malignancy risk in renal cystic lesions, yet inter-observer variability poses significant challenges. Large language models (LLMs) may offer a standardized approach to classification when provided with textual descriptions, such as those found in radiology reports. This study evaluated the performance of five LLMs, namely GPT‑4 (ChatGPT), Gemini, Copilot, Perplexity, and NotebookLM, in classifying renal cysts based on synthetic textual descriptions mimicking CT report content. A synthetic dataset of 100 diagnostic scenarios (20 cases per Bosniak category) was constructed using established radiological criteria. Each LLM was evaluated using zero-shot and few-shot prompting strategies, while NotebookLM employed retrieval-augmented generation (RAG). Performance metrics included accuracy, sensitivity, and specificity. Statistical significance was assessed using McNemar's and chi-squared tests. GPT‑4 achieved the highest accuracy (87% zero-shot, 99% few-shot), followed by Copilot (81-86%), Gemini (55-69%), and Perplexity (43-69%). NotebookLM, tested only under RAG conditions, reached 87% accuracy. Few-shot learning significantly improved performance (p < 0.05). Classification of Bosniak IIF lesions remained challenging across models. When provided with well-structured textual descriptions, LLMs can accurately classify renal cysts. Few-shot prompting significantly enhances performance. However, persistent difficulties in classifying borderline lesions such as Bosniak IIF highlight the need for further refinement and real-world validation.

CT Classification Abdominal Methodology In Silico GenAI
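The Bosniak study above reports accuracy, sensitivity, and specificity for a five-class problem. The abstract does not say how the per-class figures were derived, so the sketch below assumes a one-vs-rest treatment, in which one category is taken as positive and all others as negative. The function names and the toy labels are hypothetical and do not come from the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted category matches the reference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Sensitivity and specificity for one category against all the others."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels: one Bosniak IIF case is misread as II.
y_true = ["I", "II", "IIF", "III", "IV", "IIF"]
y_pred = ["I", "II", "II", "III", "IV", "IIF"]
print(accuracy(y_true, y_pred))                    # 5 of 6 correct
print(one_vs_rest_metrics(y_true, y_pred, "IIF"))  # (0.5, 1.0)
```

In this toy case the missed IIF lesion lowers IIF sensitivity without affecting IIF specificity, which is how a per-category breakdown can expose a weak spot that overall accuracy hides.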

OmniMRI: A Unified Vision-Language Foundation Model for Generalist MRI Interpretation

Xingxin He, Aurora Rofena, Ruimin Feng, Haozhe Liao, Zhaoye Zhou, Albert Jang, Fang Liu

•preprint•Aug 24 2025

Magnetic Resonance Imaging (MRI) is indispensable in clinical practice but remains constrained by fragmented, multi-stage workflows encompassing acquisition, reconstruction, segmentation, detection, diagnosis, and reporting. While deep learning has achieved progress in individual tasks, existing approaches are often anatomy- or application-specific and lack generalizability across diverse clinical settings. Moreover, current pipelines rarely integrate imaging data with complementary language information that radiologists rely on in routine practice. Here, we introduce OmniMRI, a unified vision-language foundation model designed to generalize across the entire MRI workflow. OmniMRI is trained on a large-scale, heterogeneous corpus curated from 60 public datasets, over 220,000 MRI volumes and 19 million MRI slices, incorporating image-only data, paired vision-text data, and instruction-response data. Its multi-stage training paradigm, comprising self-supervised vision pretraining, vision-language alignment, multimodal pretraining, and multi-task instruction tuning, progressively equips the model with transferable visual representations, cross-modal reasoning, and robust instruction-following capabilities. Qualitative results demonstrate OmniMRI's ability to perform diverse tasks within a single architecture, including MRI reconstruction, anatomical and pathological segmentation, abnormality detection, diagnostic suggestion, and radiology report generation. These findings highlight OmniMRI's potential to consolidate fragmented pipelines into a scalable, generalist framework, paving the way toward foundation models that unify imaging and clinical language for comprehensive, end-to-end MRI interpretation.

MRI LLM Radiology Report Whole Body Methodology In Silico Academic Lab Breakthrough Open Dataset GenAI

Spectral computed tomography thermometry for thermal ablation: applicability and needle artifact reduction.

Koetzier LR, Hendriks P, Heemskerk JWT, van der Werf NR, Selles M, van der Molen AJ, Smits MLJ, Goorden MC, Burgmans MC

•papers•Aug 23 2025

Effective thermal ablation of liver tumors requires precise monitoring of the ablation zone. Computed tomography (CT) thermometry can non-invasively monitor lethal temperatures but suffers from metal artifacts caused by ablation equipment. This study assesses spectral CT thermometry's applicability during microwave ablation, comparing the reproducibility, precision, and accuracy of attenuation-based versus physical density-based thermometry. Furthermore, it identifies optimal metal artifact reduction (MAR) methods: O-MAR, deep learning-MAR, spectral CT, and combinations thereof. Four gel phantoms embedded with temperature sensors underwent a 10-minute, 60 W microwave ablation imaged by a dual-layer spectral CT scanner in 23 scans over time. For each scan, attenuation-based and physical density-based temperature maps were reconstructed. Attenuation-based and physical density-based thermometry models were tested for reproducibility over three repetitions; a fourth repetition focused on accuracy. MAR techniques were applied to one repetition to evaluate temperature precision in artifact-corrupted slices. The correlation between CT value and temperature was highly linear with an R-squared value exceeding 96 %. Model parameters for attenuation-based and physical density-based thermometry were -0.38 HU/°C and 0.00039 °C⁻¹, with coefficients of variation of 2.3 % and 6.7 %, respectively. Physical density maps improved temperature precision in the presence of needle artifacts by 73 % compared to attenuation images. O-MAR improved temperature precision by 49 % compared to no MAR. Attenuation-based thermometry yielded narrower Bland-Altman limits-of-agreement (-7.7 °C to 5.3 °C) than physical density-based thermometry. Spectral physical density-based CT thermometry at 150 keV, utilized alongside O-MAR, enhances temperature precision in the presence of metal artifacts and achieves reproducible temperature measurements with high accuracy.

CT Reconstruction Abdominal Methodology Phantom/Animal Academic Lab Reproducibility GenAI
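The thermometry entry above reports a highly linear relation between CT value and temperature, with an attenuation-based model parameter of -0.38 HU/°C. The sketch below shows how such a slope could convert a change in CT number into a temperature estimate. It assumes a simple linear model relative to a baseline scan. The function name and the baseline values are hypothetical, and the paper's actual calibration procedure may differ.

```python
def temperature_from_hu(hu, hu_baseline, temp_baseline_c, slope_hu_per_c=-0.38):
    """Estimate temperature (°C) from a CT number under an assumed linear model.

    Assumption: hu - hu_baseline = slope * (T - temp_baseline_c), where the
    default slope is the attenuation-based value quoted in the abstract.
    """
    if slope_hu_per_c == 0:
        raise ValueError("slope must be non-zero")
    return temp_baseline_c + (hu - hu_baseline) / slope_hu_per_c


# Hypothetical baseline: 50 HU at 37 °C. A drop of 19 HU then maps to about +50 °C.
estimate = temperature_from_hu(hu=31.0, hu_baseline=50.0, temp_baseline_c=37.0)
print(round(estimate, 1))  # 87.0
```

Because the slope is negative, attenuation falls as the material heats, so a lower CT number than baseline yields a higher temperature estimate.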
