
Insights into a radiology-specialised multimodal large language model with sparse autoencoders

Kenza Bouzid, Shruthi Bannur, Felix Meissen, Daniel Coelho de Castro, Anton Schwaighofer, Javier Alvarez-Valle, Stephanie L. Hyland

arXiv preprint · Jul 17, 2025
Interpretability can improve the safety, transparency and trust of AI models, which is especially important in healthcare applications where decisions often carry significant consequences. Mechanistic interpretability, particularly through the use of sparse autoencoders (SAEs), offers a promising approach for uncovering human-interpretable features within large transformer-based models. In this study, we apply Matryoshka-SAE to the radiology-specialised multimodal large language model, MAIRA-2, to interpret its internal representations. Using large-scale automated interpretability of the SAE features, we identify a range of clinically relevant concepts - including medical devices (e.g., line and tube placements, pacemaker presence), pathologies such as pleural effusion and cardiomegaly, longitudinal changes and textual features. We further examine the influence of these features on model behaviour through steering, demonstrating directional control over generations with mixed success. Our results reveal practical and methodological challenges, yet they offer initial insights into the internal concepts learned by MAIRA-2 - marking a step toward deeper mechanistic understanding and interpretability of a radiology-adapted multimodal large language model, and paving the way for improved model transparency. We release the trained SAEs and interpretations: https://huggingface.co/microsoft/maira-2-sae.
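
For readers unfamiliar with the mechanics, the sketch below shows, in minimal PyTorch, how a sparse autoencoder decomposes a transformer activation into sparse features and how one feature's decoder direction can be added back for steering. The dimensions, hook point, feature index, and steering strength are illustrative assumptions, not the released MAIRA-2 Matryoshka-SAE.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy SAE: overcomplete dictionary with ReLU-sparse codes."""
    def __init__(self, d_model: int = 4096, d_features: int = 32768):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activations: torch.Tensor):
        features = torch.relu(self.encoder(activations))   # sparse, non-negative codes
        reconstruction = self.decoder(features)
        return features, reconstruction

sae = SparseAutoencoder()
residual = torch.randn(1, 4096)                    # hidden state at an assumed hook point
features, recon = sae(residual)

# Steering: add the decoder direction of one (hypothetical) interpretable feature,
# e.g. a "pleural effusion" feature, scaled by a chosen strength alpha.
feature_idx, alpha = 1234, 5.0
steered = residual + alpha * sae.decoder.weight[:, feature_idx]
```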

AortaDiff: Volume-Guided Conditional Diffusion Models for Multi-Branch Aortic Surface Generation

Delin An, Pan Du, Jian-Xun Wang, Chaoli Wang

arXiv preprint · Jul 17, 2025
Accurate 3D aortic construction is crucial for clinical diagnosis, preoperative planning, and computational fluid dynamics (CFD) simulations, as it enables the estimation of critical hemodynamic parameters such as blood flow velocity, pressure distribution, and wall shear stress. Existing construction methods often rely on large annotated training datasets and extensive manual intervention. While the resulting meshes can serve for visualization purposes, they struggle to produce geometrically consistent, well-constructed surfaces suitable for downstream CFD analysis. To address these challenges, we introduce AortaDiff, a diffusion-based framework that generates smooth aortic surfaces directly from CT/MRI volumes. AortaDiff first employs a volume-guided conditional diffusion model (CDM) to iteratively generate aortic centerlines conditioned on volumetric medical images. Each centerline point is then automatically used as a prompt to extract the corresponding vessel contour, ensuring accurate boundary delineation. Finally, the extracted contours are fitted into a smooth 3D surface, yielding a continuous, CFD-compatible mesh representation. AortaDiff offers distinct advantages over existing methods, including an end-to-end workflow, minimal dependency on large labeled datasets, and the ability to generate CFD-compatible aorta meshes with high geometric fidelity. Experimental results demonstrate that AortaDiff performs effectively even with limited training data, successfully constructing both normal and pathologically altered aorta meshes, including cases with aneurysms or coarctation. This capability enables the generation of high-quality visualizations and positions AortaDiff as a practical solution for cardiovascular research.
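
As a rough illustration of the centerline-generation stage, the following sketch runs a standard DDPM-style reverse-diffusion loop over 3D points conditioned on a volume embedding. The toy noise predictor, schedule, and shapes are assumptions for illustration only, not the AortaDiff implementation.

```python
import torch
import torch.nn as nn

T = 100                                             # diffusion steps (illustrative)
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)

class EpsModel(nn.Module):
    """Toy noise predictor over 3D centerline points, conditioned on a volume embedding."""
    def __init__(self, d_point: int = 3, d_cond: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_point + d_cond + 1, 128),
                                 nn.ReLU(), nn.Linear(128, d_point))
    def forward(self, x, t, cond):
        t_emb = torch.full((x.shape[0], 1), float(t) / T)
        return self.net(torch.cat([x, cond, t_emb], dim=-1))

eps_model = EpsModel()
cond = torch.randn(16, 64)                          # embedding of the CT/MRI volume (assumed)
x = torch.randn(16, 3)                              # start from Gaussian noise over 3D points

with torch.no_grad():
    for t in reversed(range(T)):                    # iterative, volume-conditioned denoising
        eps = eps_model(x, t, cond)
        x = (x - betas[t] / torch.sqrt(1 - alphas_cumprod[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
```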

A Deep Learning-Based Ensemble System for Automated Shoulder Fracture Detection in Clinical Radiographs

Hemanth Kumar M, Karthika M, Saianiruth M, Vasanthakumar Venugopal, Anandakumar D, Revathi Ezhumalai, Charulatha K, Kishore Kumar J, Dayana G, Kalyan Sivasailam, Bargava Subramanian

arXiv preprint · Jul 17, 2025
Background: Shoulder fractures are often underdiagnosed, especially in emergency and high-volume clinical settings. Studies report up to 10% of such fractures may be missed by radiologists. AI-driven tools offer a scalable way to assist early detection and reduce diagnostic delays. We address this gap through a dedicated AI system for shoulder radiographs. Methods: We developed a multi-model deep learning system using 10,000 annotated shoulder X-rays. Architectures include Faster R-CNN (ResNet50-FPN, ResNeXt), EfficientDet, and RF-DETR. To enhance detection, we applied bounding box and classification-level ensemble techniques such as Soft-NMS, WBF, and NMW fusion. Results: The NMW ensemble achieved 95.5% accuracy and an F1-score of 0.9610, outperforming individual models across all key metrics. It demonstrated strong recall and localization precision, confirming its effectiveness for clinical fracture detection in shoulder X-rays. Conclusion: The results show ensemble-based AI can reliably detect shoulder fractures in radiographs with high clinical relevance. The model's accuracy and deployment readiness position it well for integration into real-time diagnostic workflows. The current model is limited to binary fracture detection, reflecting its design for rapid screening and triage support rather than detailed orthopedic classification.
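
Of the fusion techniques named (Soft-NMS, WBF, NMW), Soft-NMS is the simplest to illustrate: instead of discarding overlapping boxes, it decays their scores. A minimal NumPy sketch, with illustrative thresholds rather than the authors' settings:

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """boxes: (N, 4) as [x1, y1, x2, y2]; scores: (N,). Returns kept indices."""
    boxes, scores = boxes.copy(), scores.copy()
    order = list(range(len(scores)))
    keep = []
    while order:
        i = max(order, key=lambda k: scores[k])     # highest remaining score
        order.remove(i)
        if scores[i] < score_thresh:
            continue
        keep.append(i)
        for j in order:
            # IoU between box i and box j
            xx1, yy1 = max(boxes[i, 0], boxes[j, 0]), max(boxes[i, 1], boxes[j, 1])
            xx2, yy2 = min(boxes[i, 2], boxes[j, 2]), min(boxes[i, 3], boxes[j, 3])
            inter = max(0.0, xx2 - xx1) * max(0.0, yy2 - yy1)
            area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
            area_j = (boxes[j, 2] - boxes[j, 0]) * (boxes[j, 3] - boxes[j, 1])
            iou = inter / (area_i + area_j - inter + 1e-9)
            scores[j] *= np.exp(-(iou ** 2) / sigma)  # decay overlaps instead of discarding
    return keep

boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(soft_nms(boxes, scores))
```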

A conversational artificial intelligence based web application for medical conversations: a prototype for a chatbot

Pires, J. G.

medRxiv preprint · Jul 17, 2025
Background: Artificial Intelligence (AI) has evolved through various trends, with different subfields gaining prominence over time. Currently, Conversational Artificial Intelligence (CAI)--particularly Generative AI--is at the forefront. CAI models are primarily focused on text-based tasks and are commonly deployed as chatbots. Recent advancements by OpenAI have enabled the integration of external, independently developed models, allowing chatbots to perform specialized, task-oriented functions beyond general language processing. Objective: This study aims to develop a smart chatbot that integrates large language models (LLMs) from OpenAI with specialized domain-specific models, such as those used in medical image diagnostics. The system leverages transfer learning via Google's Teachable Machine to construct image-based classifiers and incorporates a diabetes detection model developed in TensorFlow.js. A key innovation is the chatbot's ability to extract relevant parameters from user input, trigger the appropriate diagnostic model, interpret the output, and deliver responses in natural language. The overarching goal is to demonstrate the potential of combining LLMs with external models to build multimodal, task-oriented conversational agents. Methods: Two image-based models were developed and integrated into the chatbot system. The first analyzes chest X-rays to detect viral and bacterial pneumonia. The second uses optical coherence tomography (OCT) images to identify ocular conditions such as drusen, choroidal neovascularization (CNV), and diabetic macular edema (DME). Both models were incorporated into the chatbot to enable image-based medical query handling. In addition, a text-based model was constructed to process physiological measurements for diabetes prediction using TensorFlow.js. The architecture is modular: new diagnostic models can be added without redesigning the chatbot, enabling straightforward functional expansion. Results: The findings demonstrate effective integration between the chatbot and the diagnostic models, with only minor deviations from expected behavior. Additionally, a stub function was implemented within the chatbot to schedule medical appointments based on the severity of a patient's condition, and it was specifically tested with the OCT and X-ray models. Conclusions: This study demonstrates the feasibility of developing advanced AI systems--including image-based diagnostic models and chatbot integration--by leveraging Artificial Intelligence as a Service (AIaaS). It also underscores the potential of AI to enhance user experiences in bioinformatics, paving the way for more intuitive and accessible interfaces in the field. Looking ahead, the modular nature of the chatbot allows for the integration of additional diagnostic models as the system evolves.
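
The core pattern described, extracting parameters from free text and routing them to a task-specific model, can be sketched generically as below. The regex extraction and the toy risk function are stand-ins; the paper's system uses OpenAI function calling and TensorFlow.js models instead.

```python
import re

def extract_glucose_and_bmi(user_text: str) -> dict:
    """Very naive parameter extraction; a production system would let the LLM do this."""
    pairs = re.findall(r"(glucose|bmi)\s*[:=]?\s*([\d.]+)", user_text, re.I)
    return {k.lower(): float(v) for k, v in pairs}

def diabetes_model(glucose: float, bmi: float) -> float:
    """Stand-in for an external diabetes classifier; returns a pseudo-probability."""
    return min(1.0, max(0.0, 0.01 * (glucose - 100) + 0.02 * (bmi - 25)))

def chatbot_turn(user_text: str) -> str:
    params = extract_glucose_and_bmi(user_text)
    if {"glucose", "bmi"} <= params.keys():
        risk = diabetes_model(params["glucose"], params["bmi"])
        return f"Estimated diabetes risk: {risk:.0%}. Please consult a clinician."
    return "Could you provide your fasting glucose and BMI?"

print(chatbot_turn("My glucose: 140 and BMI: 31"))
```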

Predicting ADC map quality from T2-weighted MRI: A deep learning approach for early quality assessment to assist point-of-care.

Brender JR, Ota M, Nguyen N, Ford JW, Kishimoto S, Harmon SA, Wood BJ, Pinto PA, Krishna MC, Choyke PL, Turkbey B

PubMed paper · Jul 17, 2025
Poor-quality prostate MRI images compromise diagnostic accuracy, with diffusion-weighted imaging and the resulting apparent diffusion coefficient (ADC) maps being particularly vulnerable. These maps are critical for prostate cancer diagnosis, yet current methods relying on standardizing technical parameters fail to consistently ensure image quality. We propose a novel deep learning approach to predict low-quality ADC maps using T2-weighted (T2W) images, enabling real-time corrective interventions during imaging. A multi-site dataset of T2W images and ADC maps from 486 patients, spanning 62 external clinics and in-house imaging, was retrospectively analyzed. A neural network was trained to classify ADC map quality as "diagnostic" or "non-diagnostic" based solely on T2W images. Rectal cross-sectional area measurements were evaluated as an interpretable metric for susceptibility-induced distortions. Analysis revealed limited correlation between individual acquisition parameters and image quality, with horizontal phase encoding significant for T2 imaging (p < 0.001, AUC = 0.6735) and vertical resolution for ADC maps (p = 0.006, AUC = 0.6348). By contrast, the neural network achieved robust performance for ADC map quality prediction from T2 images, with 83% sensitivity and 90% negative predictive value in multicenter validation, comparable to single-site models using ADC maps directly. Remarkably, it generalized well to unseen in-house data (94 ± 2% accuracy). Rectal cross-sectional area correlated with ADC quality (AUC = 0.65), offering a simple, interpretable metric. The probability of low-quality, uninterpretable ADC maps can be inferred early in the imaging process by a neural network approach, allowing corrective action to be employed.
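
For reference, the reported operating-point metrics (sensitivity, negative predictive value) and AUC can be computed as in this small sketch with synthetic labels and scores; it is not the study's data or model.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                   # 1 = non-diagnostic ADC map
y_score = np.array([0.9, 0.2, 0.7, 0.6, 0.3, 0.1, 0.8, 0.4])  # classifier output
y_pred = (y_score >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
npv = tn / (tn + fn)
print(f"sensitivity={sensitivity:.2f}  NPV={npv:.2f}  AUC={roc_auc_score(y_true, y_score):.2f}")
```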

Patient-Specific and Interpretable Deep Brain Stimulation Optimisation Using MRI and Clinical Review Data

Mikroulis, A., Lasica, A., Filip, P., Bakstein, E., Novak, D.

medRxiv preprint · Jul 17, 2025
Background: Optimisation of Deep Brain Stimulation (DBS) settings is a key aspect in achieving clinical efficacy in movement disorders, such as Parkinson's disease. Modern techniques attempt to solve the problem through data-intensive statistical and machine learning approaches, adding significant overhead to the existing clinical workflows. Here, we present an optimisation approach for DBS electrode contact and current selection, grounded in routinely collected MRI data, well-established tools (Lead-DBS) and, optionally, clinical review records. Methods: The pipeline, packaged in a cross-platform tool, uses lead reconstruction data and simulation of the volume of tissue activated to estimate the contacts in optimal position relative to the target structure, and suggest the optimal stimulation current. The tool then allows further interactive user optimisation of the current settings. Existing electrode contact evaluations can be optionally included in the calculation process for further fine-tuning and adverse effect avoidance. Results: Based on a sample of 177 implanted electrode reconstructions from 89 Parkinson's disease patients, we demonstrate that DBS parameter setting by our algorithm is more effective in covering the target structure (Wilcoxon p<6e-12, Hedges g>0.34) and minimising electric field leakage to neighbouring regions (p<2e-15, g>0.84) compared to expert parameter settings. Conclusion: The proposed automated method for optimisation of DBS electrode contact and current selection shows promising results and is readily applicable to existing clinical workflows. We demonstrate that the algorithmically selected contacts perform better than manual selections according to electric field calculations, allowing for a comparable clinical outcome without the iterative optimisation procedure.
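
The statistical comparison reported (paired Wilcoxon test with a Hedges' g effect size) can be reproduced on synthetic data as follows; the coverage values are placeholders, not the study's measurements.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
expert = rng.normal(0.60, 0.10, 177)                  # target coverage, expert settings (synthetic)
algorithmic = expert + rng.normal(0.05, 0.08, 177)    # target coverage, algorithmic settings

stat, p = wilcoxon(algorithmic, expert)               # paired signed-rank test

diff = algorithmic - expert
n = diff.size
cohen_dz = diff.mean() / diff.std(ddof=1)
hedges_g = cohen_dz * (1 - 3 / (4 * (n - 1) - 1))     # small-sample correction, df = n - 1
print(f"Wilcoxon p={p:.2e}  Hedges g={hedges_g:.2f}")
```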

Large Language Model-Based Entity Extraction Reliably Classifies Pancreatic Cysts and Reveals Predictors of Malignancy: A Cross-Sectional and Retrospective Cohort Study

Papale, A. J., Flattau, R., Vithlani, N., Mahajan, D., Ziemba, Y., Zavadsky, T., Carvino, A., King, D., Nadella, S.

medRxiv preprint · Jul 17, 2025
Pancreatic cystic lesions (PCLs) are often discovered incidentally on imaging and may progress to pancreatic ductal adenocarcinoma (PDAC). PCLs have a high incidence in the general population, and adherence to screening guidelines can be variable. With the advent of technologies that enable automated text classification, we sought to evaluate various natural language processing (NLP) tools, including large language models (LLMs), for identifying and classifying PCLs from radiology reports. We correlated our classification of PCLs to clinical features to identify risk factors for a positive PDAC biopsy. We contrasted a previously described NLP classifier to LLMs for prospective identification of PCLs in radiology. We evaluated various LLMs for PCL classification into low-risk or high-risk categories based on published guidelines. We compared prompt-based PCL classification to specific entity-guided PCL classification. To this end, we developed tools to de-identify radiology reports and track patients longitudinally based on their radiology reports. Additionally, we used our newly developed tools to evaluate a retrospective database of patients who underwent pancreas biopsy to determine associated factors, including those in their radiology reports and clinical features, using multivariable logistic regression modelling. Of 14,574 prospective radiology reports, 665 (4.6%) described a pancreatic cyst, including 175 (1.2%) high-risk lesions. Our Entity-Extraction Large Language Model tool achieved recall 0.992 (95% confidence interval [CI], 0.985-0.998), precision 0.988 (0.979-0.996), and F1-score 0.990 (0.985-0.995) for detecting cysts; F1-scores were 0.993 (0.987-0.998) for low-risk and 0.977 (0.952-0.995) for high-risk classification. Among 4,285 biopsy patients, 330 had pancreatic cysts documented ≥6 months before biopsy. In the final multivariable model (AUC = 0.877), independent predictors of adenocarcinoma were change in duct caliber with upstream atrophy (adjusted odds ratio [AOR], 4.94; 95% CI, 1.30-18.79), mural nodules (AOR, 11.02; 1.81-67.26), older age (AOR, 1.10; 1.05-1.16), lower body mass index (AOR, 0.86; 0.76-0.96), and total bilirubin (AOR, 1.81; 1.18-2.77). Automated NLP-based analysis of radiology reports using LLM-driven entity extraction can accurately identify and risk-stratify PCLs and, when retrospectively applied, reveal factors predicting malignant progression. Widespread implementation may improve surveillance and enable earlier intervention.
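
A multivariable logistic regression yielding adjusted odds ratios, as used for the biopsy cohort, can be sketched with statsmodels as below; the synthetic data frame and predictor names are illustrative assumptions, not the study's dataset.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 330
df = pd.DataFrame({
    "duct_change_atrophy": rng.integers(0, 2, n),
    "mural_nodule": rng.integers(0, 2, n),
    "age": rng.normal(68, 10, n),
    "bmi": rng.normal(27, 4, n),
    "total_bilirubin": rng.gamma(2.0, 0.5, n),
})
# Synthetic outcome with a plausible dependence on the predictors
logits = -4 + 1.5 * df.duct_change_atrophy + 2.2 * df.mural_nodule + 0.05 * (df.age - 68)
df["adenocarcinoma"] = rng.binomial(1, 1 / (1 + np.exp(-logits)))

X = sm.add_constant(df.drop(columns="adenocarcinoma"))
model = sm.Logit(df["adenocarcinoma"], X).fit(disp=False)

aor = np.exp(model.params)                 # adjusted odds ratios
ci = np.exp(model.conf_int())              # 95% confidence intervals
print(pd.concat([aor.rename("AOR"), ci.rename(columns={0: "2.5%", 1: "97.5%"})], axis=1))
```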

Myocardial Native T1 Mapping in the German National Cohort (NAKO): Associations with Age, Sex, and Cardiometabolic Risk Factors

Ammann, C., Gröschel, J., Saad, H., Rospleszcz, S., Schuppert, C., Hadler, T., Hickstein, R., Niendorf, T., Nolde, J. M., Schulze, M. B., Greiser, K. H., Decker, J. A., Kröncke, T., Küstner, T., Nikolaou, K., Willich, S. N., Keil, T., Dörr, M., Bülow, R., Bamberg, F., Pischon, T., Schlett, C. L., Schulz-Menger, J.

medRxiv preprint · Jul 17, 2025
Background and Aims: In cardiovascular magnetic resonance (CMR), myocardial native T1 mapping enables quantitative, non-invasive tissue characterization and is sensitive to subclinical changes in myocardial structure and composition. We investigated how age, sex, and cardiometabolic risk factors are associated with myocardial T1 in a population-based analysis within the German National Cohort (NAKO). Methods: This cross-sectional study included 29,573 prospectively enrolled participants who underwent CMR-based midventricular T1 mapping at 3.0 T, alongside clinical phenotyping. After artificial intelligence-assisted myocardial segmentation, a subset of 9,162 outliers was subjected to manual quality control according to clinical evaluation standards. Associations with cardiometabolic risk factors, identified through self-reported medical history, clinical chemistry, and blood pressure measurements, were evaluated using adjusted linear regression models. Results: Women had higher T1 values than men, with sex differences progressively declining with age. T1 was significantly elevated in individuals with diabetes (β=3.91 ms; p<0.001), kidney disease (β=3.44 ms; p<0.001), and current smoking (β=6.67 ms; p<0.001). Conversely, hyperlipidaemia was significantly associated with lower T1 (β=-4.41 ms; p<0.001). Associations with hypertension showed a sex-specific pattern: T1 was lower in women but increased with hypertension severity in men. Conclusions: Myocardial native T1 varies by sex and age and shows associations with major cardiometabolic risk factors. Notably, lower T1 times in participants with hyperlipidaemia may indicate a direct effect of blood lipids on the heart. Our findings support the utility of T1 mapping as a sensitive marker of early myocardial changes and highlight the sex-specific interplay between cardiometabolic health and myocardial tissue composition. Graphical Abstract (figure): Key Question: How are age, sex, and cardiometabolic risk factors associated with myocardial native T1, a quantitative magnetic resonance imaging marker of myocardial tissue composition, in a large-scale population-based evaluation within the German National Cohort (NAKO)? Key Finding: T1 relaxation times were higher in women and gradually converged between sexes with age. Diabetes, kidney disease, smoking, and hypertension in men were associated with prolonged T1 times. Unexpectedly, hyperlipidaemia and hypertension in women showed a negative association with T1. Take-Home Message: Native T1 mapping is sensitive to subclinical myocardial changes and reflects a close interplay between metabolic and myocardial health. It reveals marked age-dependent sex differences and sex-specific responses in myocardial tissue composition to cardiometabolic risk factors.
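
An adjusted linear regression of native T1 on a risk factor, controlling for age and sex, might look like the following sketch; the formula, coefficients, and synthetic data are assumptions, not the NAKO analysis.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "age": rng.normal(52, 12, n),
    "female": rng.integers(0, 2, n),
    "diabetes": rng.binomial(1, 0.08, n),
})
# Synthetic T1 values with plausible age, sex, and diabetes effects
df["t1_ms"] = 1180 + 8 * df.female - 0.2 * (df.age - 52) + 4 * df.diabetes + rng.normal(0, 25, n)

# T1 ~ risk factor, adjusted for age and sex
model = smf.ols("t1_ms ~ diabetes + age + female", data=df).fit()
print(model.params["diabetes"], model.pvalues["diabetes"])    # beta (ms) and p-value for diabetes
```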

Cardiac Function Assessment with Deep-Learning-Based Automatic Segmentation of Free-Running 4D Whole-Heart CMR

Ogier, A. C., Baup, S., Ilanjian, G., Touray, A., Rocca, A., Banus Cobo, J., Monton Quesada, I., Nicoletti, M., Ledoux, J.-B., Richiardi, J., Holtackers, R. J., Yerly, J., Stuber, M., Hullin, R., Rotzinger, D., van Heeswijk, R. B.

medRxiv preprint · Jul 17, 2025
Background: Free-running (FR) cardiac MRI enables free-breathing, ECG-free, fully dynamic 5D (3D spatial + cardiac + respiration dimensions) imaging but poses significant challenges for clinical integration due to the volume and complexity of image analysis. Existing segmentation methods are tailored to 2D cine or static 3D acquisitions and cannot leverage the unique spatial-temporal wealth of FR data. Purpose: To develop and validate a deep learning (DL)-based segmentation framework for isotropic 3D+cardiac cycle FR cardiac MRI that enables accurate, fast, and clinically meaningful anatomical and functional analysis. Methods: Free-running, contrast-free bSSFP acquisitions at 1.5T and contrast-enhanced GRE acquisitions at 3T were used to reconstruct motion-resolved 5D datasets. From these, the end-expiratory respiratory phase was retained to yield fully isotropic 4D datasets. Automatic propagation of a limited set of manual segmentations was used to segment the left and right ventricular blood pool (LVB, RVB) and left ventricular myocardium (LVM) on reformatted short-axis (SAX) end-systolic (ES) and end-diastolic (ED) images. These were used to train a 3D nnU-Net model. Validation was performed using geometric metrics (Dice similarity coefficient [DSC], relative volume difference [RVD]), clinical metrics (ED and ES volumes, ejection fraction [EF]), and physiological consistency metrics (systole-diastole LVM volume mismatch and LV-RV stroke volume agreement). To assess the robustness and flexibility of the approach, we evaluated multiple additional DL training configurations, such as using 4D propagation-based data augmentation to incorporate all cardiac phases into training. Results: The main proposed method achieved automatic segmentation within a minute, delivering high geometric accuracy and consistency (DSC: 0.94 ± 0.01 [LVB], 0.86 ± 0.02 [LVM], 0.92 ± 0.01 [RVB]; RVD: 2.7%, 5.8%, 4.5%). Clinical LV metrics showed excellent agreement (ICC > 0.98 for EDV/ESV/EF, bias < 2 mL for EDV/ESV, < 1% for EF), while RV metrics remained clinically reliable (ICC > 0.93 for EDV/ESV/EF, bias < 1 mL for EDV/ESV, < 1% for EF) but exhibited wider limits of agreement. Training on all cardiac phases improved temporal coherence, reducing LVM volume mismatch from 4.0% to 2.6%. Conclusion: This study validates a DL-based method for fast and accurate segmentation of whole-heart free-running 4D cardiac MRI. Robust performance across diverse protocols and evaluation with complementary metrics that match state-of-the-art benchmarks supports its integration into clinical and research workflows, helping to overcome a key barrier to the broader adoption of free-running imaging.
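
The geometric metrics used for validation (DSC and RVD) are straightforward to compute from 3D masks, as in this sketch with synthetic masks standing in for the study's segmentations:

```python
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + 1e-9)

def relative_volume_difference(pred: np.ndarray, gt: np.ndarray) -> float:
    """Absolute relative volume difference, as a fraction of the reference volume."""
    return abs(pred.sum() - gt.sum()) / (gt.sum() + 1e-9)

rng = np.random.default_rng(0)
gt = rng.random((64, 64, 64)) > 0.7            # ground-truth mask (e.g. LV blood pool)
pred = gt.copy()
pred[:2] = False                               # simulate a small segmentation error

print(f"DSC={dice(pred, gt):.3f}  RVD={relative_volume_difference(pred, gt) * 100:.1f}%")
```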

Cross-Modal conditional latent diffusion model for Brain MRI to Ultrasound image translation.

Jiang S, Wang L, Li Y, Yang Z, Zhou Z, Li B

PubMed paper · Jul 16, 2025
Intraoperative brain ultrasound (US) provides real-time information on lesions and tissues, making it crucial for brain tumor resection. However, due to limitations such as imaging angles and operator techniques, US data is limited in size and difficult to annotate, hindering advancements in intelligent image processing. In contrast, Magnetic Resonance Imaging (MRI) data is more abundant and easier to annotate. If MRI data and models can be effectively transferred to the US domain, generating high-quality US data would greatly enhance US image processing and improve intraoperative US readability. Approach: We propose a Cross-Modal Conditional Latent Diffusion Model (CCLD) for brain MRI-to-US image translation. We employ a noise mask restoration strategy to pretrain an efficient encoder-decoder, enhancing feature extraction, compression, and reconstruction capabilities while reducing computational costs. Furthermore, CCLD integrates the Frequency-Decomposed Feature Optimization Module (FFOM) and the Adaptive Multi-Frequency Feature Fusion Module (AMFM) to effectively leverage MRI structural information and US texture characteristics, ensuring structural accuracy while enhancing texture details in the synthetic US images. Main results: Compared with state-of-the-art methods, our approach achieves superior performance on the ReMIND dataset, obtaining the best Learned Perceptual Image Patch Similarity (LPIPS) score of 19.1%, Mean Absolute Error (MAE) of 4.21%, as well as the highest Peak Signal-to-Noise Ratio (PSNR) of 25.36 dB and Structural Similarity Index (SSIM) of 86.91%. Significance: Experimental results demonstrate that CCLD effectively improves the quality and realism of synthetic ultrasound images, offering a new research direction for the generation of high-quality US datasets and the enhancement of ultrasound image readability.
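
The fidelity metrics cited (PSNR, SSIM) can be computed with scikit-image as in the sketch below; the random images stand in for a synthetic ultrasound and its reference.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((256, 256)).astype(np.float32)           # real US image (placeholder)
synthetic = np.clip(reference + 0.05 * rng.standard_normal((256, 256)).astype(np.float32), 0, 1)

psnr = peak_signal_noise_ratio(reference, synthetic, data_range=1.0)
ssim = structural_similarity(reference, synthetic, data_range=1.0)
print(f"PSNR={psnr:.2f} dB  SSIM={ssim:.3f}")
```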