Latest Papers on Radiology AI. Tags: Mixed Modality

Early Vascular Aging Determined by 3-Dimensional Aortic Geometry: Genetic Determinants and Clinical Consequences.

Beeche C, Zhao B, Tavolinejad H, Pourmussa B, Kim J, Duda J, Gee J, Witschey WR, Chirinos JA

•papers•Jul 17 2025

Vascular aging is an important phenotype characterized by structural and geometric remodeling. Some individuals exhibit supernormal vascular aging, associated with improved cardiovascular outcomes; others experience early vascular aging, linked to adverse cardiovascular outcomes. The aorta is the artery that exhibits the most prominent age-related changes; however, the biological mechanisms underlying aortic aging, its genetic architecture, and its relationship with cardiovascular structure, function, and disease states remain poorly understood. We developed sex-specific models to quantify aortic age on the basis of aortic geometric phenotypes derived from 3-dimensional tomographic imaging data in 2 large biobanks: the UK Biobank and the Penn Medicine BioBank. Convolutional neural network-assisted 3-dimensional segmentation of the aorta was performed in 56 104 magnetic resonance imaging scans in the UK Biobank and 6757 computed tomography scans in the Penn Medicine BioBank. Aortic vascular age index (AVAI) was calculated as the difference between the vascular age predicted from geometric phenotypes and the chronological age, expressed as a percent of chronological age. We assessed associations with cardiovascular structure and function using multivariate linear regression and examined the genetic architecture of AVAI through genome-wide association studies, followed by Mendelian randomization to assess causal associations. We also constructed a polygenic risk score for AVAI. AVAI displayed numerous associations with cardiac structure and function, including increased left ventricular mass (standardized β=0.144 [95% CI, 0.138, 0.149]; P<0.0001), wall thickness (standardized β=0.061 [95% CI, 0.054, 0.068]; P<0.0001), and left atrial volume maximum (standardized β=0.060 [95% CI, 0.050, 0.069]; P<0.0001). AVAI exhibited high genetic heritability (h²=40.24%). We identified 54 independent genetic loci (P<5×10⁻⁸) associated with AVAI, which further exhibited gene-level associations with the fibrillin-1 (FBN1) and elastin (ELN1) genes. Mendelian randomization supported causal associations between AVAI and atrial fibrillation, vascular dementia, aortic aneurysm, and aortic dissection. A polygenic risk score for AVAI was associated with an increased prevalence of atrial fibrillation, hypertension, aortic aneurysm, and aortic dissection. Early aortic aging is significantly associated with adverse cardiac remodeling and important cardiovascular disease states. AVAI exhibits a polygenic, highly heritable genetic architecture. Mendelian randomization analyses support a causal association between AVAI and cardiovascular diseases, including atrial fibrillation, vascular dementia, aortic aneurysms, and aortic dissection.
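The AVAI definition in the abstract reduces to a short formula. The following is a minimal sketch, assuming the predicted vascular age has already been produced by a fitted model; the function name `aortic_vascular_age_index` and the sample ages are illustrative and not taken from the paper.

```python
def aortic_vascular_age_index(predicted_age: float, chronological_age: float) -> float:
    """Return (predicted - chronological) age as a percent of chronological age."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100.0 * (predicted_age - chronological_age) / chronological_age


# Hypothetical values, for illustration only.
older_than_expected = aortic_vascular_age_index(predicted_age=66.0, chronological_age=60.0)
younger_than_expected = aortic_vascular_age_index(predicted_age=54.0, chronological_age=60.0)
print(older_than_expected, younger_than_expected)
```

Under this sign convention, a positive value means the aorta looks older than the person's chronological age, and a negative value means it looks younger.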

Mixed Modality Segmentation Vascular Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Cross-Modal conditional latent diffusion model for Brain MRI to Ultrasound image translation.

Jiang S, Wang L, Li Y, Yang Z, Zhou Z, Li B

•papers•Jul 16 2025

Intraoperative brain ultrasound (US) provides real-time information on lesions and tissues, making it crucial for brain tumor resection. However, due to limitations such as imaging angles and operator techniques, US data is limited in size and difficult to annotate, hindering advancements in intelligent image processing. In contrast, Magnetic Resonance Imaging (MRI) data is more abundant and easier to annotate. If MRI data and models can be effectively transferred to the US domain, generating high-quality US data would greatly enhance US image processing and improve intraoperative US readability. Approach. We propose a Cross-Modal Conditional Latent Diffusion Model (CCLD) for brain MRI-to-US image translation. We employ a noise mask restoration strategy to pretrain an efficient encoder-decoder, enhancing feature extraction, compression, and reconstruction capabilities while reducing computational costs. Furthermore, CCLD integrates the Frequency-Decomposed Feature Optimization Module (FFOM) and the Adaptive Multi-Frequency Feature Fusion Module (AMFM) to effectively leverage MRI structural information and US texture characteristics, ensuring structural accuracy while enhancing texture details in the synthetic US images. Main results. Compared with state-of-the-art methods, our approach achieves superior performance on the ReMIND dataset, obtaining the best Learned Perceptual Image Patch Similarity (LPIPS) score of 19.1%, Mean Absolute Error (MAE) of 4.21%, as well as the highest Peak Signal-to-Noise Ratio (PSNR) of 25.36 dB and Structural Similarity Index (SSIM) of 86.91%. Significance. Experimental results demonstrate that CCLD effectively improves the quality and realism of synthetic ultrasound images, offering a new research direction for the generation of high-quality US datasets and the enhancement of ultrasound image readability.
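Two of the reported metrics, MAE and PSNR, have simple closed forms. The sketch below uses their standard definitions on flattened images and assumes intensities scaled to [0, 1]; the function names and the toy pixel values are illustrative, and this is not the authors' evaluation code. SSIM and LPIPS are left out because they need windowed statistics and a pretrained network, respectively.

```python
import math
from typing import Sequence


def mae(reference: Sequence[float], synthetic: Sequence[float]) -> float:
    """Mean absolute error between two equally sized, flattened images."""
    return sum(abs(r - s) for r, s in zip(reference, synthetic)) / len(reference)


def psnr(reference: Sequence[float], synthetic: Sequence[float],
         data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    mse = sum((r - s) ** 2 for r, s in zip(reference, synthetic)) / len(reference)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy 4x4 images, flattened: a black reference and a uniformly offset copy.
reference_pixels = [0.0] * 16
synthetic_pixels = [0.1] * 16
toy_mae = mae(reference_pixels, synthetic_pixels)
toy_psnr = psnr(reference_pixels, synthetic_pixels)
print(toy_mae, toy_psnr)
```

A uniform offset of 0.1 gives an MAE near 0.1 and a PSNR near 20 dB, which gives a sense of scale when reading the percentages and decibel values quoted in the abstract.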

Mixed Modality Image Synthesis Neurological Methodology In Silico GenAI Open Dataset

Multimodal Large Language Model With Knowledge Retrieval Using Flowchart Embedding for Forming Follow-Up Recommendations for Pancreatic Cystic Lesions.

Zhu Z, Liu J, Hong CW, Houshmand S, Wang K, Yang Y

•papers•Jul 16 2025

BACKGROUND. The American College of Radiology (ACR) Incidental Findings Committee (IFC) algorithm provides guidance for pancreatic cystic lesion (PCL) management. Its implementation using plain-text large language model (LLM) solutions is challenging given that key components include multimodal data (e.g., figures and tables). OBJECTIVE. The purpose of the study is to evaluate a multimodal LLM approach incorporating knowledge retrieval using flowchart embedding for forming follow-up recommendations for PCL management. METHODS. This retrospective study included patients who underwent abdominal CT or MRI from September 1, 2023, to September 1, 2024, and whose report mentioned a PCL. The reports' Findings sections were inputted to a multimodal LLM (GPT-4o). For task 1 (198 patients: mean age, 69.0 ± 13.0 [SD] years; 110 women, 88 men), the LLM assessed PCL features (presence of PCL, PCL size and location, presence of main pancreatic duct communication, presence of worrisome features or high-risk stigmata) and formed a follow-up recommendation using three knowledge retrieval methods (default knowledge, plain-text retrieval-augmented generation [RAG] from the ACR IFC algorithm PDF document, and flowchart embedding using the LLM's image-to-text conversion for in-context integration of the document's flowcharts and tables). For task 2 (85 patients: mean initial age, 69.2 ± 10.8 years; 48 women, 37 men), an additional relevant prior report was inputted; the LLM assessed for interval PCL change and provided an adjusted follow-up schedule accounting for prior imaging using flowchart embedding. Three radiologists assessed LLM accuracy in task 1 for PCL findings in consensus and follow-up recommendations independently; one radiologist assessed accuracy in task 2. RESULTS. For task 1, the LLM with flowchart embedding had accuracy for PCL features of 98.0-99.0%. The accuracy of the LLM follow-up recommendations based on default knowledge, plain-text RAG, and flowchart embedding for radiologist 1 was 42.4%, 23.7%, and 89.9% (p < .001), respectively; radiologist 2 was 39.9%, 24.2%, and 91.9% (p < .001); and radiologist 3 was 40.9%, 25.3%, and 91.9% (p < .001). For task 2, the LLM using flowchart embedding showed an accuracy for interval PCL change of 96.5% and for adjusted follow-up schedules of 81.2%. CONCLUSION. Multimodal flowchart embedding aided the LLM's automated provision of follow-up recommendations adherent to a clinical guidance document. CLINICAL IMPACT. The framework could be extended to other incidental findings through the use of other clinical guidance documents as the model input.

Mixed Modality LLM Radiology Report Abdominal Retrospective Clinical In Silico Academic Lab GenAI

Comparative Analysis of CNN Performance in Keras, PyTorch and JAX on PathMNIST

Anida Nezović, Jalal Romano, Nada Marić, Medina Kapo, Amila Akagić

•preprint•Jul 16 2025

Deep learning has significantly advanced the field of medical image classification, particularly with the adoption of Convolutional Neural Networks (CNNs). Various deep learning frameworks such as Keras, PyTorch and JAX offer unique advantages in model development and deployment. However, their comparative performance in medical imaging tasks remains underexplored. This study presents a comprehensive analysis of CNN implementations across these frameworks, using the PathMNIST dataset as a benchmark. We evaluate training efficiency, classification accuracy and inference speed to assess their suitability for real-world applications. Our findings highlight the trade-offs between computational speed and model accuracy, offering valuable insights for researchers and practitioners in medical image analysis.

Mixed Modality Classification Methodology In Silico Benchmark SOTA

CytoSAE: Interpretable Cell Embeddings for Hematology

Muhammed Furkan Dasdelen, Hyesu Lim, Michele Buck, Katharina S. Götze, Carsten Marr, Steffen Schneider

•preprint•Jul 16 2025

Sparse autoencoders (SAEs) emerged as a promising tool for mechanistic interpretability of transformer-based foundation models. Very recently, SAEs were also adopted for the visual domain, enabling the discovery of visual concepts and their patch-wise attribution to tokens in the transformer model. While a growing number of foundation models emerged for medical imaging, tools for explaining their inferences are still lacking. In this work, we show the applicability of SAEs for hematology. We propose CytoSAE, a sparse autoencoder which is trained on over 40,000 peripheral blood single-cell images. CytoSAE generalizes to diverse and out-of-domain datasets, including bone marrow cytology, where it identifies morphologically relevant concepts which we validated with medical experts. Furthermore, we demonstrate scenarios in which CytoSAE can generate patient-specific and disease-specific concepts, enabling the detection of pathognomonic cells and localized cellular abnormalities at the patch level. We quantified the effect of concepts on a patient-level AML subtype classification task and show that CytoSAE concepts reach performance comparable to the state-of-the-art, while offering explainability on the sub-cellular level. Source code and model weights are available at https://github.com/dynamical-inference/cytosae.
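For readers unfamiliar with sparse autoencoders, the sketch below shows one common generic formulation: a ReLU encoder over an overcomplete dictionary, a linear decoder, and a reconstruction loss with an L1 sparsity penalty. It is an assumption-laden illustration in plain Python, not the CytoSAE architecture or training code; the weights, the `l1_weight` value, and all function names are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply a row-major matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def sae_forward(x: Vector, w_enc: Matrix, b_enc: Vector,
                w_dec: Matrix, b_dec: Vector) -> Tuple[Vector, Vector]:
    """Encode x into non-negative sparse codes, then decode back."""
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_activation = matvec(w_enc, centered)
    codes = [max(0.0, p + b) for p, b in zip(pre_activation, b_enc)]  # ReLU
    reconstruction = [r + b for r, b in zip(matvec(w_dec, codes), b_dec)]
    return codes, reconstruction


def sae_loss(x: Vector, codes: Vector, reconstruction: Vector,
             l1_weight: float = 1e-3) -> float:
    """Mean squared reconstruction error plus an L1 penalty on the codes."""
    mse = sum((a - b) ** 2 for a, b in zip(x, reconstruction)) / len(x)
    return mse + l1_weight * sum(abs(c) for c in codes)


# Hypothetical 2-D embedding and a 4-unit dictionary (signed unit directions).
x = [1.0, -2.0]
w_enc = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
w_dec = [[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]
codes, reconstruction = sae_forward(x, w_enc, [0.0] * 4, w_dec, [0.0] * 2)
loss = sae_loss(x, codes, reconstruction)
print(codes, reconstruction, loss)
```

Only two of the four code units are active for this input. That sparsity is what makes individual units candidates for human-readable concepts.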

Mixed Modality Classification Methodology In Silico Academic Lab Open Code

COLI: A Hierarchical Efficient Compressor for Large Images

Haoran Wang, Hanyu Pei, Yang Lyu, Kai Zhang, Li Li, Feng-Lei Fan

•preprint•Jul 15 2025

The escalating adoption of high-resolution, large-field-of-view imagery amplifies the need for efficient compression methodologies. Conventional techniques frequently fail to preserve critical image details, while data-driven approaches exhibit limited generalizability. Implicit Neural Representations (INRs) present a promising alternative by learning continuous mappings from spatial coordinates to pixel intensities for individual images, thereby storing network weights rather than raw pixels and avoiding the generalization problem. However, INR-based compression of large images faces challenges including slow compression speed and suboptimal compression ratios. To address these limitations, we introduce COLI (Compressor for Large Images), a novel framework leveraging Neural Representations for Videos (NeRV). First, recognizing that INR-based compression constitutes a training process, we accelerate its convergence through a pretraining-finetuning paradigm, mixed-precision training, and reformulation of the sequential loss into a parallelizable objective. Second, capitalizing on INRs' transformation of image storage constraints into weight storage, we implement Hyper-Compression, a novel post-training technique to substantially enhance compression ratios while maintaining minimal output distortion. Evaluations across two medical imaging datasets demonstrate that COLI consistently achieves competitive or superior PSNR and SSIM metrics at significantly reduced bits per pixel (bpp), while accelerating NeRV training by up to 4 times.
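Because an INR stores network weights instead of pixels, its bits per pixel follow directly from the parameter count. The sketch below works through that arithmetic under stated assumptions: the parameter count, weight precision, and image size are hypothetical, and the further savings from the paper's Hyper-Compression step are not modeled.

```python
def bits_per_pixel(num_params: int, bits_per_param: int, height: int, width: int) -> float:
    """Bits per pixel when an image is stored as the weights of a network."""
    return num_params * bits_per_param / (height * width)


def compression_ratio(raw_bits_per_pixel: float, coded_bits_per_pixel: float) -> float:
    """Ratio of raw storage cost to coded storage cost."""
    return raw_bits_per_pixel / coded_bits_per_pixel


# Hypothetical setup: 500,000 half-precision weights for a 4096 x 4096, 8-bit image.
bpp = bits_per_pixel(num_params=500_000, bits_per_param=16, height=4096, width=4096)
ratio = compression_ratio(raw_bits_per_pixel=8.0, coded_bits_per_pixel=bpp)
print(bpp, ratio)
```

In this toy setting the network costs under half a bit per pixel. Shrinking or quantizing the weights lowers bpp further, which is the lever a post-training weight compressor acts on.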

Mixed Modality Reconstruction Methodology In Silico Academic Lab

A diffusion model for universal medical image enhancement.

Fei B, Li Y, Yang W, Gao H, Xu J, Ma L, Yang Y, Zhou P

•papers•Jul 15 2025

The development of medical imaging techniques has made a significant contribution to clinical decision-making. However, the existence of suboptimal imaging quality, as indicated by irregular illumination or imbalanced intensity, presents significant obstacles in automating disease screening, analysis, and diagnosis. Existing approaches for natural image enhancement are mostly trained with numerous paired images, presenting challenges in data collection and training costs, all while lacking the ability to generalize effectively. Here, we introduce a pioneering training-free Diffusion Model for Universal Medical Image Enhancement, named UniMIE. UniMIE demonstrates its unsupervised enhancement capabilities across various medical image modalities without the need for any fine-tuning. It accomplishes this by relying solely on a single pre-trained model from ImageNet. We conduct a comprehensive evaluation on 13 imaging modalities and over 15 medical types, demonstrating better qualities, robustness, and accuracy than other modality-specific and data-inefficient models. By delivering high-quality enhancement and corresponding accuracy on a wide range of downstream tasks, UniMIE exhibits considerable potential to accelerate the advancement of diagnostic tools and customized treatment plans. UniMIE represents a transformative approach to medical image enhancement, offering a versatile and robust solution that adapts to diverse imaging conditions. By improving image quality and facilitating better downstream analyses, UniMIE has the potential to revolutionize clinical workflows and enhance diagnostic accuracy across a wide range of medical applications.

Mixed Modality Image Synthesis Methodology In Silico Academic Lab Breakthrough

Learning homeomorphic image registration via conformal-invariant hyperelastic regularisation.

Zou J, Debroux N, Liu L, Qin J, Schönlieb CB, Aviles-Rivero AI

•papers•Jul 15 2025

Deformable image registration is a fundamental task in medical image analysis and plays a crucial role in a wide range of clinical applications. Recently, deep learning-based approaches have been widely studied for deformable medical image registration and achieved promising results. However, existing deep learning image registration techniques do not theoretically guarantee topology-preserving transformations. This is a key property to preserve anatomical structures and achieve plausible transformations that can be used in real clinical settings. We propose a novel framework for deformable image registration. Firstly, we introduce a novel regulariser based on conformal-invariant properties in a nonlinear elasticity setting. Our regulariser enforces the deformation field to be smooth, invertible and orientation-preserving. More importantly, we strictly guarantee topology preservation, yielding a clinically meaningful registration. Secondly, we boost the performance of our regulariser through coordinate MLPs, where one can view the to-be-registered images as continuously differentiable entities. We demonstrate, through numerical and visual experiments, that our framework is able to outperform current techniques for image registration.
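A deformation is orientation-preserving, and locally invertible, where its Jacobian determinant is positive. The sketch below is a common post hoc diagnostic for that property on a 2D grid using forward differences. It is not the paper's regulariser, which aims to enforce the property during optimisation; the grid layout and all names are assumptions made for illustration.

```python
from typing import List

Grid = List[List[float]]


def jacobian_determinants(phi_x: Grid, phi_y: Grid) -> Grid:
    """Forward-difference Jacobian determinant of a 2D map on a unit grid.

    phi_x[i][j] and phi_y[i][j] hold the mapped x and y coordinates of the
    grid point at row i (y direction) and column j (x direction).
    """
    rows, cols = len(phi_x), len(phi_x[0])
    determinants: Grid = []
    for i in range(rows - 1):
        row = []
        for j in range(cols - 1):
            dxx = phi_x[i][j + 1] - phi_x[i][j]  # d(phi_x)/dx
            dxy = phi_x[i + 1][j] - phi_x[i][j]  # d(phi_x)/dy
            dyx = phi_y[i][j + 1] - phi_y[i][j]  # d(phi_y)/dx
            dyy = phi_y[i + 1][j] - phi_y[i][j]  # d(phi_y)/dy
            row.append(dxx * dyy - dxy * dyx)
        determinants.append(row)
    return determinants


def is_orientation_preserving(phi_x: Grid, phi_y: Grid) -> bool:
    """True when every sampled Jacobian determinant is strictly positive."""
    return all(d > 0.0 for row in jacobian_determinants(phi_x, phi_y) for d in row)


# Identity map versus a left-right mirrored map on a 4x4 grid.
size = 4
identity_x = [[float(j) for j in range(size)] for _ in range(size)]
identity_y = [[float(i) for _ in range(size)] for i in range(size)]
mirrored_x = [[-value for value in row] for row in identity_x]
print(is_orientation_preserving(identity_x, identity_y))
print(is_orientation_preserving(mirrored_x, identity_y))
```

The identity map passes the check, while the mirrored map fails it because every determinant is negative. In practice, the fraction of non-positive determinants is often reported as a folding measure for a registration result.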

Mixed Modality Registration Methodology In Silico

LADDA: Latent Diffusion-based Domain-adaptive Feature Disentangling for Unsupervised Multi-modal Medical Image Registration.

Yuan P, Dong J, Zhao W, Lyu F, Xue C, Zhang Y, Yang C, Wu Z, Gao Z, Lyu T, Coatrieux JL, Chen Y

•papers•Jul 15 2025

Deformable image registration (DIR) is critical for accurate clinical diagnosis and effective treatment planning. However, patient movement, significant intensity differences, and large breathing deformations hinder accurate anatomical alignment in multi-modal image registration. These factors exacerbate the entanglement of anatomical and modality-specific style information, thereby severely limiting the performance of multi-modal registration. To address this, we propose a novel LAtent Diffusion-based Domain-Adaptive feature disentangling (LADDA) framework for unsupervised multi-modal medical image registration, which explicitly addresses the representation disentanglement. First, LADDA extracts reliable anatomical priors from the Latent Diffusion Model (LDM), facilitating downstream content-style disentangled learning. A Domain-Adaptive Feature Disentangling (DAFD) module is proposed to promote anatomical structure alignment further. This module disentangles image features into content and style information, boosting the network to focus on cross-modal content information. Next, a Neighborhood-Preserving Hashing (NPH) is constructed to further perceive and integrate hierarchical content information through local neighbourhood encoding, thereby maintaining cross-modal structural consistency. Furthermore, a Unilateral-Query-Frozen Attention (UQFA) module is proposed to enhance the coupling between upstream prior and downstream content information. The feature interaction within intra-domain consistent structures improves the fine recovery of detailed textures. The proposed framework is extensively evaluated on large-scale multi-center datasets, demonstrating superior performance across diverse clinical scenarios and strong generalization on out-of-distribution (OOD) data.

Mixed Modality Registration Methodology In Silico Academic Lab

Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV.

Yang Z, Li J, Zhang H, Zhao D, Wei B, Xu Y

•papers•Jul 15 2025

Transformers have revolutionized medical image restoration, but the quadratic complexity still poses limitations for their application to high-resolution medical images. The recent advent of the Receptance Weighted Key Value (RWKV) model in the natural language processing field has attracted much attention due to its ability to process long sequences efficiently. To leverage its advanced design, we propose Restore-RWKV, the first RWKV-based model for medical image restoration. Since the original RWKV model is designed for 1D sequences, we make two necessary modifications for modeling spatial relations in 2D medical images. First, we present a recurrent WKV (Re-WKV) attention mechanism that captures global dependencies with linear computational complexity. Re-WKV incorporates bidirectional attention as the basis for a global receptive field and recurrent attention to effectively model 2D dependencies from various scan directions. Second, we develop an omnidirectional token shift (Omni-Shift) layer that enhances local dependencies by shifting tokens from all directions and across a wide context range. These adaptations make the proposed Restore-RWKV an efficient and effective model for medical image restoration. Even a lightweight variant of Restore-RWKV, with only 1.16 million parameters, achieves comparable or even superior results compared to existing state-of-the-art (SOTA) methods. Extensive experiments demonstrate that the resulting Restore-RWKV achieves SOTA performance across a range of medical image restoration tasks, including PET image synthesis, CT image denoising, MRI image super-resolution, and all-in-one medical image restoration. Code is available at: https://github.com/Yaziwel/Restore-RWKV.

Mixed Modality Reconstruction Methodology In Silico Academic Lab Open Code Benchmark SOTA
