Page 19 of 78779 results

FedVGM: Enhancing Federated Learning Performance on Multi-Dataset Medical Images with XAI.

Tahosin MS, Sheakh MA, Alam MJ, Hassan MM, Bairagi AK, Abdulla S, Alshathri S, El-Shafai W

pubmed logopapers · Aug 20, 2025
Advances in deep learning have transformed medical imaging, yet progress is hindered by data privacy regulations and fragmented datasets across institutions. To address these challenges, we propose FedVGM, a privacy-preserving federated learning framework for multi-modal medical image analysis. FedVGM integrates four imaging modalities, including brain MRI, breast ultrasound, chest X-ray, and lung CT, across 14 diagnostic classes without centralizing patient data. Using transfer learning and an ensemble of VGG16 and MobileNetV2, FedVGM achieves 97.7% ± 0.01 accuracy on the combined dataset and 91.9-99.1% across individual modalities. We evaluated three aggregation strategies and demonstrated median aggregation to be the most effective. To ensure clinical interpretability, we apply explainable AI techniques and validate results through performance metrics, statistical analysis, and k-fold cross-validation. FedVGM offers a robust, scalable solution for collaborative medical diagnostics, supporting clinical deployment while preserving data privacy.
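Median aggregation, which the authors found most effective of the three strategies, can be sketched as a coordinate-wise median over client parameter tensors. A minimal illustration, not the FedVGM implementation:

```python
import numpy as np

def median_aggregate(client_weights):
    """Coordinate-wise median of client model parameters.

    client_weights: one list of layer arrays per client.
    Returns the aggregated layer list for the global model.
    """
    n_layers = len(client_weights[0])
    aggregated = []
    for layer_idx in range(n_layers):
        stacked = np.stack([w[layer_idx] for w in client_weights], axis=0)
        aggregated.append(np.median(stacked, axis=0))  # robust to outlier clients
    return aggregated

# Three toy clients, each holding a single 2x2 weight matrix.
clients = [
    [np.array([[1.0, 2.0], [3.0, 4.0]])],
    [np.array([[1.2, 1.8], [2.9, 4.1]])],
    [np.array([[9.0, 9.0], [9.0, 9.0]])],  # outlier/poisoned client
]
global_layer = median_aggregate(clients)[0]
print(global_layer)
```

Unlike a plain mean (FedAvg), the per-coordinate median ignores the outlier client entirely here, which is why it is often preferred under heterogeneous multi-dataset training.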

Applying large language model for automated quality scoring of radiology requisitions using a standardized criteria.

Büyüktoka RE, Surucu M, Erekli Derinkaya PB, Adibelli ZH, Salbas A, Koc AM, Buyuktoka AD, Isler Y, Ugur MA, Isiklar E

pubmed logopapers · Aug 20, 2025
To create and test a locally adapted large language model (LLM) for automated scoring of radiology requisitions based on the Reason for exam Imaging Reporting and Data System (RI-RADS), and to evaluate its performance against reference standards. This retrospective, two-center study included 131,683 radiology requisitions from two institutions. A bidirectional encoder representations from transformers (BERT)-based model was trained using 101,563 requisitions from Center 1 (including 1500 synthetic examples) and externally tested on 18,887 requisitions from Center 2. The model's performance for two different classification strategies was evaluated against a reference standard created by three radiologists. Model performance was assessed using Cohen's kappa, accuracy, F1-score, sensitivity, and specificity with 95% confidence intervals. A total of 18,887 requisitions were evaluated in the external test set, yielding an F1-score of 0.93 (95% CI: 0.912-0.943) and κ = 0.88 (95% CI: 0.871-0.884). Performance was highest in the common categories RI-RADS D and X (F1 ≥ 0.96) and lowest for the rare categories RI-RADS A and B (F1 ≤ 0.49). When requisitions were grouped into three categories (adequate, inadequate, and unacceptable), overall model performance improved (F1-score = 0.97; 95% CI: 0.96-0.97). The locally adapted BERT-based model demonstrated high performance and almost perfect agreement with radiologists in automated RI-RADS scoring, showing promise for integration into radiology workflows to improve requisition completeness and communication. Question: Can an LLM accurately and automatically score radiology requisitions based on standardized criteria to address the challenges of incomplete information in radiological practice? Findings: A locally adapted BERT-based model demonstrated high performance (F1-score 0.93) and almost perfect agreement with radiologists in automated RI-RADS scoring across a large, multi-institutional dataset.
Clinical relevance: LLMs offer a scalable solution for automated scoring of radiology requisitions, with the potential to improve workflow in radiology. Further improvement and integration into clinical practice could enhance communication, contributing to better diagnoses and patient care.
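The headline agreement statistic here, Cohen's kappa, is chance-corrected agreement between two raters and is simple to compute from two label sequences. A self-contained sketch on hypothetical RI-RADS grades, not the study's data:

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected) / (1 - expected) agreement."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(freq_a[l] * freq_b[l] for l in labels) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical RI-RADS grades from a reference standard and a model.
reference = ["D", "D", "X", "C", "B", "D", "X", "A"]
predicted = ["D", "D", "X", "C", "C", "D", "X", "A"]
print(round(cohen_kappa(reference, predicted), 3))  # 0.833
```

Values above roughly 0.81 are conventionally read as "almost perfect" agreement, which is the interpretation the abstract applies to its κ = 0.88.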

Review of GPU-based Monte Carlo simulation platforms for transmission and emission tomography in medicine.

Chi Y, Schubert KE, Badal A, Roncali E

pubmed logopapers · Aug 20, 2025
Monte Carlo (MC) simulation remains the gold standard for modeling complex physical interactions in transmission and emission tomography, with GPU parallel computing offering unmatched computational performance and enabling practical, large-scale MC applications. In recent years, rapid advancements in both GPU technologies and tomography techniques have been observed. Harnessing emerging GPU capabilities to accelerate MC simulation and strengthen its role in supporting the rapid growth of medical tomography has become an important topic. To provide useful insights, we conducted a comprehensive review of state-of-the-art GPU-accelerated MC simulations in tomography, highlighting current achievements and underdeveloped areas.

Approach: We reviewed key technical developments across major tomography modalities, including computed tomography (CT), cone-beam CT (CBCT), positron emission tomography, single-photon emission computed tomography, proton CT, emerging techniques, and hybrid modalities. We examined MC simulation methods and major CPU-based MC platforms that have historically supported medical imaging development, followed by a review of GPU acceleration strategies, hardware evolution, and leading GPU-based MC simulation packages. Future development directions were also discussed.

Main Results: Significant advancements have been achieved in both tomography and MC simulation technologies over the past half-century. The introduction of GPUs has enabled speedups often exceeding 100-1000 times over CPU implementations, providing essential support to the development of new imaging systems. Emerging GPU features like ray-tracing cores, tensor cores, and GPU-execution-friendly transport methods offer further opportunities for performance enhancement. 

Significance: GPU-based MC simulation is expected to remain essential in advancing medical emission and transmission tomography. With the emergence of new concepts such as training machine learning models with synthetic data, Digital Twins for Healthcare, and Virtual Clinical Trials, improving hardware portability and modularizing GPU-based MC codes to adapt to these evolving simulation needs represent important future research directions. This review aims to provide useful insights for researchers, developers, and practitioners in relevant fields.
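The kernel of photon-transport MC, sampling free paths against Beer-Lambert attenuation by inverse-transform sampling, fits in a few lines. A toy CPU sketch for intuition only; real GPU codes add scattering physics, voxelized geometry, and variance reduction:

```python
import math
import random

def simulate_transmission(mu, thickness, n_photons, seed=0):
    """Estimate the fraction of photons crossing a homogeneous slab.

    Each photon's distance to first interaction is drawn from the
    exponential distribution d = -ln(U) / mu (inverse-transform sampling);
    the photon is transmitted if that distance exceeds the slab thickness.
    """
    rng = random.Random(seed)
    transmitted = sum(
        1 for _ in range(n_photons)
        if -math.log(1.0 - rng.random()) / mu > thickness
    )
    return transmitted / n_photons

mu, thickness = 0.2, 5.0  # attenuation coefficient (1/cm), slab depth (cm)
estimate = simulate_transmission(mu, thickness, 100_000)
analytic = math.exp(-mu * thickness)  # Beer-Lambert reference value
print(estimate, analytic)
```

Because each photon history is independent, this loop is embarrassingly parallel, which is exactly the structure that maps onto GPU threads and yields the 100-1000x speedups cited above.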

Potential and challenges of generative adversarial networks for super-resolution in 4D Flow MRI

Oliver Welin Odeback, Arivazhagan Geetha Balasubramanian, Jonas Schollenberger, Edward Ferdian, Alistair A. Young, C. Alberto Figueroa, Susanne Schnell, Outi Tammisola, Ricardo Vinuesa, Tobias Granberg, Alexander Fyrdahl, David Marlevi

arxiv logopreprint · Aug 20, 2025
4D Flow Magnetic Resonance Imaging (4D Flow MRI) enables non-invasive quantification of blood flow and hemodynamic parameters. However, its clinical application is limited by low spatial resolution and noise, particularly affecting near-wall velocity measurements. Machine learning-based super-resolution has shown promise in addressing these limitations, but challenges remain, not least in recovering near-wall velocities. Generative adversarial networks (GANs) offer a compelling solution, having demonstrated strong capabilities in restoring sharp boundaries in non-medical super-resolution tasks. Yet, their application in 4D Flow MRI remains unexplored, with implementation challenged by known issues such as training instability and non-convergence. In this study, we investigate GAN-based super-resolution in 4D Flow MRI. Training and validation were conducted using patient-specific cerebrovascular in-silico models, converted into synthetic images via an MR-true reconstruction pipeline. A dedicated GAN architecture was implemented and evaluated across three adversarial loss functions: Vanilla, Relativistic, and Wasserstein. Our results demonstrate that the proposed GAN improved near-wall velocity recovery compared to a non-adversarial reference (vNRMSE: 6.9% vs. 9.6%); however, implementation specifics are critical for stable network training. While Vanilla and Relativistic GANs proved unstable compared to generator-only training (vNRMSE: 8.1% and 7.8% vs. 7.2%), a Wasserstein GAN demonstrated optimal stability and incremental improvement (vNRMSE: 6.9% vs. 7.2%). The Wasserstein GAN further outperformed the generator-only baseline at low SNR (vNRMSE: 8.7% vs. 10.7%). These findings highlight the potential of GAN-based super-resolution in enhancing 4D Flow MRI, particularly in challenging cerebrovascular regions, while emphasizing the need for careful selection of adversarial strategies.
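As one illustration of the reported metric, a velocity-normalized RMSE can be computed as below. The exact normalization in the paper is not given here; this sketch assumes normalization by the peak true velocity magnitude:

```python
import numpy as np

def vnrmse(v_pred, v_true):
    """Velocity-normalized RMSE over all velocity components.

    Assumed definition: RMSE of the component-wise error, divided by the
    peak velocity magnitude of the reference field (hypothetical choice).
    Inputs are arrays of shape (..., 3): three velocity components per voxel.
    """
    rmse = np.sqrt(np.mean((v_pred - v_true) ** 2))
    v_peak = np.max(np.linalg.norm(v_true, axis=-1))
    return rmse / v_peak

# Toy 3-component velocity fields on a 4x4 grid.
rng = np.random.default_rng(0)
v_true = rng.normal(size=(4, 4, 3))
v_pred = v_true + 0.05 * rng.normal(size=(4, 4, 3))  # small synthetic error
print(f"vNRMSE = {100 * vnrmse(v_pred, v_true):.1f}%")
```

Normalizing by peak velocity makes the metric comparable across vessels with different flow rates, which matters when pooling cerebrovascular geometries as this study does.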

Multi-View Echocardiographic Embedding for Accessible AI Development

Tohyama, T., Han, A., Yoon, D., Paik, K., Gow, B., Izath, N., Kpodonu, J., Celi, L. A.

medrxiv logopreprint · Aug 19, 2025
Background and Aims: Echocardiography serves as a cornerstone of cardiovascular diagnostics through multiple standardized imaging views. While recent AI foundation models demonstrate superior capabilities across cardiac imaging tasks, their massive computational requirements and reliance on large-scale datasets create accessibility barriers, limiting AI development to well-resourced institutions. Vector embedding approaches offer promising solutions by leveraging compact representations of the original medical images for downstream applications. Furthermore, demographic fairness remains critical, as AI models may incorporate biases that confound clinically relevant features. We developed a multi-view encoder framework to address computational accessibility while investigating demographic fairness challenges. Methods: We utilized the MIMIC-IV-ECHO dataset (7,169 echocardiographic studies) to develop a transformer-based multi-view encoder that aggregates view-level representations into study-level embeddings. The framework incorporated adversarial learning to suppress demographic information while maintaining clinical performance. We evaluated performance across 21 binary classification tasks encompassing echocardiographic measurements and clinical diagnoses, comparing against foundation model baselines with varying adversarial weights. Results: The multi-view encoder achieved a mean improvement of 9.0 AUC points (12.0% relative improvement) across clinical tasks compared to foundation model embeddings. Performance remained robust with limited echocardiographic views compared to the conventional approach. However, adversarial learning showed limited effectiveness in reducing demographic shortcuts, with stronger weighting substantially compromising diagnostic performance. Conclusions: Our framework democratizes advanced cardiac AI capabilities, enabling substantial diagnostic improvements without massive computational infrastructure.
While algorithmic approaches to demographic fairness showed limitations, the multi-view encoder provides a practical pathway for broader AI adoption in cardiovascular medicine with enhanced efficiency in real-world clinical settings. Graphical abstract. Key Question: Can multi-view encoder frameworks achieve superior diagnostic performance compared to foundation model embeddings while reducing computational requirements and maintaining robust performance with fewer echocardiographic views for cardiac AI applications? Key Finding: The multi-view encoder achieved a 12.0% relative improvement (9.0 AUC points) across 21 cardiac tasks compared to foundation model baselines, with efficient 512-dimensional vector embeddings and robust performance using fewer echocardiographic views. Take-home Message: Vector embedding approaches with attention-based multi-view integration significantly improve cardiac diagnostic performance while reducing computational requirements, offering a pathway toward more efficient AI implementation in clinical settings. Translational Perspective: Our proposed multi-view encoder framework overcomes critical barriers to the widespread adoption of artificial intelligence in echocardiography. By dramatically reducing computational requirements, the multi-view encoder approach allows smaller healthcare institutions to develop sophisticated AI models locally. The framework maintains robust performance with fewer echocardiographic examinations, which addresses real-world clinical constraints where comprehensive imaging is not feasible due to patient factors or time limitations.
This technology provides a practical way to democratize advanced cardiac AI capabilities, which could improve access to cardiovascular care across diverse healthcare settings while reducing dependence on proprietary datasets and massive computational resources.
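Aggregating view-level embeddings into one study-level embedding can be sketched as softmax attention pooling. This is a simplified stand-in for the transformer encoder described above; in practice the pooling query would be a learned parameter, not random:

```python
import numpy as np

def attention_pool(view_embeddings, query):
    """Collapse per-view embeddings into a study-level embedding.

    Scaled dot-product attention with a single query vector: each view
    gets a softmax weight, and the result is their weighted average.
    """
    scores = view_embeddings @ query / np.sqrt(query.size)
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ view_embeddings  # convex combination of views

rng = np.random.default_rng(42)
views = rng.normal(size=(5, 512))   # 5 echo views, 512-dim embeddings each
query = rng.normal(size=512)        # pooling query (learned in practice)
study_embedding = attention_pool(views, query)
print(study_embedding.shape)
```

Because missing views simply drop rows from `views`, this kind of pooling degrades gracefully when fewer views are acquired, which matches the robustness the abstract reports.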

A Systematic Study of Deep Learning Models and xAI Methods for Region-of-Interest Detection in MRI Scans

Justin Yiu, Kushank Arora, Daniel Steinberg, Rohit Ghiya

arxiv logopreprint · Aug 19, 2025
Magnetic Resonance Imaging (MRI) is an essential diagnostic tool for assessing knee injuries. However, manual interpretation of MRI slices remains time-consuming and prone to inter-observer variability. This study presents a systematic evaluation of various deep learning architectures combined with explainable AI (xAI) techniques for automated region of interest (ROI) detection in knee MRI scans. We investigate both supervised and self-supervised approaches, including ResNet50, InceptionV3, Vision Transformers (ViT), and multiple U-Net variants augmented with multi-layer perceptron (MLP) classifiers. To enhance interpretability and clinical relevance, we integrate xAI methods such as Grad-CAM and Saliency Maps. Model performance is assessed using AUC for classification and PSNR/SSIM for reconstruction quality, along with qualitative ROI visualizations. Our results demonstrate that ResNet50 consistently excels in classification and ROI identification, outperforming transformer-based models under the constraints of the MRNet dataset. While hybrid U-Net + MLP approaches show potential for leveraging spatial features in reconstruction and interpretability, their classification performance remains lower. Grad-CAM consistently provided the most clinically meaningful explanations across architectures. Overall, CNN-based transfer learning emerges as the most effective approach for this dataset, while future work with larger-scale pretraining may better unlock the potential of transformer models.
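Grad-CAM, which the study found most clinically meaningful, reduces to a gradient-weighted sum of convolutional feature maps. A minimal sketch, with random arrays standing in for real network activations and gradients:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap for one target class.

    activations, gradients: (channels, H, W) arrays; gradients are of the
    class score w.r.t. the activations of the chosen conv layer.
    """
    weights = gradients.mean(axis=(1, 2))             # global-average-pool grads
    cam = np.tensordot(weights, activations, axes=1)  # weighted channel sum
    cam = np.maximum(cam, 0)                          # ReLU: keep positive evidence
    return cam / cam.max() if cam.max() > 0 else cam  # normalize to [0, 1]

rng = np.random.default_rng(1)
acts = rng.random((8, 7, 7))    # 8 feature maps from a late conv layer
grads = rng.random((8, 7, 7))   # gradients of the class score
heatmap = grad_cam(acts, grads)
print(heatmap.shape)
```

The resulting low-resolution map is upsampled to the MRI slice size and overlaid to visualize the ROI driving the classification.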

Emerging modalities for neuroprognostication in neonatal encephalopathy: harnessing the potential of artificial intelligence.

Chawla V, Cizmeci MN, Sullivan KM, Gritz EC, Q Cardona V, Menkiti O, Natarajan G, Rao R, McAdams RM, Dizon ML

pubmed logopapers · Aug 19, 2025
Neonatal encephalopathy (NE) from presumed hypoxic-ischemic encephalopathy (pHIE) is a leading cause of morbidity and mortality in infants worldwide. Recent advancements in HIE research have introduced promising tools to improve screening of high-risk infants, shorten time to diagnosis, and sharpen assessment of neurologic injury to guide management and predict outcomes, some of which integrate artificial intelligence (AI) and machine learning (ML). This review begins with an overview of AI/ML before examining emerging prognostic approaches for predicting outcomes in pHIE. It explores various modalities including placental and fetal biomarkers, gene expression, electroencephalography, brain magnetic resonance imaging and other advanced neuroimaging techniques, clinical video assessment tools, and transcranial magnetic stimulation paired with electromyography. Each of these approaches may come to play a crucial role in predicting outcomes in pHIE. We also discuss the application of AI/ML to enhance these emerging prognostic tools. While further validation is needed for widespread clinical adoption, these tools and their multimodal integration hold the potential to better leverage neuroplasticity windows of affected infants. IMPACT: This article provides an overview of placental pathology, biomarkers, gene expression, electroencephalography, motor assessments, brain imaging, and transcranial magnetic stimulation tools for long-term neurodevelopmental outcome prediction following neonatal encephalopathy, all of which lend themselves to augmentation by artificial intelligence/machine learning (AI/ML). Emerging AI/ML tools may create opportunities for enhanced prognostication through multimodal analyses.

A Multimodal Large Language Model as an End-to-End Classifier of Thyroid Nodule Malignancy Risk: Usability Study.

Sng GGR, Xiang Y, Lim DYZ, Tung JYM, Tan JH, Chng CL

pubmed logopapers · Aug 19, 2025
Thyroid nodules are common, with ultrasound imaging as the primary modality for their assessment. Risk stratification systems like the American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS) have been developed but suffer from interobserver variability and low specificity. Artificial intelligence, particularly large language models (LLMs) with multimodal capabilities, presents opportunities for efficient end-to-end diagnostic processes. However, their clinical utility remains uncertain. This study evaluates the accuracy and consistency of multimodal LLMs for thyroid nodule risk stratification using the ACR TI-RADS system, examining the effects of model fine-tuning, image annotation, and prompt engineering, and comparing open-source versus commercial models. In total, 3 multimodal vision-language models were evaluated: Microsoft's open-source Large Language and Vision Assistant (LLaVA) model, its medically fine-tuned variant (Large Language and Vision Assistant for bioMedicine [LLaVA-Med]), and OpenAI's commercial o3 model. A total of 192 thyroid nodules from publicly available ultrasound image datasets were assessed. Each model was evaluated using 2 prompts (basic and modified) and 2 image scenarios (unlabeled vs radiologist-annotated), yielding 6912 responses. Model outputs were compared with expert ratings for accuracy and consistency. Statistical comparisons included Chi-square tests, Mann-Whitney U tests, and Fleiss' kappa for interrater reliability. Overall, 88.4% (6110/6912) of responses were valid, with the o3 model producing the highest validity rate (2273/2304, 98.6%), followed by LLaVA (2108/2304, 91.5%) and LLaVA-Med (1729/2304, 75%; P<.001). The o3 model demonstrated the highest accuracy overall, achieving up to 57.3% accuracy in Thyroid Imaging Reporting and Data System (TI-RADS) classification, although still suboptimal.
Labeled images improved accuracy marginally in nodule margin assessment only when evaluating LLaVA models (from 53% [407/768] to 58.2% [447/768]; P=.04). Prompt engineering improved accuracy for composition (649/1152, 56.3% vs 483/1152, 41.9%; P<.001), but significantly reduced accuracy for shape, margins, and overall classification. Consistency was highest with the o3 model (up to 85.4%), but was comparable for LLaVA and significantly improved with image labeling and modified prompts across multiple TI-RADS categories (P<.001). Subgroup analysis for o3 alone showed prompt engineering did not affect accuracy significantly but markedly improved consistency across all TI-RADS categories (up to 97.1% for shape, P<.001). Interrater reliability was consistently poor across all combinations (Fleiss' kappa<0.60). The study demonstrates the comparative advantages and limitations of multimodal LLMs for thyroid nodule risk stratification. While the commercial model (o3) consistently outperformed open-source models in accuracy and consistency, even the best-performing model outputs remained suboptimal for direct clinical deployment. Prompt engineering significantly enhanced output consistency, particularly in the commercial model. These findings underline the importance of strategic model optimization techniques and highlight areas requiring further development before multimodal LLMs can be reliably used in clinical thyroid imaging workflows.
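Fleiss' kappa, used here for interrater reliability across more than two raters, is computed from a subjects-by-categories count matrix. A self-contained sketch on toy ratings, not the study's data:

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for n raters over N subjects and k categories.

    ratings: N x k matrix where ratings[i][j] is the number of raters
    assigning subject i to category j; every row sums to the same n.
    """
    N = len(ratings)
    n = sum(ratings[0])
    k = len(ratings[0])
    # Observed agreement per subject, averaged over subjects.
    p_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings]
    p_bar = sum(p_i) / N
    # Chance agreement from the category marginals.
    p_j = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

# Toy data: 4 nodules, 3 raters, 2 categories (e.g. benign vs suspicious).
ratings = [[3, 0], [3, 0], [0, 3], [2, 1]]
print(round(fleiss_kappa(ratings), 3))  # 0.625
```

By the usual Landis-Koch benchmarks, the study's values below 0.60 fall in at best "moderate" agreement, consistent with the "consistently poor" characterization above.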

Ferroelectric/Antiferroelectric HfZrO<sub><i>x</i></sub> Artificial Synapses/Neurons for Convolutional Neural Network-Spiking Neural Network Neuromorphic Computing.

Zhang J, Xu K, Lu L, Lu C, Tao X, Liu Y, Yu J, Meng J, Zhang DW, Wang T, Chen L

pubmed logopapers · Aug 19, 2025
Brain-inspired neuromorphic computing offers significant potential for efficient and adaptive computational platforms. Emerging ferroelectric and antiferroelectric HfZrO<sub><i>x</i></sub> devices play key roles in convolutional neural network (CNN) and spiking neural network (SNN) computing through their unique polarization switching characteristics. Here, we present ferroelectric/antiferroelectric HfZrO<sub><i>x</i></sub> devices that realize the functions of artificial synapses/neurons via element doping engineering. The HfZrO<sub><i>x</i></sub>-based ferroelectric and antiferroelectric devices exhibit excellent endurance characteristics of 1 × 10<sup>9</sup> cycles. Based on the non-volatile polarization switching and spontaneous depolarization nature of ferroelectric and antiferroelectric devices, integrate-and-fire behaviors were constructed for neuromorphic computing. For the first time, a complementary ferroelectric/antiferroelectric HfZrO<sub><i>x</i></sub> artificial synapse/neuron-based hybrid CNN-SNN framework was constructed for energy-efficient cardiac magnetic resonance imaging (MRI) classification. The hybrid neural network breaks the limitation of pure SNNs in 3D image recognition and improves the accuracy from 82.3% to 92.7% compared to a pure CNN, highlighting the potential of composition-engineered ferroelectric materials to implement high-efficiency neuromorphic computing.
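The integrate-and-fire behavior described above is commonly modeled in SNN software as a leaky integrate-and-fire (LIF) neuron; the device's spontaneous depolarization plays the role of the leak. A minimal sketch with the device physics abstracted into a single leak factor:

```python
def lif_spikes(input_current, threshold=1.0, leak=0.9):
    """Leaky integrate-and-fire neuron over a sequence of input currents.

    Each step the membrane potential decays by `leak` (standing in for
    spontaneous depolarization), integrates the input, and emits a spike
    with reset when it crosses `threshold`.
    """
    v, spikes = 0.0, []
    for i in input_current:
        v = leak * v + i          # leak, then integrate
        if v >= threshold:
            spikes.append(1)      # fire
            v = 0.0               # reset after the spike
        else:
            spikes.append(0)
    return spikes

# Sub-threshold inputs accumulate until the neuron fires.
print(lif_spikes([0.4, 0.4, 0.4, 0.4, 0.0, 0.9, 0.9]))
```

In a hybrid CNN-SNN pipeline like the one described, CNN feature maps are typically rate-encoded into such current sequences before the spiking layers classify them.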
