
Tomographic Foundation Model -- FORCE: Flow-Oriented Reconstruction Conditioning Engine

Wenjun Xia, Chuang Niu, Ge Wang

arXiv preprint, Jun 2, 2025
Computed tomography (CT) is a major medical imaging modality. Clinical CT scenarios, such as low-dose screening, sparse-view scanning, and metal implants, often lead to severe noise and artifacts in reconstructed images, requiring improved reconstruction techniques. The introduction of deep learning has significantly advanced CT image reconstruction. However, obtaining paired training data remains rather challenging due to patient motion and other constraints. Although deep learning methods can still perform well with approximately paired data, they inherently carry the risk of hallucination due to data inconsistencies and model instability. In this paper, we integrate data fidelity with a state-of-the-art generative AI model, the Poisson flow generative model (PFGM) and its generalized version PFGM++, and propose a novel CT framework: Flow-Oriented Reconstruction Conditioning Engine (FORCE). In our experiments, the proposed method shows superior performance in various CT imaging tasks, outperforming existing unsupervised reconstruction approaches.
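In outline, the conditioning idea pairs a learned generative prior with an explicit measurement-consistency step. A minimal sketch follows, assuming hypothetical `prior_step`, `forward_op`, and `adjoint_op` callables standing in for one PFGM++ sampling update and the CT projection operator with its adjoint; it is not the authors' FORCE implementation.

```python
def fidelity_conditioned_sample(x0, y, forward_op, adjoint_op, prior_step,
                                n_steps=50, lam=0.1):
    """Sketch: alternate a generative prior update with a data-fidelity step.

    x0          : initial image estimate (ndarray)
    y           : measured projection data (sinogram)
    forward_op  : callable A(x), the CT projection model (hypothetical)
    adjoint_op  : callable A^T(r), its adjoint / back-projection (hypothetical)
    prior_step  : one denoising/flow update of the generative model (hypothetical)
    lam         : fidelity step size (assumed, would need tuning)
    """
    x = x0.copy()
    for t in range(n_steps):
        x = prior_step(x, t)                 # generative prior update
        residual = forward_op(x) - y         # measurement mismatch
        x = x - lam * adjoint_op(residual)   # gradient step on ||A(x) - y||^2
    return x
```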

Beyond Pixel Agreement: Large Language Models as Clinical Guardrails for Reliable Medical Image Segmentation

Jiaxi Sheng, Leyi Yu, Haoyue Li, Yifan Gao, Xin Gao

arXiv preprint, Jun 2, 2025
Evaluating AI-generated medical image segmentations for clinical acceptability poses a significant challenge, as traditional pixel-agreement metrics often fail to capture true diagnostic utility. This paper introduces Hierarchical Clinical Reasoner (HCR), a novel framework that leverages Large Language Models (LLMs) as clinical guardrails for reliable, zero-shot quality assessment. HCR employs a structured, multi-stage prompting strategy that guides LLMs through a detailed reasoning process, encompassing knowledge recall, visual feature analysis, anatomical inference, and clinical synthesis, to evaluate segmentations. We evaluated HCR on a diverse dataset across six medical imaging tasks. Our results show that HCR, utilizing models like Gemini 2.5 Flash, achieved a classification accuracy of 78.12%, performing comparably to, and in instances exceeding, dedicated vision models such as ResNet50 (72.92% accuracy) that were specifically trained for this task. The HCR framework not only provides accurate quality classifications but also generates interpretable, step-by-step reasoning for its assessments. This work demonstrates the potential of LLMs, when appropriately guided, to serve as sophisticated evaluators, offering a pathway towards more trustworthy and clinically aligned quality control for AI in medical imaging.
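A staged prompting pipeline of this kind can be sketched as below; the stage wording and the `call_llm` wrapper are hypothetical stand-ins, not the actual HCR prompts.

```python
# Minimal sketch of a multi-stage prompting pipeline. `call_llm` is a
# hypothetical wrapper around whatever chat-completion API is used.

STAGES = [
    ("knowledge recall",
     "List the anatomical landmarks and typical shape of the {organ}."),
    ("visual feature analysis",
     "Describe the boundary smoothness and coverage of this mask: {mask_summary}."),
    ("anatomical inference",
     "Given the recalled anatomy and observed features, note any implausible regions."),
    ("clinical synthesis",
     "Decide ACCEPTABLE or UNACCEPTABLE for clinical use, with a one-line rationale."),
]

def assess_segmentation(call_llm, organ, mask_summary):
    context = []
    for name, template in STAGES:
        prompt = template.format(organ=organ, mask_summary=mask_summary)
        reply = call_llm(prompt, history=context)  # hypothetical signature
        context.append((name, reply))
    # The final stage's reply carries the accept/reject decision;
    # the full context preserves the step-by-step reasoning.
    return context[-1][1], context
```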

Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning

Yijun Yang, Zhao-Yang Wang, Qiuping Liu, Shuwen Sun, Kang Wang, Rama Chellappa, Zongwei Zhou, Alan Yuille, Lei Zhu, Yu-Dong Zhang, Jieneng Chen

arXiv preprint, Jun 2, 2025
Providing effective treatment and making informed clinical decisions are essential goals of modern medicine and clinical care. We are interested in simulating disease dynamics for clinical decision-making, leveraging recent advances in large generative models. To this end, we introduce the Medical World Model (MeWM), the first world model in medicine that visually predicts future disease states based on clinical decisions. MeWM comprises (i) vision-language models to serve as policy models, and (ii) tumor generative models as dynamics models. The policy model generates action plans, such as clinical treatments, while the dynamics model simulates tumor progression or regression under given treatment conditions. Building on this, we propose an inverse dynamics model that applies survival analysis to the simulated post-treatment tumor, enabling the evaluation of treatment efficacy and the selection of the optimal clinical action plan. As a result, the proposed MeWM simulates disease dynamics by synthesizing post-treatment tumors, with state-of-the-art specificity in Turing tests evaluated by radiologists. Simultaneously, its inverse dynamics model outperforms medical-specialized GPTs in optimizing individualized treatment protocols across all metrics. Notably, MeWM improves clinical decision-making for interventional physicians, boosting F1-score in selecting the optimal TACE protocol by 13%, paving the way for future integration of medical world models as second readers.
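The decision loop the abstract describes (propose plans, simulate outcomes, score survival, pick the best) reduces to a few lines; the three model callables below are hypothetical stand-ins, not the MeWM interfaces.

```python
def select_treatment(policy_model, dynamics_model, survival_model, patient):
    """Sketch of a MeWM-style decision loop under assumed interfaces:
    the policy proposes candidate plans, the dynamics model synthesizes
    the post-treatment tumor, and a survival analysis scores the outcome.
    All three callables are hypothetical stand-ins."""
    candidates = policy_model(patient)                # e.g., TACE protocols
    scored = []
    for plan in candidates:
        simulated_tumor = dynamics_model(patient, plan)  # generative rollout
        risk = survival_model(patient, simulated_tumor)  # lower is better
        scored.append((risk, plan))
    # Return the plan with the lowest predicted risk.
    return min(scored, key=lambda pair: pair[0])[1]
```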

Artificial Intelligence-Driven Innovations in Diabetes Care and Monitoring

Abdul Rahman, S., Mahadi, M., Yuliana, D., Budi Susilo, Y. K., Ariffin, A. E., Amgain, K.

medRxiv preprint, Jun 2, 2025
This study explores the transformative role of Artificial Intelligence (AI) in diabetes care and monitoring, focusing on innovations that optimize patient outcomes. AI, particularly machine learning and deep learning, significantly enhances early detection of complications like diabetic retinopathy and improves screening efficacy. The methodology employs a bibliometric analysis using Scopus, VOSviewer, and Publish or Perish, analyzing 235 articles from 2023-2025. Results indicate a strong interdisciplinary focus, with Computer Science and Medicine being the dominant subject areas (36.9% and 12.9%, respectively). Bibliographic coupling reveals robust international collaborations led by the U.S. (1558.52 link strength), UK, and China, with key influential documents by Zhu (2023c) and Annuzzi (2023). This research highlights AI's impact on enhancing monitoring, personalized treatment, and proactive care, while acknowledging challenges in data privacy and ethical deployment. Future work should bridge technological advancements with real-world implementation to create equitable and efficient diabetes care systems.

Evaluating the performance and potential bias of predictive models for the detection of transthyretin cardiac amyloidosis

Hourmozdi, J., Easton, N., Benigeri, S., Thomas, J. D., Narang, A., Ouyang, D., Duffy, G., Upton, R., Hawkes, W., Akerman, A., Okwuosa, I., Kline, A., Kho, A. N., Luo, Y., Shah, S. J., Ahmad, F. S.

medRxiv preprint, Jun 2, 2025
Background: Delays in the diagnosis of transthyretin amyloid cardiomyopathy (ATTR-CM) contribute to the significant morbidity of the condition, especially in the era of disease-modifying therapies. Screening for ATTR-CM with AI and other algorithms may improve timely diagnosis, but these algorithms have not been directly compared. Objectives: The aim of this study was to compare the performance of four algorithms for ATTR-CM detection in a heart failure population and assess the risk for harms due to model bias. Methods: We identified patients in an integrated health system from 2010-2022 with ATTR-CM and age- and sex-matched them to controls with heart failure to target 5% prevalence. We compared the performance of a claims-based random forest model (Huda et al. model), a regression-based score (Mayo ATTR-CM), and two deep learning echo models (EchoNet-LVH and EchoGo(R) Amyloidosis). We evaluated for bias using standard fairness metrics. Results: The analytical cohort included 176 confirmed cases of ATTR-CM and 3192 control patients, with 79.2% self-identifying as White and 9.0% as Black. The Huda et al. model performed poorly (AUC 0.49). Both deep learning echo models had a higher AUC when compared to the Mayo ATTR-CM Score (EchoNet-LVH 0.88; EchoGo Amyloidosis 0.92; Mayo ATTR-CM Score 0.79; DeLong P<0.001 for both). Bias auditing met fairness criteria for equal opportunity among patients who identified as Black. Conclusions: Deep learning, echo-based models to detect ATTR-CM demonstrated the best overall discrimination when compared to two other models in external validation, with low risk of harms due to racial bias.
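Two pieces of the evaluation, overall discrimination and an equal-opportunity audit, can be sketched with standard tooling; the function below is one illustrative reading of the fairness check, not the study's code, and the DeLong comparison is omitted.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def audit(y_true, y_score, group, threshold=0.5):
    """Sketch of two checks from a model audit: overall discrimination
    (AUC) and an equal-opportunity gap, i.e. the difference in
    true-positive rate between demographic groups. `group` is a boolean
    array marking the audited subgroup; the 0.5 threshold is assumed."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_score) >= threshold
    group = np.asarray(group)

    auc = roc_auc_score(y_true, y_score)

    def tpr(mask):
        pos = (y_true == 1) & mask
        return y_pred[pos].mean() if pos.any() else float("nan")

    eo_gap = abs(tpr(group) - tpr(~group))
    return auc, eo_gap
```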

Synthetic Ultrasound Image Generation for Breast Cancer Diagnosis Using cVAE-WGAN Models: An Approach Based on Generative Artificial Intelligence

Mondillo, G., Masino, M., Colosimo, S., Perrotta, A., Frattolillo, V., Abbate, F. G.

medRxiv preprint, Jun 2, 2025
The scarcity and imbalance of medical image datasets hinder the development of robust computer-aided diagnosis (CAD) systems for breast cancer. This study explores the application of advanced generative models, based on generative artificial intelligence (GenAI), for the synthesis of digital breast ultrasound images. Using a hybrid Conditional Variational Autoencoder-Wasserstein Generative Adversarial Network (cVAE-WGAN) architecture, we developed a system to generate high-quality synthetic images conditioned on the class (malignant vs. normal/benign). These synthetic images, generated from the low-resolution BreastMNIST dataset and filtered for quality, were systematically integrated with real training data at different mixing ratios (W). The performance of a CNN classifier trained on these mixed datasets was evaluated against a baseline model trained only on real data balanced with SMOTE. The optimal integration (mixing weight W=0.25) produced a significant performance increase on the real test set: +8.17% in macro-average F1-score and +4.58% in accuracy compared to using real data alone. Analysis confirmed the originality of the generated samples. This approach offers a promising solution for overcoming data limitations in image-based breast cancer diagnostics, potentially improving the capabilities of CAD systems.
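One plausible reading of the mixing ratio W (synthetic images as a fraction of the final training set) is sketched below; the paper's exact mixing rule may differ.

```python
import numpy as np

def mix_datasets(real_x, real_y, synth_x, synth_y, w=0.25, seed=0):
    """Sketch of mixing synthetic images into the training set so that
    roughly a fraction `w` of the final set is synthetic. This is one
    plausible interpretation of the ratio W, not the paper's code."""
    rng = np.random.default_rng(seed)
    n_real = len(real_x)
    # Choose n_synth so that n_synth / (n_real + n_synth) ~= w.
    n_synth = int(w / (1.0 - w) * n_real)
    idx = rng.choice(len(synth_x), size=min(n_synth, len(synth_x)),
                     replace=False)
    x = np.concatenate([real_x, synth_x[idx]])
    y = np.concatenate([real_y, synth_y[idx]])
    perm = rng.permutation(len(x))          # shuffle real and synthetic
    return x[perm], y[perm]
```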

A Comparative Performance Analysis of Regular Expressions and an LLM-Based Approach to Extract the BI-RADS Score from Radiological Reports

Dennstaedt, F., Lerch, L., Schmerder, M., Cihoric, N., Cerghetti, G. M., Gaio, R., Bonel, H., Filchenko, I., Hastings, J., Dammann, F., Aebersold, D. M., von Tengg, H., Nairz, K.

medRxiv preprint, Jun 2, 2025
Background: Different Natural Language Processing (NLP) techniques have demonstrated promising results for data extraction from radiological reports. Both traditional rule-based methods like regular expressions (Regex) and modern Large Language Models (LLMs) can extract structured information. However, comparison between these approaches for extraction of specific radiological data elements has not been widely conducted. Methods: We compared accuracy and processing time between Regex and LLM-based approaches for extracting BI-RADS scores from 7,764 radiology reports (mammography, ultrasound, MRI, and biopsy). We developed a rule-based algorithm using Regex patterns and implemented an LLM-based extraction using the Rombos-LLM-V2.6-Qwen-14b model. A ground truth dataset of 199 manually classified reports was used for evaluation. Results: There was no statistically significant difference in accuracy in extracting BI-RADS scores between Regex and the LLM-based method (89.20% for Regex versus 87.69% for the LLM-based method; p=0.56). Compared to the LLM-based method, Regex processing was more efficient, completing the task 28,120 times faster (0.06 seconds vs. 1687.20 seconds). Further analysis revealed that the LLM favored common classifications (particularly a BI-RADS value of 2), while Regex more frequently returned "unclear" values. We could also confirm in our sample an already known laterality bias for breast cancer (BI-RADS 6) and detected a slight laterality skew for suspected breast cancer (BI-RADS 5) as well. Conclusion: For structured, standardized data like BI-RADS, traditional NLP techniques seem to be superior, though future work should explore hybrid approaches combining Regex precision for standardized elements with LLM contextual understanding for more complex information extraction tasks.
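An illustrative Regex extractor for BI-RADS scores might look like the following; the pattern is a simplified stand-in, not the study's rule set, which was tuned to its own report corpus.

```python
import re

# Illustrative pattern only: matches "BI-RADS 4a", "BIRADS: 2",
# "BI-RADS category 5", etc. The study's actual rules are more elaborate.
BIRADS_RE = re.compile(
    r"\bBI-?RADS\s*(?:category|score)?\s*[:\-]?\s*([0-6][abc]?)",
    re.IGNORECASE,
)

def extract_birads(report_text):
    """Return all BI-RADS scores found in a report, or 'unclear'."""
    matches = BIRADS_RE.findall(report_text)
    return matches if matches else "unclear"

print(extract_birads("Findings ... BI-RADS: 4a right breast."))  # ['4a']
print(extract_birads("No category stated."))                     # unclear
```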

Revolutionizing Radiology Workflow with Factual and Efficient CXR Report Generation

Pimchanok Sukjai, Apiradee Boonmee

arXiv preprint, Jun 1, 2025
The escalating demand for medical image interpretation underscores the critical need for advanced artificial intelligence solutions to enhance the efficiency and accuracy of radiological diagnoses. This paper introduces CXR-PathFinder, a novel Large Language Model (LLM)-centric foundation model specifically engineered for automated chest X-ray (CXR) report generation. We propose a unique training paradigm, Clinician-Guided Adversarial Fine-Tuning (CGAFT), which meticulously integrates expert clinical feedback into an adversarial learning framework to mitigate factual inconsistencies and improve diagnostic precision. Complementing this, our Knowledge Graph Augmentation Module (KGAM) acts as an inference-time safeguard, dynamically verifying generated medical statements against authoritative knowledge bases to minimize hallucinations and ensure standardized terminology. Leveraging a comprehensive dataset of millions of paired CXR images and expert reports, our experiments demonstrate that CXR-PathFinder significantly outperforms existing state-of-the-art medical vision-language models across various quantitative metrics, including clinical accuracy (Macro F1 (14): 46.5, Micro F1 (14): 59.5). Furthermore, blinded human evaluation by board-certified radiologists confirms CXR-PathFinder's superior clinical utility, completeness, and accuracy, establishing its potential as a reliable and efficient aid for radiological practice. The developed method effectively balances high diagnostic fidelity with computational efficiency, providing a robust solution for automated medical report generation.
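The inference-time verification idea behind KGAM can be sketched as a simple membership check against a curated vocabulary; the knowledge base and matcher below are hypothetical toys, far simpler than a real knowledge graph.

```python
# Sketch of an inference-time guardrail in the spirit of KGAM: each
# generated finding is checked against a small knowledge base before it
# is kept. Both the vocabulary and the matching rule are hypothetical.

ALLOWED_FINDINGS = {
    "cardiomegaly", "pleural effusion", "pneumothorax",
    "consolidation", "atelectasis", "no acute findings",
}

def verify_report(sentences, knowledge_base=ALLOWED_FINDINGS):
    kept, flagged = [], []
    for s in sentences:
        # Naive entity check: keep a sentence only if it mentions a term
        # the knowledge base recognizes; flag the rest for review.
        if any(term in s.lower() for term in knowledge_base):
            kept.append(s)
        else:
            flagged.append(s)   # candidate hallucination
    return kept, flagged
```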

Aiding Medical Diagnosis through Image Synthesis and Classification

Kanishk Choudhary

arXiv preprint, Jun 1, 2025
Medical professionals, especially those in training, often depend on visual reference materials to support an accurate diagnosis and develop pattern recognition skills. However, existing resources may lack the diversity and accessibility needed for broad and effective clinical learning. This paper presents a system designed to generate realistic medical images from textual descriptions and validate their accuracy through a classification model. A pretrained stable diffusion model was fine-tuned using Low-Rank Adaptation (LoRA) on the PathMNIST dataset, consisting of nine colorectal histopathology tissue types. The generative model was trained multiple times using different training parameter configurations, guided by domain-specific prompts to capture meaningful features. To ensure quality control, a ResNet-18 classification model was trained on the same dataset, achieving 99.76% accuracy in detecting the correct label of a colorectal histopathological medical image. Generated images were then filtered using the trained classifier in an iterative process, where inaccurate outputs were discarded and regenerated until they were correctly classified. The highest-performing version of the generative model achieved an F1 score of 0.6727, with precision and recall scores of 0.6817 and 0.7111, respectively. Some types of tissue, such as adipose tissue and lymphocytes, reached perfect classification scores, while others proved more challenging due to structural complexity. The resulting self-validating approach demonstrates a reliable method for synthesizing domain-specific medical images, given the high accuracy of both the generation and classification components of the system, with potential applications in both diagnostic support and clinical education. Future work includes improving prompt-specific accuracy and extending the system to other areas of medical imaging.
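The self-validating generate-and-filter loop reduces to a short sketch; `generator` and `classifier` are hypothetical wrappers around the LoRA-tuned diffusion model and the ResNet-18 described above.

```python
def generate_until_valid(generator, classifier, prompt, target_label,
                         max_tries=10):
    """Sketch of the self-validating loop: keep sampling from the
    fine-tuned generator until the trained classifier assigns the
    intended tissue label. `generator` and `classifier` are hypothetical
    callables; `max_tries` is an assumed cutoff."""
    for _ in range(max_tries):
        image = generator(prompt)             # synthesize one candidate
        if classifier(image) == target_label:
            return image                      # accepted: label matches
    return None                               # give up after max_tries
```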

Modality Translation and Registration of MR and Ultrasound Images Using Diffusion Models

Xudong Ma, Nantheera Anantrasirichai, Stefanos Bolomytis, Alin Achim

arXiv preprint, Jun 1, 2025
Multimodal MR-US registration is critical for prostate cancer diagnosis. However, this task remains challenging due to significant modality discrepancies. Existing methods often fail to align critical boundaries while being overly sensitive to irrelevant details. To address this, we propose an anatomically coherent modality translation (ACMT) network based on a hierarchical feature disentanglement design. We leverage shallow-layer features for texture consistency and deep-layer features for boundary preservation. Unlike conventional modality translation methods that convert one modality into another, our ACMT introduces the customized design of an intermediate pseudo modality. Both MR and US images are translated toward this intermediate domain, effectively addressing the bottlenecks faced by traditional translation methods in the downstream registration task. Experiments demonstrate that our method mitigates modality-specific discrepancies while preserving crucial anatomical boundaries for accurate registration. Quantitative evaluations show superior modality similarity compared to state-of-the-art modality translation methods. Furthermore, downstream registration experiments confirm that our translated images achieve the best alignment performance, highlighting the robustness of our framework for multi-modal prostate image registration.
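The shallow-texture / deep-boundary idea can be sketched as a feature-matching loss pulling both modalities toward a shared domain; the encoder interface and loss weights below are assumptions, not the ACMT design.

```python
import torch
import torch.nn.functional as F

def translation_loss(enc, mr_img, us_img, w_texture=1.0, w_boundary=1.0):
    """Sketch of the intermediate pseudo-modality idea: both MR and US
    are mapped toward a shared domain, with shallow features tied for
    texture consistency and deep features tied for boundary
    preservation. `enc` is a hypothetical encoder returning feature maps
    ordered from shallow to deep; the L1 losses and weights are assumed."""
    f_mr, f_us = enc(mr_img), enc(us_img)
    loss_texture = F.l1_loss(f_mr[0], f_us[0])     # shallow layer: texture
    loss_boundary = F.l1_loss(f_mr[-1], f_us[-1])  # deep layer: boundaries
    return w_texture * loss_texture + w_boundary * loss_boundary
```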