Page 5 of 81804 results

From dictation to diagnosis: enhancing radiology reporting with integrated speech recognition in multimodal large language models.

Gertz RJ, Beste NC, Dratsch T, Lennartz S, Bremm J, Iuga AI, Bunck AC, Laukamp KR, Schönfeld M, Kottlors J

PubMed · Aug 15, 2025
This study evaluates the efficiency, accuracy, and cost-effectiveness of radiology reporting using audio multimodal large language models (LLMs) compared to conventional reporting with speech recognition software. We hypothesized that providing minimal audio input would enable a multimodal LLM to generate complete radiological reports. 480 reports from 80 retrospective multimodal imaging studies were reported by two board-certified radiologists using three workflows: a conventional workflow (C-WF) with speech recognition software to generate findings and impressions separately, and two LLM-based workflows (LLM-WF) using the state-of-the-art LLMs GPT-4o and Claude Sonnet 3.5. Outcome measures included reporting time, corrections, and personnel cost per report. Two radiologists assessed formal structure and report quality. Statistical analysis used ANOVA and Tukey's post hoc tests (p < 0.05). LLM-WF significantly reduced reporting time (GPT-4o/Sonnet 3.5: 38.9 s ± 22.7 s vs. C-WF: 88.0 s ± 60.9 s, p < 0.01), required fewer corrections (GPT-4o: 1.0 ± 1.1, Sonnet 3.5: 0.9 ± 1.0 vs. C-WF: 2.4 ± 2.5, p < 0.01), and lowered costs (GPT-4o: $2.3 ± $1.4, Sonnet 3.5: $2.4 ± $1.4 vs. C-WF: $3.0 ± $2.1, p < 0.01). Reports generated with Sonnet 3.5 were rated highest in quality, while GPT-4o and conventional reports showed no difference. Multimodal LLMs can generate high-quality radiology reports based solely on minimal audio input, with greater speed, fewer corrections, and reduced costs compared to conventional speech-based workflows. However, future implementation may involve licensing costs, and generalizability to broader clinical contexts warrants further evaluation.
Question: How do the time, accuracy, cost, and report quality of reporting via the audio input functionality of GPT-4o and Claude Sonnet 3.5 compare with conventional reporting using speech recognition?
Findings: Large language models enable radiological reporting from minimal audio input, reducing turnaround time and costs without quality loss compared to conventional reporting with speech recognition.
Clinical relevance: Large language model-based reporting from minimal audio input has the potential to improve efficiency and report quality, supporting more streamlined workflows in clinical radiology.
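The study's statistical comparison (one-way ANOVA across the three workflows, followed by Tukey's post hoc tests) can be sketched with synthetic data. The numbers below merely mimic the reported means; they are not the study's data, and the helper is an illustrative implementation rather than the authors' code.

```python
import numpy as np

def one_way_anova_f(*groups):
    """Compute the one-way ANOVA F statistic across k groups."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = np.concatenate(groups).mean()
    # Between-group sum of squares (k - 1 degrees of freedom)
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares (n_total - k degrees of freedom)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

rng = np.random.default_rng(0)
# Synthetic reporting times in seconds, loosely mimicking the reported means
c_wf = rng.normal(88.0, 25.0, 40)     # conventional workflow
gpt = rng.normal(38.9, 10.0, 40)      # GPT-4o workflow
sonnet = rng.normal(38.9, 10.0, 40)   # Sonnet 3.5 workflow
f_stat = one_way_anova_f(c_wf, gpt, sonnet)
```

A large F statistic here would justify the pairwise Tukey comparisons the study reports.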

FusionFM: Fusing Eye-specific Foundational Models for Optimized Ophthalmic Diagnosis

Ke Zou, Jocelyn Hui Lin Goh, Yukun Zhou, Tian Lin, Samantha Min Er Yew, Sahana Srinivasan, Meng Wang, Rui Santos, Gabor M. Somfai, Huazhu Fu, Haoyu Chen, Pearse A. Keane, Ching-Yu Cheng, Yih Chung Tham

arXiv preprint · Aug 15, 2025
Foundation models (FMs) have shown great promise in medical image analysis by improving generalization across diverse downstream tasks. In ophthalmology, several FMs have recently emerged, but there is still no clear answer to fundamental questions: Which FM performs the best? Are they equally good across different tasks? What if we combine all FMs together? To our knowledge, this is the first study to systematically evaluate both single and fused ophthalmic FMs. To address these questions, we propose FusionFM, a comprehensive evaluation suite, along with two fusion approaches to integrate different ophthalmic FMs. Our framework covers both ophthalmic disease detection (glaucoma, diabetic retinopathy, and age-related macular degeneration) and systemic disease prediction (diabetes and hypertension) based on retinal imaging. We benchmarked four state-of-the-art FMs (RETFound, VisionFM, RetiZero, and DINORET) using standardized datasets from multiple countries and evaluated their performance using AUC and F1 metrics. Our results show that DINORET and RetiZero achieve superior performance in both ophthalmic and systemic disease tasks, with RetiZero exhibiting stronger generalization on external datasets. Regarding fusion strategies, the gating-based approach provides modest improvements in predicting glaucoma, AMD, and hypertension. Despite these advances, predicting systemic diseases, especially hypertension in the external cohort, remains challenging. These findings provide an evidence-based evaluation of ophthalmic FMs, highlight the benefits of model fusion, and point to strategies for enhancing their clinical applicability.
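The abstract does not spell out the gating formulation, so the sketch below shows one common form of gating-based fusion: a softmax gate that produces sample-wise weights over the per-model probabilities. The shapes and the uniform gate are illustrative assumptions, not FusionFM's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_fusion(probs, gate_logits):
    """Fuse per-model disease probabilities with a sample-wise gate.

    probs:       (n_models, n_samples) predicted probabilities
    gate_logits: (n_samples, n_models) scores from a (learned) gating head
    """
    weights = softmax(gate_logits, axis=1)   # each row sums to 1
    return (weights.T * probs).sum(axis=0)   # weighted average per sample

probs = np.array([[0.9, 0.2],    # model A's probabilities for two samples
                  [0.7, 0.4]])   # model B's probabilities
gate_logits = np.zeros((2, 2))   # uniform gate -> plain averaging
fused = gated_fusion(probs, gate_logits)
```

With a trained gating head, the weights would shift toward whichever model is more reliable for a given input, rather than staying uniform.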

FIND-Net -- Fourier-Integrated Network with Dictionary Kernels for Metal Artifact Reduction

Farid Tasharofi, Fuxin Fan, Melika Qahqaie, Mareike Thies, Andreas Maier

arXiv preprint · Aug 14, 2025
Metal artifacts, caused by high-density metallic implants in computed tomography (CT) imaging, severely degrade image quality, complicating diagnosis and treatment planning. While existing deep learning algorithms have achieved notable success in Metal Artifact Reduction (MAR), they often struggle to suppress artifacts while preserving structural details. To address this challenge, we propose FIND-Net (Fourier-Integrated Network with Dictionary Kernels), a novel MAR framework that integrates frequency and spatial domain processing to achieve superior artifact suppression and structural preservation. FIND-Net incorporates Fast Fourier Convolution (FFC) layers and trainable Gaussian filtering, treating MAR as a hybrid task operating in both spatial and frequency domains. This approach enhances global contextual understanding and frequency selectivity, effectively reducing artifacts while maintaining anatomical structures. Experiments on synthetic datasets show that FIND-Net achieves statistically significant improvements over state-of-the-art MAR methods, with a 3.07% MAE reduction, 0.18% SSIM increase, and 0.90% PSNR improvement, confirming robustness across varying artifact complexities. Furthermore, evaluations on real-world clinical CT scans confirm FIND-Net's ability to minimize modifications to clean anatomical regions while effectively suppressing metal-induced distortions. These findings highlight FIND-Net's potential for advancing MAR performance, offering superior structural preservation and improved clinical applicability. Code is available at https://github.com/Farid-Tasharofi/FIND-Net
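The core FFC idea, giving every output pixel a global receptive field by filtering in the Fourier domain, can be illustrated with a plain spectral filter. The uniform weights below are a stand-in for FIND-Net's trainable components; the paper's actual layers combine this with spatial convolutions and Gaussian filtering.

```python
import numpy as np

def spectral_filter(image, weights):
    """Apply per-frequency weights in the Fourier domain.

    This mimics the global receptive field of a Fast Fourier Convolution:
    every output pixel depends on every input pixel via the spectrum.
    """
    spectrum = np.fft.rfft2(image)
    return np.fft.irfft2(spectrum * weights, s=image.shape)

img = np.random.default_rng(1).normal(size=(64, 64))
weights = np.ones((64, 33))   # identity filter; rfft2 of 64x64 is 64x33
out = spectral_filter(img, weights)
```

Replacing the identity weights with learned values lets the network suppress the frequency bands where streak-like metal artifacts concentrate.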

Restorative artificial intelligence-driven implant dentistry for immediate implant placement with an interim crown: A clinical report.

Marques VR, Soh D, Cerqueira G, Orgev A

PubMed · Aug 14, 2025
Immediate implant placement into the extraction socket based on a restoratively driven approach poses challenges that might compromise the delivery of an immediate interim restoration on the day of surgery. The fabricated digital design of the interim restoration may require modification before delivery and may not maintain the planned form to support the gingival architecture for the future prosthetic volume for the emergence profile. This report demonstrates how to utilize artificial intelligence (AI)-assisted segmentation of bone and tooth to enhance restoratively driven planning for immediate implant placement with an immediate interim restoration. A fractured maxillary central incisor was extracted after cone beam computed tomography (CBCT) analysis. AI-assisted segmentation from the digital imaging and communications in medicine (DICOM) file was used to separate the tooth and alveolar bone segmentations for digital implant planning and AI-assisted design of the interim restoration, copied from the natural tooth contour to optimize the emergence profile. Immediate implant placement was completed after minimally traumatic extraction, and the AI-assisted interim restoration was delivered immediately. The AI-assisted workflow enabled predictable implant positioning based on restorative needs, reducing surgical time and optimizing delivery of the interim restoration for improved clinical outcomes. The emergence profile of the anatomic crown copied from the AI workflow for the interim restoration guided soft tissue healing effectively.

Cross-view Generalized Diffusion Model for Sparse-view CT Reconstruction

Jixiang Chen, Yiqun Lin, Yi Qin, Hualiang Wang, Xiaomeng Li

arXiv preprint · Aug 14, 2025
Sparse-view computed tomography (CT) reduces radiation exposure by subsampling projection views, but conventional reconstruction methods produce severe streak artifacts with undersampled data. While deep-learning-based methods enable single-step artifact suppression, they often produce over-smoothed results under significant sparsity. Though diffusion models improve reconstruction via iterative refinement and generative priors, they require hundreds of sampling steps and struggle with stability in highly sparse regimes. To tackle these concerns, we present the Cross-view Generalized Diffusion Model (CvG-Diff), which reformulates sparse-view CT reconstruction as a generalized diffusion process. Unlike existing diffusion approaches that rely on stochastic Gaussian degradation, CvG-Diff explicitly models image-domain artifacts caused by angular subsampling as a deterministic degradation operator, leveraging correlations across sparse-view CT at different sample rates. To address the inherent artifact propagation and inefficiency of sequential sampling in generalized diffusion models, we introduce two innovations: Error-Propagating Composite Training (EPCT), which facilitates identifying error-prone regions and suppresses propagated artifacts, and Semantic-Prioritized Dual-Phase Sampling (SPDPS), an adaptive strategy that prioritizes semantic correctness before detail refinement. Together, these innovations enable CvG-Diff to achieve high-quality reconstructions with minimal iterations, achieving 38.34 dB PSNR and 0.9518 SSIM for 18-view CT using only 10 steps on the AAPM-LDCT dataset. Extensive experiments demonstrate the superiority of CvG-Diff over state-of-the-art sparse-view CT reconstruction methods. The code is available at https://github.com/xmed-lab/CvG-Diff.
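CvG-Diff's degradation operator is defined in the image domain after reconstruction; as a simplified stand-in, the deterministic nature of angular subsampling can be sketched in the sinogram domain, where kept views are preserved exactly and missing views are filled by interpolation. This is an assumed illustration, not the paper's operator.

```python
import numpy as np

def subsample_views(sinogram, keep_every):
    """Deterministic sparse-view degradation: keep every k-th projection
    angle and fill the gaps by linear interpolation along the angle axis."""
    n_views = sinogram.shape[0]
    kept = np.arange(0, n_views, keep_every)
    full = np.arange(n_views)
    # Interpolate each detector column across the angle axis
    return np.stack([np.interp(full, kept, sinogram[kept, d])
                     for d in range(sinogram.shape[1])], axis=1)

sino = np.random.default_rng(2).normal(size=(180, 32))  # 180 full views
degraded = subsample_views(sino, keep_every=10)         # 18-view regime
```

Because the operator is deterministic, the same clean scan always maps to the same degraded one, which is what lets a generalized diffusion model learn the reverse mapping directly.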

Healthcare and cutting-edge technology: Advancements, challenges, and future prospects.

Singhal V, R S, Singhal S, Tiwari A, Mangal D

PubMed · Aug 14, 2025
The high-level integration of technology in health care has radically changed the process of patient care, diagnosis, treatment, and health outcomes. This paper discusses significant technological advances: AI for medical imaging to detect early disease stages; robotic surgery with precision and minimally invasive techniques; telemedicine for remote monitoring and virtual consultation; personalized medicine through genomic analysis; and blockchain for secure and transparent handling of health data. Every section in the paper discusses the underlying principles, advantages, and disadvantages associated with such technologies, supported by appropriate case studies, such as deploying AI in radiology to enhance cancer diagnosis, robotic surgery to enhance surgical accuracy, and blockchain technology in electronic health records to enable data integrity and security. The paper also discusses key ethical issues, including risks to data privacy, algorithmic bias in AI-based diagnosis, patient consent problems in genomic medicine, and regulatory issues blocking the large-scale adoption of digital health solutions. The article also recommends avenues for future research in areas where interdisciplinary cooperation, effective cybersecurity frameworks, and policy transformations are urgently required to ensure that new healthcare technology adoption is ethical and responsible. The work aims to deliver important information to policymakers, researchers, and healthcare practitioners interested in the evolving role of technology in improving healthcare provision and patient outcomes.

AI-based prediction of best-corrected visual acuity in patients with multiple retinal diseases using multimodal medical imaging.

Dong L, Gao W, Niu L, Deng Z, Gong Z, Li HY, Fang LJ, Shao L, Zhang RH, Zhou WD, Ma L, Wei WB

PubMed · Aug 14, 2025
This study evaluated the performance of artificial intelligence (AI) algorithms in predicting best-corrected visual acuity (BCVA) for patients with multiple retinal diseases, using multimodal medical imaging including macular optical coherence tomography (OCT), optic disc OCT and fundus images. The goal was to enhance clinical BCVA evaluation efficiency and precision. A retrospective study used data from 2545 patients (4028 eyes) for training, 896 (1006 eyes) for testing and 196 (200 eyes) for internal validation, with an external prospective dataset of 741 patients (1381 eyes). Single-modality analyses employed different backbone networks and feature fusion methods, while multimodal fusion combined modalities using average aggregation, concatenation/reduction and maximum feature selection. Predictive accuracy was measured by mean absolute error (MAE), root mean squared error (RMSE) and R² score. Macular OCT achieved better single-modality prediction than optic disc OCT, with MAE of 3.851 vs 4.977 and RMSE of 7.844 vs 10.026. Fundus images showed an MAE of 3.795 and RMSE of 7.954. Multimodal fusion significantly improved accuracy, with the best results using average aggregation, achieving an MAE of 2.865, RMSE of 6.229 and R² of 0.935. External validation yielded an MAE of 8.38 and RMSE of 10.62. Multimodal fusion provided the most accurate BCVA predictions, demonstrating AI's potential to improve clinical evaluation. However, challenges remain regarding disease diversity and applicability in resource-limited settings.
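The three reported metrics are standard regression measures, and their computation can be sketched in a few lines. The numbers below are made up for illustration; they are not the study's data.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE and R^2, as used to score BCVA prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    # R^2: 1 minus residual sum of squares over total sum of squares
    r2 = 1.0 - (err ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum()
    return mae, rmse, r2

mae, rmse, r2 = regression_metrics([10, 20, 30, 40], [12, 18, 33, 39])
```

MAE weights all errors equally, RMSE penalizes large outliers more heavily, and R² close to 1 (the study reports 0.935 for the fused model) indicates the predictions explain most of the variance in measured BCVA.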

Exploring the potential of generative artificial intelligence in medical image synthesis: opportunities, challenges, and future directions.

Khosravi B, Purkayastha S, Erickson BJ, Trivedi HM, Gichoya JW

PubMed · Aug 14, 2025
Generative artificial intelligence has emerged as a transformative force in medical imaging since 2022, enabling the creation of derivative synthetic datasets that closely resemble real-world data. This Viewpoint examines key aspects of synthetic data, focusing on its advancements, applications, and challenges in medical imaging. Various generative artificial intelligence image generation paradigms, such as physics-informed and statistical models, and their potential to augment and diversify medical research resources are explored. The promises of synthetic datasets, including increased diversity, privacy preservation, and multifunctionality, are also discussed, along with their ability to model complex biological phenomena. Next, specific applications using synthetic data such as enhancing medical education, augmenting rare disease datasets, improving radiology workflows, and enabling privacy-preserving multicentre collaborations are highlighted. The challenges and ethical considerations surrounding generative artificial intelligence, including patient privacy, data copying, and potential biases that could impede clinical translation, are also addressed. Finally, future directions for research and development in this rapidly evolving field are outlined, emphasising the need for robust evaluation frameworks and responsible utilisation of generative artificial intelligence in medical imaging.

Quest for a clinically relevant medical image segmentation metric: the definition and implementation of Medical Similarity Index

Szuzina Fazekas, Bettina Katalin Budai, Viktor Bérczi, Pál Maurovich-Horvat, Zsolt Vizi

arXiv preprint · Aug 13, 2025
Background: In the field of radiology and radiotherapy, accurate delineation of tissues and organs plays a crucial role in both diagnostics and therapeutics. While the gold standard remains expert-driven manual segmentation, many automatic segmentation methods are emerging. The evaluation of these methods primarily relies on traditional metrics that only incorporate geometrical properties and fail to adapt to various applications. Aims: This study aims to develop and implement a clinically relevant segmentation metric that can be adapted for use in various medical imaging applications. Methods: Bidirectional local distance was defined, and the points of the test contour were paired with points of the reference contour. After correcting for the distance between the test and reference centers of mass, the Euclidean distance was calculated between the paired points, and a score was given to each test point. The overall medical similarity index was calculated as the average score across all the test points. For demonstration, we used myoma and prostate datasets; nnUNet neural networks were trained for segmentation. Results: An easy-to-use, sustainable image processing pipeline was created using Python. The code is available in a public GitHub repository along with Google Colaboratory notebooks. The algorithm can handle multislice images with multiple masks per slice. A mask-splitting algorithm that can separate concave masks is also provided. We demonstrate the adaptability with prostate segmentation evaluation. Conclusions: A novel segmentation evaluation metric was implemented, and an open-access image processing pipeline was also provided, which can be easily used for automatic measurement of the clinical relevance of medical image segmentation.
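The steps described in the Methods can be sketched as follows. The nearest-point pairing, the linear scoring function, and the `tolerance` parameter are illustrative assumptions; the paper defines its pairing via bidirectional local distance and its own scoring, so this is a simplified stand-in rather than the published definition.

```python
import numpy as np

def medical_similarity_index(test_pts, ref_pts, tolerance=5.0):
    """Simplified sketch of a Medical Similarity Index computation.

    1. Correct for the offset between the two centers of mass.
    2. Pair each test point with a reference point (nearest neighbor here).
    3. Score each test point by its paired distance; average the scores.
    """
    test_pts = np.asarray(test_pts, dtype=float)
    ref_pts = np.asarray(ref_pts, dtype=float)
    # Step 1: remove the center-of-mass offset between the contours
    shifted = test_pts - test_pts.mean(axis=0) + ref_pts.mean(axis=0)
    # Step 2: Euclidean distance from each test point to its nearest
    # reference point (the paper uses bidirectional local distance instead)
    d = np.linalg.norm(shifted[:, None, :] - ref_pts[None, :, :], axis=2)
    nearest = d.min(axis=1)
    # Step 3: linearly decaying score in [0, 1], averaged over test points
    scores = np.clip(1.0 - nearest / tolerance, 0.0, 1.0)
    return scores.mean()

square = np.array([[0, 0], [0, 10], [10, 10], [10, 0]], dtype=float)
msi = medical_similarity_index(square + 2.0, square)  # pure translation
```

Because of the center-of-mass correction, a purely translated contour scores perfectly, which matches the intent of separating positional offset from shape disagreement.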

AST-n: A Fast Sampling Approach for Low-Dose CT Reconstruction using Diffusion Models

Tomás de la Sotta, José M. Saavedra, Héctor Henríquez, Violeta Chang, Aline Xavier

arXiv preprint · Aug 13, 2025
Low-dose CT (LDCT) protocols reduce radiation exposure but increase image noise, compromising diagnostic confidence. Diffusion-based generative models have shown promise for LDCT denoising by learning image priors and performing iterative refinement. In this work, we introduce AST-n, an accelerated inference framework that initiates reverse diffusion from intermediate noise levels, and integrate high-order ODE solvers within conditioned models to further reduce sampling steps. We evaluate two acceleration paradigms, AST-n sampling and standard scheduling with high-order solvers, on the Low Dose CT Grand Challenge dataset, covering head, abdominal, and chest scans at 10-25% of the standard dose. Conditioned models using only 25 steps (AST-25) achieve peak signal-to-noise ratio (PSNR) above 38 dB and structural similarity index (SSIM) above 0.95, closely matching standard baselines while cutting inference time from ~16 s to under 1 s per slice. Unconditional sampling suffers substantial quality loss, underscoring the necessity of conditioning. We also assess DDIM inversion, which yields marginal PSNR gains at the cost of doubling inference time, limiting its clinical practicality. Our results demonstrate that AST-n with high-order samplers enables rapid LDCT reconstruction without significant loss of image fidelity, advancing the feasibility of diffusion-based methods in clinical workflows.
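The PSNR figures quoted above follow the standard definition, which is worth having on hand when comparing denoising results. A minimal sketch with toy images (not the challenge data):

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = ((reference - estimate) ** 2).mean()
    return 10.0 * np.log10(data_range ** 2 / mse)

ref = np.zeros((8, 8))
noisy = ref + 0.01       # uniform error of 0.01 on a [0, 1] intensity scale
val = psnr(ref, noisy)   # MSE = 1e-4 on unit range -> 40 dB
```

Note that PSNR depends on the assumed intensity range (`data_range`), so reported values are only comparable when images are normalized the same way.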
