A Modified VGG19-Based Framework for Accurate and Interpretable Real-Time Bone Fracture Detection

Md. Ehsanul Haque, Abrar Fahim, Shamik Dey, Syoda Anamika Jahan, S. M. Jahidul Islam, Sakib Rokoni, Md Sakib Morshed

arXiv preprint, Jul 31 2025
Early and accurate detection of bone fractures is paramount to initiating treatment as early as possible and avoiding delays in patient care. Interpreting X-ray images is a time-consuming and error-prone task, especially where radiology expertise is scarce. Additionally, current deep learning approaches typically suffer from misclassifications and lack the interpretable explanations needed for clinical use. To overcome these challenges, we propose an automated bone fracture detection framework based on a modified VGG-19 model. It incorporates sophisticated preprocessing techniques, including Contrast Limited Adaptive Histogram Equalization (CLAHE), Otsu's thresholding, and Canny edge detection, to enhance image clarity and facilitate feature extraction. For interpretability, we use Grad-CAM, an explainable AI method that generates visual heatmaps of the model's decision-making process, helping clinicians understand its predictions, encouraging trust, and supporting further clinical validation. The framework is deployed in a real-time web application, where healthcare professionals can upload X-ray images and receive diagnostic feedback within 0.5 seconds. Our modified VGG-19 model attains 99.78% classification accuracy and an AUC of 1.00. The framework provides a reliable, fast, and interpretable solution for bone fracture detection, supporting more efficient diagnosis and better patient care.

Applications of artificial intelligence and advanced imaging in pediatric diffuse midline glioma.

Haddadi Avval A, Banerjee S, Zielke J, Kann BH, Mueller S, Rauschecker AM

PubMed, Jul 30 2025
Diffuse midline glioma (DMG) is a rare, aggressive, and fatal tumor that largely occurs in the pediatric population. To improve outcomes, it is important to characterize DMGs, which can be performed via magnetic resonance imaging (MRI) assessment. Recently, artificial intelligence (AI) and advanced imaging have demonstrated their potential to improve the evaluation of various brain tumors, gleaning more information from imaging data than is possible without these methods. This narrative review compiles the existing literature on the intersection of MRI-based AI use and DMG tumors. The applications of AI in DMG revolve around classification and diagnosis, segmentation, radiogenomics, and prognosis/survival prediction. Currently published articles have utilized a wide spectrum of AI algorithms, from traditional machine learning and radiomics to neural networks. Challenges include the lack of cohorts of DMG patients with publicly available, multi-institutional, multimodal imaging and genomics datasets, as well as the overall rarity of the disease. As an adjunct to AI, advanced MRI techniques, including diffusion-weighted imaging, perfusion-weighted imaging, and magnetic resonance spectroscopy (MRS), as well as positron emission tomography (PET), provide additional insights into DMGs. Establishing AI models in conjunction with advanced imaging modalities has the potential to push clinical practice toward precision medicine.

Reference-Guided Diffusion Inpainting For Multimodal Counterfactual Generation

Alexandru Buburuzan

arXiv preprint, Jul 30 2025
Safety-critical applications, such as autonomous driving and medical image analysis, require extensive multimodal data for rigorous testing. Synthetic data methods are gaining prominence due to the cost and complexity of gathering real-world data, but they demand a high degree of realism and controllability to be useful. This work introduces two novel methods for synthetic data generation in autonomous driving and medical image analysis, namely MObI and AnydoorMed, respectively. MObI is a first-of-its-kind framework for Multimodal Object Inpainting that leverages a diffusion model to produce realistic and controllable object inpaintings across perceptual modalities, demonstrated simultaneously for camera and lidar. Given a single reference RGB image, MObI enables seamless object insertion into existing multimodal scenes at a specified 3D location, guided by a bounding box, while maintaining semantic consistency and multimodal coherence. Unlike traditional inpainting methods that rely solely on edit masks, this approach uses 3D bounding box conditioning to ensure accurate spatial positioning and realistic scaling. AnydoorMed extends this paradigm to the medical imaging domain, focusing on reference-guided inpainting for mammography scans. It leverages a diffusion-based model to inpaint anomalies with impressive detail preservation, maintaining the reference anomaly's structural integrity while semantically blending it with the surrounding tissue. Together, these methods demonstrate that foundation models for reference-guided inpainting in natural images can be readily adapted to diverse perceptual modalities, paving the way for the next generation of systems capable of constructing highly realistic, controllable and multimodal counterfactual scenarios.

Deep Learning for the Diagnosis and Treatment of Thyroid Cancer: A Review.

Gao R, Mai S, Wang S, Hu W, Chang Z, Wu G, Guan H

PubMed, Jul 30 2025
In recent years, the application of deep learning (DL) technology in the thyroid field has shown exponential growth, greatly promoting innovation in thyroid disease research. As the most common malignant tumor of the endocrine system, thyroid cancer has made precise diagnosis and treatment a key focus of clinical research. This article systematically reviews the latest research progress in DL for the diagnosis and treatment of thyroid malignancies, focusing on the breakthrough application of advanced models such as convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and generative adversarial networks (GANs) in key areas such as ultrasound image analysis for thyroid nodules, automatic classification of pathological images, and assessment of extrathyroidal extension. Furthermore, the review highlights the great potential of DL techniques in the development of individualized treatment planning and prognosis prediction. In addition, it analyzes the technical bottlenecks and clinical challenges faced by current DL applications in thyroid cancer diagnosis and treatment and looks ahead to future directions for development. The aim of this review is to provide the latest research insights for clinical practitioners, promote further improvements in the precision diagnosis and treatment system for thyroid cancer, and ultimately achieve better diagnostic and therapeutic outcomes for thyroid cancer patients.

VidFuncta: Towards Generalizable Neural Representations for Ultrasound Videos

Julia Wolleb, Florentin Bieder, Paul Friedrich, Hemant D. Tagare, Xenophon Papademetris

arXiv preprint, Jul 29 2025
Ultrasound is widely used in clinical care, yet standard deep learning methods often struggle with full video analysis due to non-standardized acquisition and operator bias. We offer a new perspective on ultrasound video analysis through implicit neural representations (INRs). We build on Functa, an INR framework in which each image is represented by a modulation vector that conditions a shared neural network. However, its extension to the temporal domain of medical videos remains unexplored. To address this gap, we propose VidFuncta, a novel framework that leverages Functa to encode variable-length ultrasound videos into compact, time-resolved representations. VidFuncta disentangles each video into a static video-specific vector and a sequence of time-dependent modulation vectors, capturing both temporal dynamics and dataset-level redundancies. Our method outperforms 2D and 3D baselines on video reconstruction and enables downstream tasks to directly operate on the learned 1D modulation vectors. We validate VidFuncta on three public ultrasound video datasets -- cardiac, lung, and breast -- and evaluate its downstream performance on ejection fraction prediction, B-line detection, and breast lesion classification. These results highlight the potential of VidFuncta as a generalizable and efficient representation framework for ultrasound videos. Our code is publicly available under https://github.com/JuliaWolleb/VidFuncta_public.
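To make the Functa-style conditioning concrete, here is a toy 1-D sketch (all names, sizes, and weights are illustrative, not from the paper) in which a single shared network is evaluated at a coordinate while a per-video modulation vector shifts its hidden activations:

```python
import numpy as np

rng = np.random.default_rng(1)
HIDDEN, COORD_DIM = 32, 1  # toy sizes; real Functa-style INRs use SIRENs

# Shared network weights (trained across the whole dataset in practice).
W1 = rng.normal(0, 0.5, (HIDDEN, COORD_DIM))
W2 = rng.normal(0, 0.5, (1, HIDDEN))

def inr_forward(t, modulation):
    """Evaluate the shared INR at coordinate t, conditioned on a
    per-video modulation vector that shifts the hidden activations."""
    h = np.sin(W1 @ np.atleast_1d(t) + modulation)  # shift modulation
    return float(W2 @ h)

# Each video is represented only by its modulation vector(s); the
# shared weights W1, W2 stay fixed across videos.
mod_video_a = rng.normal(0, 1, HIDDEN)
mod_video_b = rng.normal(0, 1, HIDDEN)
signal_a = [inr_forward(t, mod_video_a) for t in np.linspace(0, 1, 5)]
signal_b = [inr_forward(t, mod_video_b) for t in np.linspace(0, 1, 5)]
```

The compact modulation vectors, not the raw frames, are what downstream tasks would then operate on.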

Distribution-Based Masked Medical Vision-Language Model Using Structured Reports

Shreyank N Gowda, Ruichi Zhang, Xiao Gu, Ying Weng, Lu Yang

arXiv preprint, Jul 29 2025
Medical image-language pre-training aims to align medical images with clinically relevant text to improve model performance on various downstream tasks. However, existing models often struggle with the variability and ambiguity inherent in medical data, limiting their ability to capture nuanced clinical information and uncertainty. This work introduces an uncertainty-aware medical image-text pre-training model that enhances generalization capabilities in medical image analysis. Building on previous methods and focusing on chest X-rays, our approach utilizes structured text reports generated by a large language model (LLM) to augment image data with clinically relevant context. These reports begin with a definition of the disease, followed by an "appearance" section highlighting critical regions of interest, and finally "observations" and "verdicts" that ground model predictions in clinical semantics. By modeling both inter- and intra-modal uncertainty, our framework captures the inherent ambiguity in medical images and text, yielding improved representations and performance on downstream tasks. Our model demonstrates significant advances in medical image-text pre-training, obtaining state-of-the-art performance on multiple downstream tasks.
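A minimal sketch of how such a structured report might be assembled: the section order (definition, appearance, observations, verdict) follows the abstract, while the bracketed formatting and the example content are our own assumptions, not the paper's actual template.

```python
def build_structured_report(disease, definition, appearance, observations, verdict):
    """Assemble an LLM-generated structured report in the section
    order the abstract describes: definition, appearance,
    observations, verdict."""
    return (
        f"[Disease] {disease}\n"
        f"[Definition] {definition}\n"
        f"[Appearance] {appearance}\n"
        f"[Observations] {observations}\n"
        f"[Verdict] {verdict}"
    )

# Hypothetical example for a chest X-ray finding.
report = build_structured_report(
    disease="Cardiomegaly",
    definition="Enlargement of the cardiac silhouette.",
    appearance="Cardiothoracic ratio greater than 0.5 on a PA chest X-ray.",
    observations="The heart border extends beyond the mid-clavicular line.",
    verdict="Findings consistent with cardiomegaly.",
)
```

Pairing each image with text of this fixed shape gives the alignment objective consistent, clinically grounded sections to attend to.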

Time-series X-ray image prediction of dental skeleton treatment progress via neural networks.

Kwon SW, Moon JK, Song SC, Cha JY, Kim YW, Choi YJ, Lee JS

PubMed, Jul 29 2025
Accurate prediction of skeletal changes during orthodontic treatment in growing patients remains challenging due to significant individual variability in craniofacial growth and treatment responses. Conventional methods, such as support vector regression and multilayer perceptrons, require multiple sequential radiographs to achieve acceptable accuracy. However, they are limited by increased radiation exposure, susceptibility to landmark identification errors, and the lack of visually interpretable predictions. To overcome these limitations, this study explored advanced generative approaches, including denoising diffusion probabilistic models (DDPMs), latent diffusion models (LDMs), and ControlNet, to predict future cephalometric radiographs using minimal input data. We evaluated three diffusion-based models: a DDPM utilizing three sequential cephalometric images (3-input DDPM), a single-image DDPM (1-input DDPM), and a single-image LDM; as well as ControlNet, a vision-based generative model conditioned on patient-specific attributes such as age, sex, and orthodontic treatment type. Quantitative evaluations demonstrated that the 3-input DDPM achieved the highest numerical accuracy, whereas the single-image LDM delivered comparable predictive performance with significantly reduced clinical requirements. ControlNet also exhibited competitive accuracy, highlighting its potential effectiveness in clinical scenarios. These findings indicate that the single-image LDM and ControlNet offer practical solutions for personalized orthodontic treatment planning, reducing patient visits and radiation exposure while maintaining robust predictive accuracy.

Neural Autoregressive Modeling of Brain Aging

Ridvan Yesiloglu, Wei Peng, Md Tauhidul Islam, Ehsan Adeli

arXiv preprint, Jul 29 2025
Brain aging synthesis is a critical task with broad applications in clinical and computational neuroscience. The ability to predict the future structural evolution of a subject's brain from an earlier MRI scan provides valuable insights into aging trajectories. Yet the high dimensionality of the data, subtle structural changes across ages, and subject-specific patterns pose challenges for synthesizing the aging brain. To overcome these challenges, we propose NeuroAR, a novel brain aging simulation model based on generative autoregressive transformers. NeuroAR synthesizes the aging brain by autoregressively estimating the discrete token maps of a future scan from a space of concatenated token embeddings of the previous and future scans. To guide the generation, it concatenates the subject's previous scan into each scale and injects that scan's acquisition age and the target age at each block via cross-attention. We evaluate our approach on both elderly and adolescent subjects, demonstrating superior image fidelity over state-of-the-art generative models, including latent diffusion models (LDMs) and generative adversarial networks. Furthermore, we employ a pre-trained age predictor to further validate the consistency and realism of the synthesized images with respect to expected aging patterns. NeuroAR significantly outperforms key models, including the LDM, demonstrating its ability to model subject-specific brain aging trajectories with high fidelity.

Evaluation of GPT-4o for multilingual translation of radiology reports across imaging modalities.

Terzis R, Salam B, Nowak S, Mueller PT, Mesropyan N, Oberlinkels L, Efferoth AF, Kravchenko D, Voigt M, Ginzburg D, Pieper CC, Hayawi M, Kuetting D, Afat S, Maintz D, Luetkens JA, Kaya K, Isaak A

PubMed, Jul 29 2025
Large language models (LLMs) like GPT-4o offer multilingual and real-time translation capabilities. This study aims to evaluate GPT-4o's effectiveness in translating radiology reports into different languages. In this experimental two-center study, 100 real-world radiology reports from four imaging modalities (X-ray, ultrasound, CT, MRI) were randomly selected and fully anonymized. Reports were translated using GPT-4o with zero-shot prompting from German into four languages: English, French, Spanish, and Russian (n = 400 translations). Eight bilingual radiologists (two per language) evaluated the translations for general readability, overall quality, and utility for translators using 5-point Likert scales (ranging from 5 [best score] to 1 [worst score]). Binary (yes/no) questions were used to evaluate potential harmful errors, completeness, and factual correctness. The average processing time of GPT-4o for translating reports ranged from 9 to 24 s. The overall quality of translations achieved a median of 4.5 (IQR 4-5), with English (5 [4-5]), French, and Spanish (each 4.5 [4-5]) significantly outperforming Russian (4 [3.5-4]; each p < 0.05). Usefulness for translators was rated highest for English (5 [5-5], p < 0.05 against other languages). Readability scores and translation completeness were significantly higher for translations into Spanish, English, and French compared to Russian (each p < 0.05). Factual correctness averaged 79%, with English (84%) and French (83%) outperforming Russian (69%) (each p < 0.05). Potentially harmful errors were identified in 4% of translations, primarily in Russian (9%). GPT-4o demonstrated robust performance in translating radiology reports across multiple languages, with limitations observed in Russian translations.
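The "median (IQR)" summaries used in this kind of Likert-scale evaluation can be reproduced with a few lines of NumPy; the ratings below are hypothetical, chosen only to mirror the English-versus-Russian gap described above, not the study's data:

```python
import numpy as np

def summarize_likert(scores):
    """Median and interquartile range for 5-point Likert ratings,
    matching the 'median (IQR)' reporting style."""
    a = np.asarray(scores, dtype=float)
    q1, med, q3 = np.percentile(a, [25, 50, 75])
    return med, (q1, q3)

# Hypothetical ratings from two raters over a handful of reports.
english_quality = [5, 5, 4, 5, 5, 4, 5, 5]
russian_quality = [4, 3, 4, 4, 3, 4, 4, 4]
med_en, iqr_en = summarize_likert(english_quality)
med_ru, iqr_ru = summarize_likert(russian_quality)
```

Medians with IQRs are the appropriate summary here because Likert ratings are ordinal, so means would overstate the precision of the scale.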

A radiomics-based interpretable model integrating delayed-phase CT and clinical features for predicting the pathological grade of appendiceal pseudomyxoma peritonei.

Bai D, Shi G, Liang Y, Li F, Zheng Z, Wang Z

PubMed, Jul 28 2025
This study aimed to develop an interpretable machine learning model integrating delayed-phase contrast-enhanced CT radiomics with clinical features for noninvasive prediction of pathological grading in appendiceal pseudomyxoma peritonei (PMP), using Shapley Additive Explanations (SHAP) for model interpretation. This retrospective study analyzed 158 pathologically confirmed PMP cases (85 low-grade, 73 high-grade) from January 4, 2015 to April 30, 2024. Comprehensive clinical data including demographic characteristics, serum tumor markers (CEA, CA19-9, CA125, D-dimer, CA-724, CA-242), and the CT-peritoneal cancer index (CT-PCI) were collected. Radiomics features were extracted from preoperative contrast-enhanced CT scans using standardized protocols. After rigorous feature selection and five-fold cross-validation, we developed three predictive models: clinical-only, radiomics-only, and a combined clinical-radiomics model using logistic regression. Model performance was evaluated through ROC analysis (AUC), the Delong test, decision curve analysis (DCA), and the Brier score, with SHAP values providing interpretability. The combined model demonstrated superior performance, achieving AUCs of 0.91 (95% CI: 0.86-0.95) and 0.88 (95% CI: 0.82-0.93) in the training and testing sets respectively, significantly outperforming the standalone models (P < 0.05). DCA confirmed greater clinical utility across most threshold probabilities, with favorable Brier scores (training: 0.124; testing: 0.142) indicating excellent calibration. SHAP analysis identified the top predictive features: wavelet-LHH_glcm_InverseVariance (radiomics), original_shape_Elongation (radiomics), and CA19-9 (clinical). Our SHAP-interpretable combined model provides an accurate, noninvasive tool for PMP grading, facilitating personalized treatment decisions. The integration of radiomics and clinical data demonstrates superior predictive performance compared to conventional approaches, with potential to improve patient outcomes.
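For a linear model such as logistic regression, SHAP values have a simple closed form, which helps explain how feature rankings like those above arise; this sketch uses synthetic data and illustrative coefficients (assuming independent features), not the study's model:

```python
import numpy as np

def linear_shap(coef, intercept, X, x):
    """Per-feature SHAP values for a linear (logistic-regression)
    model: with independent features they reduce to
    coef_j * (x_j - mean_j), and they sum with the base value to the
    model's logit at x."""
    mu = X.mean(axis=0)
    contributions = coef * (x - mu)
    base_value = intercept + coef @ mu  # expected logit over X
    return contributions, base_value

rng = np.random.default_rng(42)
X = rng.normal(0, 1, (100, 3))          # e.g. two radiomics features + a marker
coef = np.array([1.5, -0.8, 0.3])       # illustrative fitted coefficients
intercept = -0.2
x = X[0]
phi, base = linear_shap(coef, intercept, X, x)
logit = intercept + coef @ x
# Additivity: base + sum(phi) reconstructs the logit at x exactly.
```

Ranking features by the mean absolute value of these contributions over a cohort is what produces SHAP importance orderings of the kind reported in the study.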