
Efficacy of Image Similarity as a Metric for Augmenting Small Dataset Retinal Image Segmentation

Thomas Wallace, Ik Siong Heng, Senad Subasic, Chris Messenger

arXiv preprint · Jul 7, 2025
Synthetic images are an option for augmenting limited medical imaging datasets to improve the performance of various machine learning models. A common metric for evaluating synthetic image quality is the Fréchet Inception Distance (FID), which measures the similarity of two image datasets. In this study we evaluate the relationship between this metric and the improvement which synthetic images, generated by a Progressively Growing Generative Adversarial Network (PGGAN), grant when augmenting Diabetes-related Macular Edema (DME) intraretinal fluid segmentation performed by a U-Net model with limited amounts of training data. We find that the behaviour of augmenting with standard and synthetic images agrees with previously conducted experiments. Additionally, we show that dissimilar (high-FID) datasets do not improve segmentation significantly. As the FID between the training and augmenting datasets decreases, the augmentation datasets are shown to contribute significant and robust improvements in image segmentation. Finally, we find significant evidence that synthetic and standard augmentations follow separate log-normal trends between FID and improvement in model performance, with synthetic data proving more effective than standard augmentation techniques. Our findings show that more similar datasets (lower FID) are more effective at improving U-Net performance; however, the results also suggest that this improvement may only occur when the images are sufficiently dissimilar.
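
For readers unfamiliar with the metric: FID compares the Gaussian statistics (mean and covariance) of Inception-v3 features extracted from the two image sets. A minimal sketch of the standard formula, assuming the 2048-dimensional pool-layer features have already been computed (this is not the authors' code):

```python
import numpy as np
from scipy import linalg

def fid(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Frechet Inception Distance between two feature sets of shape
    (n_images, feature_dim), e.g. Inception-v3 pool-layer activations."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Matrix square root of the covariance product; numerical error can
    # introduce tiny imaginary parts, so keep only the real component.
    covmean = linalg.sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))
```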

SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model

Chun Xie, Yuichi Yoshii, Itaru Kitahara

arXiv preprint · Jul 7, 2025
X-ray imaging is a rapid and cost-effective tool for visualizing internal human anatomy. While multi-view X-ray imaging provides complementary information that enhances diagnosis, intervention, and education, acquiring images from multiple angles increases radiation exposure and complicates clinical workflows. To address these challenges, we propose a novel view-conditioned diffusion model for synthesizing multi-view X-ray images from a single view. Unlike prior methods, which are limited in angular range, resolution, and image quality, our approach leverages the Diffusion Transformer to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation. Experimental results demonstrate that our method generates higher-resolution outputs with improved control over viewing angles. This capability has significant implications not only for clinical applications but also for medical education and data extension, enabling the creation of diverse, high-quality datasets for training and analysis. Our code is available on GitHub.
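
The abstract does not detail the conditioning mechanism, so the sketch below shows only a generic pattern for view-conditioned diffusion transformers: embed the target viewing angle and add it to the timestep embedding that modulates each block. Module and dimension names are hypothetical, not SV-DRR's implementation:

```python
import torch
import torch.nn as nn

class ViewConditioning(nn.Module):
    """Generic view conditioning for a diffusion transformer: the target
    viewing angle is embedded and fused with the timestep embedding."""

    def __init__(self, hidden_dim: int = 768):
        super().__init__()
        # Map (azimuth, elevation) in radians into the conditioning space.
        self.view_mlp = nn.Sequential(
            nn.Linear(2, hidden_dim),
            nn.SiLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, t_emb: torch.Tensor, view: torch.Tensor) -> torch.Tensor:
        # t_emb: (batch, hidden_dim) timestep embedding
        # view:  (batch, 2) target viewing angles
        return t_emb + self.view_mlp(view)
```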

Sequential Attention-based Sampling for Histopathological Analysis

Tarun G, Naman Malpani, Gugan Thoppe, Sridharan Devarajan

arXiv preprint · Jul 7, 2025
Deep neural networks are increasingly applied for automated histopathology. Yet, whole-slide images (WSIs) are often acquired at gigapixel sizes, rendering it computationally infeasible to analyze them entirely at high resolution. Diagnostic labels are largely available only at the slide level, because expert annotation of images at a finer (patch) level is both laborious and expensive. Moreover, regions with diagnostic information typically occupy only a small fraction of the WSI, making it inefficient to examine the entire slide at full resolution. Here, we propose SASHA -- Sequential Attention-based Sampling for Histopathological Analysis -- a deep reinforcement learning approach for efficient analysis of histopathological images. First, SASHA learns informative features with a lightweight hierarchical, attention-based multiple instance learning (MIL) model. Second, SASHA samples intelligently and zooms selectively into a small fraction (10-20%) of high-resolution patches to achieve reliable diagnosis. We show that SASHA matches state-of-the-art methods that analyze the WSI fully at high resolution, albeit at a fraction of their computational and memory costs. In addition, it significantly outperforms competing sparse sampling methods. We propose SASHA as an intelligent sampling model for medical imaging challenges that involve automated diagnosis with exceptionally large images containing sparsely informative features.
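
The attention-based MIL model in the first step belongs to the family of gated-attention pooling; a generic sketch (not the authors' code) shows how per-patch attention scores can both drive slide-level classification and rank patches for selective high-resolution zooming:

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Gated attention-based MIL pooling over patch features from one slide
    (a generic sketch in the spirit of lightweight MIL models)."""

    def __init__(self, feat_dim: int = 512, attn_dim: int = 128, n_classes: int = 2):
        super().__init__()
        self.V = nn.Linear(feat_dim, attn_dim)   # tanh branch
        self.U = nn.Linear(feat_dim, attn_dim)   # sigmoid gate
        self.w = nn.Linear(attn_dim, 1)          # per-patch attention score
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, patches: torch.Tensor):
        # patches: (n_patches, feat_dim) features extracted from one slide
        scores = self.w(torch.tanh(self.V(patches)) * torch.sigmoid(self.U(patches)))
        attn = torch.softmax(scores, dim=0)        # (n_patches, 1)
        slide_feat = (attn * patches).sum(dim=0)   # attention-weighted slide embedding
        # attn can also rank patches, e.g. to pick which ones to zoom into.
        return self.classifier(slide_feat), attn
```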

HGNet: High-Order Spatial Awareness Hypergraph and Multi-Scale Context Attention Network for Colorectal Polyp Detection

Xiaofang Liu, Lingling Sun, Xuqing Zhang, Yuannong Ye, Bin Zhao

arXiv preprint · Jul 7, 2025
Colorectal cancer (CRC) is closely linked to the malignant transformation of colorectal polyps, making early detection essential. However, current models struggle with detecting small lesions, accurately localizing boundaries, and providing interpretable decisions. To address these issues, we propose HGNet, which integrates High-Order Spatial Awareness Hypergraph and Multi-Scale Context Attention. Key innovations include: (1) an Efficient Multi-Scale Context Attention (EMCA) module to enhance lesion feature representation and boundary modeling; (2) the deployment of a spatial hypergraph convolution module before the detection head to capture higher-order spatial relationships between nodes; (3) the application of transfer learning to address the scarcity of medical image data; and (4) Eigen Class Activation Map (Eigen-CAM) for decision visualization. Experimental results show that HGNet achieves 94% accuracy, 90.6% recall, and 90% mAP@0.5, significantly improving small lesion differentiation and clinical interpretability. The source code will be made publicly available upon publication of this paper.
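
A spatial hypergraph convolution, in its standard spectral form, propagates node features through the incidence matrix H of the hypergraph: X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Θ. A sketch of that textbook layer (HGNet's exact module may differ):

```python
import torch

def hypergraph_conv(X, H, Theta, edge_w=None):
    """One spectral hypergraph convolution step (textbook formulation,
    shown to illustrate what a spatial hypergraph module computes).

    X:     (n_nodes, in_dim) node features
    H:     (n_nodes, n_edges) incidence matrix (1 if node is in hyperedge)
    Theta: (in_dim, out_dim) learnable weight matrix
    """
    n_nodes, n_edges = H.shape
    w = edge_w if edge_w is not None else torch.ones(n_edges)
    W = torch.diag(w)
    # Node degrees d(v) = sum_e w(e) h(v, e); edge degrees d(e) = sum_v h(v, e).
    Dv = torch.diag((H @ w).clamp(min=1e-6).pow(-0.5))
    De = torch.diag(H.sum(dim=0).clamp(min=1e-6).pow(-1.0))
    return Dv @ H @ W @ De @ H.T @ Dv @ X @ Theta
```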

Geometric-Guided Few-Shot Dental Landmark Detection with Human-Centric Foundation Model

Anbang Wang, Marawan Elbatel, Keyuan Liu, Lizhuo Lin, Meng Lan, Yanqi Yang, Xiaomeng Li

arXiv preprint · Jul 7, 2025
Accurate detection of anatomic landmarks is essential for assessing alveolar bone and root conditions, thereby optimizing clinical outcomes in orthodontics, periodontics, and implant dentistry. Manual annotation of landmarks on cone-beam computed tomography (CBCT) by dentists is time-consuming, labor-intensive, and subject to inter-observer variability. Deep learning-based automated methods present a promising approach to streamlining this process. However, the scarcity of training data and the high cost of expert annotations hinder the adoption of conventional deep learning techniques. To overcome these challenges, we introduce GeoSapiens, a novel few-shot learning framework designed for robust dental landmark detection using limited annotated CBCT scans of anterior teeth. Our GeoSapiens framework comprises two key components: (1) a robust baseline adapted from Sapiens, a foundation model that has achieved state-of-the-art performance in human-centric vision tasks, and (2) a novel geometric loss function that improves the model's capacity to capture critical geometric relationships among anatomical structures. Experiments conducted on our collected dataset of anterior-teeth landmarks revealed that GeoSapiens surpassed existing landmark detection methods, outperforming the leading approach with an 8.18% higher success detection rate at a strict 0.5 mm threshold, a standard widely recognized in dental diagnostics. Code is available at: https://github.com/xmed-lab/GeoSapiens.
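
The geometric loss is not spelled out in the abstract; one plausible instantiation, shown below as a hypothetical sketch rather than the published loss, penalizes discrepancies between predicted and ground-truth pairwise inter-landmark distances, directly encoding geometric relationships among structures:

```python
import torch

def geometric_loss(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    """Penalize deviations in pairwise inter-landmark distances so the
    predicted configuration preserves anatomical geometry (illustrative
    guess at the idea; not GeoSapiens' published loss).

    pred, gt: (n_landmarks, 3) coordinates in mm.
    """
    d_pred = torch.cdist(pred, pred)  # (n, n) pairwise distance matrices
    d_gt = torch.cdist(gt, gt)
    return (d_pred - d_gt).abs().mean()
```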

Leveraging Large Language Models for Accurate AO Fracture Classification from CT Text Reports.

Mergen M, Spitzl D, Ketzer C, Strenzke M, Marka AW, Makowski MR, Bressem KK, Adams LC, Gassert FT

PubMed paper · Jul 7, 2025
Large language models (LLMs) have shown promising potential in analyzing complex textual data, including radiological reports. These models can assist clinicians, particularly those with limited experience, by integrating and presenting diagnostic criteria within radiological classifications. However, before clinical adoption, LLMs must be rigorously validated by medical professionals to ensure accuracy, especially in the context of advanced radiological classification systems. This study evaluates the performance of four LLMs -- ChatGPT-4o, AmbossGPT, Claude 3.5 Sonnet, and Gemini 2.0 Flash -- in classifying fractures based on the AO classification system using CT reports. A dataset of 292 fictitious physician-generated CT reports, representing 310 fractures, was used to retrospectively assess the accuracy of each LLM in AO fracture classification. Performance was evaluated by comparing the models' classifications to ground-truth labels, with accuracy rates analyzed across different fracture types and subtypes. ChatGPT-4o and AmbossGPT achieved the highest overall accuracy (74.6% and 74.3%, respectively), outperforming Claude 3.5 Sonnet (69.5%) and Gemini 2.0 Flash (62.7%). Statistically significant differences were observed in fracture type classification, particularly between ChatGPT-4o and Gemini 2.0 Flash (Δ12%, p < 0.001). While all models demonstrated strong bone recognition rates (90-99%), their accuracy in fracture subtype classification remained lower (71-77%), indicating limitations in nuanced diagnostic categorization. LLMs show potential in assisting radiologists with initial fracture classification, particularly in high-volume or resource-limited settings. However, their performance remains inconsistent for detailed subtype classification, highlighting the need for further refinement and validation before clinical integration in advanced diagnostic workflows.
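
Methodologically, the headline numbers reduce to scoring each model's predicted AO codes against ground-truth labels. A generic evaluation sketch (variable names hypothetical; not the study's analysis code):

```python
def accuracy_by_model(predictions: dict[str, list[str]], truth: list[str]) -> dict[str, float]:
    """Fraction of fractures for which each LLM's AO code matches ground truth."""
    return {
        model: sum(p == t for p, t in zip(preds, truth)) / len(truth)
        for model, preds in predictions.items()
    }

# Hypothetical usage with per-fracture AO codes:
# acc = accuracy_by_model({"ChatGPT-4o": gpt_codes, "Gemini 2.0 Flash": gemini_codes}, gt_codes)
```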

Gender difference in cross-sectional area and fat infiltration of thigh muscles in the elderly population on MRI: an AI-based analysis.

Bizzozero S, Bassani T, Sconfienza LM, Messina C, Bonato M, Inzaghi C, Marmondi F, Cinque P, Banfi G, Borghi S

PubMed paper · Jul 7, 2025
Aging alters musculoskeletal structure and function, affecting muscle mass, composition, and strength and increasing the risk of falls and loss of independence in older adults. This study assessed the cross-sectional area (CSA) and fat infiltration (FI) of six thigh muscle groups using a validated deep learning model; gender differences and correlations between fat, muscle parameters, and age were also analyzed. We retrospectively analyzed 141 participants (67 females, 74 males) aged 52-82 years. Participants underwent magnetic resonance imaging (MRI) of the right thigh and dual-energy x-ray absorptiometry to determine the appendicular skeletal muscle mass index (ASMMI) and body fat percentage (FAT%). A deep learning-based application was developed to automate the segmentation of the six thigh muscle groups. Model accuracy was evaluated using the intersection-over-union (IoU) metric, with average IoU values across muscle groups ranging from 0.84 to 0.99. Mean CSA was 10,766.9 mm² (females 8,892.6 mm², males 12,463.9 mm², p < 0.001); mean FI was 14.92% (females 17.42%, males 12.62%, p < 0.001). Males showed larger CSA and lower FI in all thigh muscles compared to females. In females, positive correlations were identified between the FI of the posterior thigh muscle groups (biceps femoris, semimembranosus, and semitendinosus) and age (r or ρ = 0.35-0.48; p ≤ 0.004), while no significant correlations were observed between CSA, ASMMI, or FAT% and age. Deep learning accurately quantifies muscle CSA and FI, reducing analysis time and human error. Aging affects muscle composition and distribution, and gender-specific assessments in older adults are needed. Efficient deep learning-based segmentation of MRI scans in individuals over 50 revealed gender differences in thigh muscle CSA and FI; these findings have potential clinical applications in assessing muscle quality, decline, and frailty.
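
The IoU metric used to validate the segmentation model is simply intersection over union of binary masks; a minimal sketch:

```python
import numpy as np

def iou(mask_pred: np.ndarray, mask_gt: np.ndarray) -> float:
    """Intersection over union between two binary muscle masks."""
    inter = np.logical_and(mask_pred, mask_gt).sum()
    union = np.logical_or(mask_pred, mask_gt).sum()
    # Two empty masks agree perfectly by convention.
    return float(inter / union) if union else 1.0
```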

2-D Stationary Wavelet Transform and 2-D Dual-Tree DWT for MRI Denoising.

Talbi M, Nasraoui B, Alfaidi A

PubMed paper · Jul 7, 2025
Noise can be introduced into a digital image during acquisition, transmission, and processing, so it must be removed before further analysis. This study denoises noisy images, including Magnetic Resonance Images (MRIs), with a proposed approach based on the 2-D Stationary Wavelet Transform (SWT 2-D) and the 2-D Dual-Tree Discrete Wavelet Transform (DWT). The first step applies the 2-D Dual-Tree DWT to the noisy image to obtain noisy wavelet coefficients. The second step denoises each of these coefficients with an SWT 2-D based denoising technique. The denoised image is finally obtained by applying the inverse 2-D Dual-Tree DWT to the denoised coefficients. The proposed approach is evaluated against four denoising techniques from the literature: thresholding in the SWT 2-D domain, denoising with a deep neural network, soft thresholding in the 2-D Dual-Tree DWT domain, and the Non-Local Means filter. All five methods are applied to a set of noisy grayscale images and noisy MRIs, and results are reported in terms of PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity), NMSE (Normalized Mean Square Error), and FSIM (Feature Similarity). The proposed approach outperforms the four comparison techniques, achieving the highest PSNR, SSIM, and FSIM values and the lowest NMSE values. At noise levels σ = 10 or σ = 20 it removes the noise while introducing only slight distortions in image detail; at σ = 30 or σ = 40 it removes most of the noise but introduces some distortion of the original images.
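
The SWT-domain soft-thresholding step at the core of the approach can be sketched with PyWavelets. The sketch below denoises a single image directly in the 2-D SWT domain (the full method instead applies this style of thresholding to 2-D dual-tree DWT coefficients); it assumes image dimensions divisible by 2^level:

```python
import numpy as np
import pywt

def swt2_soft_denoise(img: np.ndarray, wavelet: str = "db4", level: int = 2) -> np.ndarray:
    """Soft-thresholding denoiser in the 2-D SWT domain (simplified sketch
    of the building block, not the paper's full dual-tree pipeline)."""
    coeffs = pywt.swt2(img, wavelet, level=level)
    # Noise estimate from the finest diagonal detail band (median absolute
    # deviation), then the universal threshold sigma * sqrt(2 log N).
    sigma = np.median(np.abs(coeffs[-1][1][-1])) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(img.size))
    denoised = [
        (cA, tuple(pywt.threshold(d, thr, mode="soft") for d in details))
        for cA, details in coeffs
    ]
    return pywt.iswt2(denoised, wavelet)
```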

Multilayer perceptron deep learning radiomics model based on Gd-BOPTA MRI to identify vessels encapsulating tumor clusters in hepatocellular carcinoma: a multi-center study.

Gu M, Zou W, Chen H, He R, Zhao X, Jia N, Liu W, Wang P

PubMed paper · Jul 7, 2025
The purpose of this study is to develop a predictive model based on clinicoradiological and radiomics features from preoperative gadobenate-enhanced (Gd-BOPTA) magnetic resonance imaging (MRI), using a multilayer perceptron (MLP) deep learning algorithm, to predict vessels encapsulating tumor clusters (VETC) in hepatocellular carcinoma (HCC) patients. A total of 230 patients with histopathologically confirmed HCC who underwent preoperative Gd-BOPTA MRI before hepatectomy were retrospectively enrolled from three hospitals (144, 54, and 32 in the training, test, and validation sets, respectively). Univariate and multivariate logistic regression analyses identified independent clinicoradiological predictors significantly associated with VETC, which constituted the clinicoradiological model. Four regions of interest (ROIs) were defined: intratumoral (Tumor), peritumoral area ≤ 2 mm (Peri2mm), intratumoral plus peritumoral area ≤ 2 mm as separate regions (Tumor + Peri2mm), and intratumoral and peritumoral ≤ 2 mm as a single region (TumorPeri2mm). A total of 7322 radiomics features were extracted for each of ROI(Tumor), ROI(Peri2mm), and ROI(TumorPeri2mm), and 14,644 for ROI(Tumor + Peri2mm). Least absolute shrinkage and selection operator (LASSO) and univariate logistic regression analysis were used to select the most informative features. Seven machine learning classifiers were each combined with the radiomics signatures selected from the four ROIs, their performance was compared across the three sets, and the best combination was selected as the radiomics model. A radiomics score (rad-score) was then generated and combined with the significant clinicoradiological predictors through multivariate logistic regression to form the fusion model. After comparing the three models using the area under the receiver operating characteristic curve (AUC), integrated discrimination index (IDI), and net reclassification index (NRI), the optimal model for VETC prediction was chosen. Arterial peritumoral enhancement and peritumoral hypointensity on the hepatobiliary phase (HBP) were independent risk factors for VETC and constituted the radiology model; no clinical variables were retained. Arterial peritumoral enhancement was defined as enhancement outside the tumor boundary in the late arterial or early portal venous phase, in broad contact with the tumor edge, that becomes isointense during the delayed phase. The MLP deep learning algorithm with radiomics features selected from ROI(TumorPeri2mm) was the best combination and constituted the radiomics model (MLP model). An MLP score (MLP_score) was then calculated and combined with the two radiology features to form the fusion model (Radiology MLP model), with AUCs of 0.871, 0.894, and 0.918 in the training, test, and validation sets. Compared with the two models above, the Radiology MLP model demonstrated a 33.4%-131.3% improvement in NRI and a 9.3%-50% improvement in IDI, showing better discrimination, calibration, and clinical usefulness across the three sets, and was selected as the optimal predictive model. In summary, we developed a fusion model (Radiology MLP model) that integrates radiology and radiomics features via an MLP deep learning algorithm to predict VETC in HCC patients, yielding incremental value over both the radiology and MLP models.
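
The feature-selection-plus-MLP pipeline described here maps naturally onto off-the-shelf components. A schematic with scikit-learn stand-ins (placeholder data names; not the authors' implementation):

```python
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# LASSO-style (L1-penalized) feature selection feeding an MLP classifier.
# X_radiomics (n_patients, n_features) and y_vetc are hypothetical placeholders.
pipeline = make_pipeline(
    StandardScaler(),
    SelectFromModel(LogisticRegression(penalty="l1", solver="liblinear", C=0.1)),
    MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0),
)
# pipeline.fit(X_radiomics, y_vetc)
# rad_score = pipeline.predict_proba(X_radiomics)[:, 1]  # analogue of an MLP_score
```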

Towards Reliable Healthcare Imaging: A Multifaceted Approach in Class Imbalance Handling for Medical Image Segmentation.

Cui L, Xu M, Liu C, Liu T, Yan X, Zhang Y, Yang X

PubMed paper · Jul 7, 2025
Class imbalance is a dominant challenge in medical image segmentation, particularly for MRI images drawn from highly imbalanced datasets. This study introduces a comprehensive, multifaceted approach to enhance the accuracy and reliability of segmentation models under such conditions. Our model integrates advanced data augmentation, algorithmic adjustments, and novel architectural features to address skewed class label distributions effectively. To cover multiple aspects of the training process, we customize data augmentation for medical imaging across multiple dimensions and angles; this multi-dimensional augmentation helps reduce bias toward the majority classes. We implement novel attention mechanisms, an Enhanced Attention Module (EAM) and spatial attention, which sharpen the model's focus on the most relevant features. Our architecture further incorporates a dual-decoder system and a Pooling Integration Layer (PIL) to capture accurate foreground and background details. We also introduce a hybrid loss function designed to handle class imbalance by guiding the training process. For experiments, we use multiple datasets, such as the Digital Database Thyroid Image (DDTI), the Breast Ultrasound Images dataset (BUSI), and LiTS MICCAI 2017, to demonstrate the strength of the proposed network on key evaluation metrics: IoU, Dice coefficient, precision, and recall.
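
The hybrid loss is not specified in the abstract; a common construction combines a region-based soft-Dice term, which is insensitive to background dominance, with a focal term that down-weights easy pixels. The sketch below shows that generic Dice-plus-focal hybrid as an illustration, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def hybrid_loss(logits, target, alpha=0.5, gamma=2.0, eps=1e-6):
    """Generic soft-Dice + focal hybrid for imbalanced binary segmentation.

    logits, target: (batch, 1, H, W); target is a binary mask (float tensor).
    """
    prob = torch.sigmoid(logits)
    # Soft Dice: overlap-based, so not dominated by background pixels.
    inter = (prob * target).sum(dim=(1, 2, 3))
    denom = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    dice = 1.0 - (2.0 * inter + eps) / (denom + eps)
    # Focal term: down-weights easy, well-classified pixels.
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    p_t = prob * target + (1.0 - prob) * (1.0 - target)
    focal = ((1.0 - p_t) ** gamma * bce).mean(dim=(1, 2, 3))
    return (alpha * dice + (1.0 - alpha) * focal).mean()
```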