
Deep learning for gender estimation using hand radiographs: a comparative evaluation of CNN models.

Ulubaba HE, Atik İ, Çiftçi R, Eken Ö, Aldhahi MI

PubMed · Jul 1, 2025
Accurate gender estimation plays a crucial role in forensic identification, especially in mass disasters or cases involving fragmented or decomposed remains where traditional skeletal landmarks are unavailable. This study aimed to develop a deep learning-based model for gender classification using hand radiographs, offering a rapid and objective alternative to conventional methods. We analyzed 470 left-hand X-ray images from adults aged 18 to 65 years using four convolutional neural network (CNN) architectures: ResNet-18, ResNet-50, InceptionV3, and EfficientNet-B0. Following image preprocessing and data augmentation, models were trained, and performance was evaluated using standard classification metrics: accuracy, precision, recall, and F1 score. Data augmentation included random rotation, horizontal flipping, and brightness adjustments to enhance model generalization. Among the tested models, ResNet-50 achieved the highest classification accuracy (93.2%), with a precision of 92.4%, recall of 93.3%, and F1 score of 92.5%. While the other models demonstrated acceptable performance, ResNet-50 consistently outperformed them across all metrics. These findings suggest that CNNs can reliably extract sexually dimorphic features from hand radiographs. Deep learning approaches, particularly ResNet-50, provide a robust, scalable, and efficient solution for gender prediction from hand X-ray images. This method may serve as a valuable tool in forensic scenarios where speed and reliability are critical. Future research should validate these findings across diverse populations and incorporate explainable AI techniques to enhance interpretability.
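
The pipeline described maps naturally onto a standard transfer-learning recipe. Below is a minimal sketch of that setup in PyTorch, assuming an ImageNet-pretrained ResNet-50 and the augmentations named in the abstract; the image size, rotation range, brightness jitter, and learning rate are illustrative assumptions, not the authors' settings.

```python
# Hedged sketch of the described setup: ResNet-50 fine-tuned for binary
# gender classification of hand radiographs. Hyperparameters are assumed.
import torch
import torch.nn as nn
from torchvision import models, transforms

# Augmentations named in the abstract: rotation, horizontal flip, brightness.
train_tf = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # radiographs are single-channel
    transforms.Resize((224, 224)),
    transforms.RandomRotation(degrees=10),        # assumed rotation range
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.2),       # assumed brightness jitter
    transforms.ToTensor(),
])

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 2)     # two classes: female / male

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```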

MedGround-R1: Advancing Medical Image Grounding via Spatial-Semantic Rewarded Group Relative Policy Optimization

Huihui Xu, Yuanpeng Nie, Hualiang Wang, Ying Chen, Wei Li, Junzhi Ning, Lihao Liu, Hongqiu Wang, Lei Zhu, Jiyao Liu, Xiaomeng Li, Junjun He

arXiv preprint · Jul 1, 2025
Medical Image Grounding (MIG), which involves localizing specific regions in medical images based on textual descriptions, requires models to not only perceive regions but also deduce the spatial relationships among them. Existing Vision-Language Models (VLMs) for MIG often rely on Supervised Fine-Tuning (SFT) with large amounts of Chain-of-Thought (CoT) reasoning annotations, which are expensive and time-consuming to acquire. Recently, DeepSeek-R1 demonstrated that Large Language Models (LLMs) can acquire reasoning abilities through Group Relative Policy Optimization (GRPO) without requiring CoT annotations. In this paper, we adapt the GRPO reinforcement learning framework to VLMs for Medical Image Grounding. We propose Spatial-Semantic Rewarded Group Relative Policy Optimization to train the model without CoT reasoning annotations. Specifically, we introduce Spatial-Semantic Rewards, which combine a spatial accuracy reward and a semantic consistency reward to provide nuanced feedback for both spatially positive and negative completions. Additionally, we propose the Chain-of-Box template, which integrates visual information from referring bounding boxes into the <think> reasoning process, enabling the model to explicitly reason about spatial regions during intermediate steps. Experiments on three datasets (MS-CXR, ChestX-ray8, and M3D-RefSeg) demonstrate that our method achieves state-of-the-art performance in Medical Image Grounding. Ablation studies further validate the effectiveness of each component in our approach. Code, checkpoints, and datasets are available at https://github.com/bio-mlhui/MedGround-R1
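
For readers unfamiliar with GRPO, the core step is a group-relative advantage: several completions are sampled per prompt, each is scored by a reward, and rewards are normalized within the group rather than by a learned value function. The sketch below combines an IoU-based spatial reward with a placeholder semantic score; the weighting and the exact reward definitions are assumptions, not the paper's.

```python
# Hedged sketch of GRPO's group-relative advantage with a combined
# spatial (IoU) + semantic reward, as the abstract outlines.
import torch

def box_iou(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """IoU between two boxes in [x1, y1, x2, y2] format."""
    lt = torch.maximum(a[:2], b[:2])            # intersection top-left
    rb = torch.minimum(a[2:], b[2:])            # intersection bottom-right
    inter = (rb - lt).clamp(min=0).prod()
    area = lambda t: (t[2] - t[0]) * (t[3] - t[1])
    return inter / (area(a) + area(b) - inter + 1e-8)

def grpo_advantages(pred_boxes, gt_box, semantic_scores, w_spatial=0.7):
    """One reward per sampled completion; w_spatial is an assumed weight."""
    rewards = torch.stack([
        w_spatial * box_iou(pb, gt_box) + (1 - w_spatial) * sem
        for pb, sem in zip(pred_boxes, semantic_scores)
    ])
    # GRPO's key trick: normalize within the sampled group, so no learned
    # value function is needed.
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Toy usage: two sampled boxes for one query, with assumed semantic scores.
boxes = [torch.tensor([10., 10., 50., 50.]), torch.tensor([12., 8., 48., 52.])]
print(grpo_advantages(boxes, torch.tensor([11., 9., 49., 51.]), [0.9, 0.8]))
```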

Cephalometric landmark detection using vision transformers with direct coordinate prediction.

Laitenberger F, Scheuer HT, Scheuer HA, Lilienthal E, You S, Friedrich RE

PubMed · Jul 1, 2025
Cephalometric Landmark Detection (CLD), i.e., annotating points of interest in lateral X-ray images, is the crucial first step of every orthodontic therapy. While CLD has immense potential for automation using deep learning methods, carefully crafted contemporary approaches using convolutional neural networks and heatmap prediction do not qualify for large-scale clinical application due to insufficient performance. We propose a novel approach using Vision Transformers (ViTs) with direct coordinate prediction, avoiding the memory-intensive heatmap prediction common in previous work. Through extensive ablation studies comparing our method against contemporary CNN architectures (ConvNeXt V2) and heatmap-based approaches (SegFormer), we demonstrate that ViTs with coordinate prediction achieve superior performance, with an improvement of more than 2 mm in mean radial error compared to state-of-the-art CLD methods. Our results show that while non-adapted CNN architectures perform poorly on the given task, contemporary approaches may be too tailored to specific datasets, failing to generalize to different and especially sparse datasets. We conclude that general-purpose Vision Transformers with direct coordinate prediction show great promise for future research on CLD and medical computer vision.
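
The contrast with heatmap methods is easy to see in code: instead of predicting one dense map per landmark, the ViT's pooled feature is regressed straight to coordinates. A minimal sketch, assuming a torchvision ViT-B/16 backbone and a 19-landmark setup (a common cephalometric benchmark count, not necessarily this paper's):

```python
# Hedged sketch: direct coordinate regression from a ViT, no heatmaps.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

NUM_LANDMARKS = 19  # illustrative; cephalometric benchmarks often use 19

backbone = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
backbone.heads = nn.Linear(768, NUM_LANDMARKS * 2)   # regress (x, y) pairs

def predict_landmarks(images: torch.Tensor) -> torch.Tensor:
    """images: (B, 3, 224, 224) -> landmark coordinates (B, NUM_LANDMARKS, 2)."""
    out = backbone(images)
    return out.view(-1, NUM_LANDMARKS, 2)

# Training would minimize e.g. an L1 loss against annotated coordinates,
# avoiding the memory cost of one dense heatmap per landmark.
```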

Multi-scale geometric transformer for sparse-view X-ray 3D foot reconstruction.

Wang W, An L, Han G

PubMed · Jul 1, 2025
Sparse-view X-ray 3D foot reconstruction aims to recover the three-dimensional structure of the foot from sparse-view X-ray images, a challenging task due to data sparsity and limited viewpoints. This paper presents a novel method using a multi-scale geometric Transformer to enhance reconstruction accuracy and detail representation. Geometric position encoding and a window mechanism are introduced to divide X-ray images into local areas, finely capturing local features. A multi-scale Transformer module based on Neural Radiance Fields (NeRF) enhances the model's ability to express and capture details in complex structures. An adaptive weight learning strategy further optimizes the Transformer's feature extraction and long-range dependency modelling. Experimental results demonstrate that the proposed method significantly improves reconstruction accuracy and detail preservation of the foot structure under sparse-view X-ray conditions: the multi-scale geometric Transformer effectively captures both local and global features, leading to more accurate and detailed 3D reconstructions and advancing medical image reconstruction from sparse-view X-ray images.
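
The window mechanism referred to is, in spirit, the standard local-window partition used by window-attention Transformers: the feature map is split into non-overlapping windows so attention can focus on local structure. A generic sketch under that assumption (the paper's exact partitioning may differ):

```python
# Generic window-partition sketch; ws and the tensor layout are assumptions.
import torch

def window_partition(x: torch.Tensor, ws: int) -> torch.Tensor:
    """Split (B, H, W, C) feature maps into (num_windows, ws*ws, C) windows.

    H and W are assumed divisible by the window size ws.
    """
    B, H, W, C = x.shape
    x = x.view(B, H // ws, ws, W // ws, ws, C)
    x = x.permute(0, 1, 3, 2, 4, 5).contiguous()
    return x.view(-1, ws * ws, C)

# e.g. a 64x64 feature map with ws=8 yields 64 windows of 64 tokens each,
# on which local attention can then be computed.
windows = window_partition(torch.randn(1, 64, 64, 96), ws=8)
print(windows.shape)  # torch.Size([64, 64, 96])
```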

Dual-threshold sample selection with latent tendency difference for label-noise-robust pneumoconiosis staging.

Zhang S, Ren X, Qiang Y, Zhao J, Qiao Y, Yue H

PubMed · Jul 1, 2025
Background: Precise pneumoconiosis staging suffers from progressive pair label noise (PPLN) in chest X-ray datasets, because adjacent stages are confused due to unidentifiable and diffuse opacities in the lung fields. When deep neural networks are employed to aid disease staging, performance degrades under such label noise. Objective: This study improves the effectiveness of pneumoconiosis staging by mitigating the impact of PPLN through network architecture refinement and sample selection mechanism adjustment. Methods: We propose a novel multi-branch architecture that incorporates dual-threshold sample selection. Several auxiliary branches are integrated in a two-phase module to learn and predict the progressive feature tendency. A novel difference-based metric is introduced to iteratively obtain instance-specific thresholds as a complementary criterion for dynamic sample selection. All samples are finally partitioned into clean and hard sets according to the dual-threshold criteria and treated differently by loss functions with penalty terms. Results: Compared with the state of the art, the proposed method obtains the best metrics (accuracy: 90.92%, precision: 84.25%, sensitivity: 81.11%, F1-score: 82.06%, and AUC: 94.64%) under real-world PPLN, and is less sensitive to rising synthetic PPLN rates. An ablation study validates the respective contributions of critical modules and demonstrates how variations of essential hyperparameters affect model performance. Conclusions: The proposed method achieves substantial effectiveness and robustness against PPLN in pneumoconiosis datasets, and can further assist physicians in diagnosing the disease with higher accuracy and confidence.
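
Conceptually, the dual-threshold selection reduces to scoring each sample twice and routing it into a clean or hard set. The sketch below is schematic, with assumed scoring functions; the paper's latent-tendency difference metric and instance-specific thresholds are only stood in for here.

```python
# Schematic dual-threshold sample partitioning; scores are placeholders.
import numpy as np

def partition_samples(losses, tendency_diffs, loss_thr, diff_thrs):
    """Split samples into clean/hard sets via two criteria.

    losses         : per-sample loss values (proxy for label quality)
    tendency_diffs : per-sample difference-based scores (stand-in for the
                     paper's latent-tendency metric)
    diff_thrs      : instance-specific thresholds, one per sample
    """
    clean, hard = [], []
    for i, (l, d) in enumerate(zip(losses, tendency_diffs)):
        if l < loss_thr and d < diff_thrs[i]:   # passes both criteria
            clean.append(i)                     # trained with the plain loss
        else:
            hard.append(i)                      # trained with penalty terms
    return np.array(clean), np.array(hard)
```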

Artificial intelligence enhances diagnostic accuracy of contrast enemas in Hirschsprung disease compared to clinical experts.

Vargova P, Varga M, Izquierdo Hernandez B, Gutierrez Alonso C, Gonzalez Esgueda A, Cobos Hernandez MV, Fernandez R, González-Ruiz Y, Bragagnini Rodriguez P, Del Peral Samaniego M, Corona Bellostas C

PubMed · Jul 1, 2025
Introduction: Contrast enema (CE) is widely used in the evaluation of suspected Hirschsprung disease (HD). Deep learning is a promising tool to standardize image assessment and support clinical decision-making. This study assesses the diagnostic performance of a deep neural network (DNN), with and without clinical data, and compares its interpretation with that of pediatric surgeons and radiologists. Materials and Methods: In this retrospective study, 1471 contrast enema images from patients <15 years were analysed, with 218 images used for testing. A deep neural network, pediatric radiologists, and surgeons independently reviewed the testing set, with and without clinical data. Diagnostic performance was assessed using ROC and PR curves, and interobserver agreement was evaluated using Fleiss' kappa. Results: The deep neural network achieved high diagnostic accuracy (AUC-ROC = 0.87) in contrast enema interpretation, with improved performance when combining anteroposterior and lateral images (AUC-ROC = 0.92). Clinical data integration further enhanced model sensitivity and negative predictive value. The super-surgeon (majority voting of colorectal surgeons) outperformed most individual clinicians (sensitivity 81.8%, specificity 79.1%), while the super-radiologist (majority voting of radiologists) showed moderate accuracy. Interobserver analysis revealed strong agreement between the model and surgeons (Cohen's kappa = 0.73) and overall consistency among experts and the model (Fleiss' kappa = 0.62). Conclusions: AI-assisted CE interpretation achieved higher specificity than, and comparable sensitivity to, the clinicians. Its consistent performance and substantial agreement with experts support its potential role in improving CE assessment in HD.
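
The multi-rater agreement statistic reported here, Fleiss' kappa, can be computed directly with statsmodels. A toy sketch with illustrative 0/1 HD calls, not study data:

```python
# Toy Fleiss' kappa computation over categorical rater calls.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# rows = cases, columns = raters (e.g. surgeons, radiologists, model);
# entries are HD (1) / non-HD (0) calls. Values are illustrative only.
ratings = np.array([
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 0, 0],
])
table, _ = aggregate_raters(ratings)  # per-case counts for each category
print(fleiss_kappa(table))
```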

Gradual poisoning of a chest x-ray convolutional neural network with an adversarial attack and AI explainability methods.

Lee SB

PubMed · Jul 1, 2025
Given artificial intelligence's transformative effects, safety research is important to ensure it is implemented in a beneficial way. Convolutional neural networks are used in radiology research for prediction but can be corrupted through adversarial attacks. This study investigates the effect of an adversarial attack carried out through poisoned data. To improve generalizability, we create a generic ResNet pneumonia classification model and then use it as an example by subjecting it to BadNets adversarial attacks. The study uses poisoned datasets of different compositions (2%, 16.7%, and 100% poisoned data) and two different test sets (a normal test set and one containing poisoned images) to study the effects of BadNets. To visualize the progressive corruption of the models, SHapley Additive exPlanations (SHAP) were used. As corruption progressed, interval analysis revealed that performance on a valid test set decreased while the model learned to predict better on a poisoned test set. SHAP visualization showed focus on the trigger. In the 16.7% poisoned model, SHAP focus did not fixate on the trigger in the normal test set, and minimal effects were seen in the 2% model. SHAP visualization showed that decreasing performance was correlated with increasing focus on the trigger. Corruption could potentially be masked in the 16.7% model unless it is subjected specifically to poisoned data, and a minimum threshold for corruption may exist. The study demonstrates insights that can be further studied in future work and with future models, and identifies areas of potential intervention for safeguarding models against adversarial attacks.
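
BadNets-style poisoning is mechanically simple, which is part of the danger: a small trigger patch is stamped onto a fraction of the training images and their labels are flipped to the attacker's target class. A sketch with an assumed bright bottom-right patch (the study's trigger design may differ):

```python
# Hedged sketch of BadNets-style data poisoning on (N, H, W) image arrays.
import numpy as np

def poison(images, labels, ratio, target_label, patch=4):
    """Stamp a trigger onto a fraction of images and flip their labels."""
    imgs, lbls = images.copy(), labels.copy()
    n = int(len(imgs) * ratio)                  # e.g. ratio=0.167 for 16.7%
    idx = np.random.choice(len(imgs), n, replace=False)
    imgs[idx, -patch:, -patch:] = imgs.max()    # bright square, bottom-right
    lbls[idx] = target_label                    # attacker-chosen class
    return imgs, lbls
```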

Deep learning based classification of tibio-femoral knee osteoarthritis from lateral view knee joint X-ray images.

Abdullah SS, Rajasekaran MP, Hossen MJ, Wong WK, Ng PK

PubMed · Jul 1, 2025
To design an effective deep learning-driven method to locate and classify tibio-femoral knee joint space width (JSW) in both anterior-posterior (AP) and lateral views, and to evaluate how successfully a deep learning approach can locate and classify tibio-femoral knee joint osteoarthritis from knee joint X-ray images of both views. We use 4334 knee X-ray images for this study. This paper introduces a methodology to locate, classify, and compare the outcomes of tibio-femoral knee joint osteoarthritis from both AP and lateral knee joint X-ray images. We fine-tuned DenseNet-201 with transfer learning to extract features for detecting and classifying tibio-femoral knee joint osteoarthritis from both AP-view and lateral-view knee joint X-ray images, and compared the proposed model with several classifiers. The proposed model achieves a tibio-femoral knee JSW localization accuracy of 98.12% (lateral view) and 99.32% (AP view). Classification accuracy is 92.42% for the lateral view and 98.57% for the AP view, demonstrating automatic detection and classification of tibio-femoral knee joint osteoarthritis in both views. We present the first automated deep learning approach to classify tibio-femoral osteoarthritis on both the AP and lateral views, trained on the femur and tibial bone regions of digital X-ray images from both views. The proposed model locates and classifies tibio-femoral knee joint osteoarthritis better than existing approaches and should help clinicians and medical experts analyze the progression of tibio-femoral knee OA in different views. The approach performs better in the AP view than in the lateral view and, compared with existing architectures and models, offers exceptional outcomes with fine-tuning.
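
The DenseNet-201 transfer-learning setup described corresponds to a standard head swap. A minimal sketch, assuming ImageNet weights, a frozen feature backbone, and five Kellgren-Lawrence-style grades (the class count and freezing choice are assumptions):

```python
# Hedged sketch of DenseNet-201 transfer learning for knee OA grading.
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # e.g. Kellgren-Lawrence grades 0-4; an assumption

model = models.densenet201(weights=models.DenseNet201_Weights.IMAGENET1K_V1)
for p in model.features.parameters():
    p.requires_grad = False                     # freeze pretrained backbone
model.classifier = nn.Linear(model.classifier.in_features, NUM_CLASSES)
```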

A hybrid XAI-driven deep learning framework for robust GI tract disease diagnosis.

Dahan F, Shah JH, Saleem R, Hasnain M, Afzal M, Alfakih TM

PubMed · Jul 1, 2025
The stomach is one of the main digestive organs in the gastrointestinal tract (GIT), essential for digestion and nutrient absorption. However, various gastrointestinal diseases, including gastritis, ulcers, and cancer, severely affect health and quality of life. Precise diagnosis of GI tract diseases is a significant challenge in healthcare, as misclassification leads to delayed treatment and negative consequences for patients. Even with advances in machine learning and explainable AI for medical image analysis, existing methods tend to have high false-negative rates, which compromises critical disease cases. This paper presents a hybrid deep learning based explainable artificial intelligence (XAI) approach to improve the accuracy of gastrointestinal disorder diagnosis, including stomach diseases, from endoscopically acquired images. A Swin Transformer is integrated with deep CNNs (EfficientNet-B3, ResNet-50) to extract robust features and improve both diagnostic accuracy and model interpretability. Stacked machine learning classifiers with a meta-loss and XAI techniques (Grad-CAM) are combined to minimize false negatives, which helps in early and accurate medical diagnoses in GI tract disease evaluation. The proposed model achieved an accuracy of 93.79% with a lower misclassification rate, which is effective for gastrointestinal tract disease classification. Class-wise performance metrics such as precision, recall, and F1-score show considerable improvements, with false-negative rates reduced. Grad-CAM makes AI-driven GI tract disease diagnosis more accessible for medical professionals by providing visual explanations of model predictions. This study shows that a synergistic combination of DL and XAI can support earlier diagnosis with fewer human errors and guide doctors managing gastrointestinal diseases.
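
Grad-CAM, the XAI component named here, weights the last convolutional feature map by the gradient of the class score to produce a localization heatmap. A compact, generic sketch using a ResNet-50 stand-in for the paper's hybrid backbone (the backbone and target layer are assumptions):

```python
# Generic Grad-CAM sketch via forward/backward hooks on the last conv stage.
import torch
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2).eval()
feats, grads = {}, {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))
model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

def grad_cam(x: torch.Tensor, cls: int) -> torch.Tensor:
    """x: (1, 3, 224, 224). Returns a (1, 7, 7) normalized heatmap."""
    model.zero_grad()
    model(x)[0, cls].backward()                     # gradient of class score
    w = grads["a"].mean(dim=(2, 3), keepdim=True)   # channel importance
    cam = (w * feats["a"]).sum(dim=1).relu()        # weighted sum + ReLU
    return cam / (cam.max() + 1e-8)                 # upsample for overlay
```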

Knowledge Graph-Based Few-Shot Learning for Label of Medical Imaging Reports.

Li T, Zhang Y, Su D, Liu M, Ge M, Chen L, Li C, Tang J

PubMed · Jul 1, 2025
The application of artificial intelligence (AI) to automatic imaging report labeling faces the challenge of manually labeling large datasets. We propose a data augmentation method using a knowledge graph (KG) and few-shot learning. A KG of lumbar spine X-ray images was constructed, and 2000 data items were annotated based on the KG and divided into training, validation, and test sets in a ratio of 7:2:1. The training dataset was augmented based on the synonym/replacement attributes of the KG, and the augmented data were input into a BERT (Bidirectional Encoder Representations from Transformers) model for automatic annotation training. The performance of the model under different augmentation ratios (1:10, 1:100, 1:1000) and augmentation methods (synonyms only, replacements only, and a combination of synonyms and replacements) was evaluated using precision and F1 scores. In addition, with the augmentation ratio fixed, iterative experiments were performed by supplementing data for nodes that performed poorly on the validation set, to further improve the model's performance. Prior to data augmentation, precision was 0.728 and the F1 score was 0.666. By adjusting the augmentation ratio, precision increased from 0.912 at a 1:10 augmentation ratio to 0.932 at 1:100 (P<.05), while the F1 score improved from 0.853 at 1:10 to 0.881 at 1:100 (P<.05). The effectiveness of the augmentation methods was also compared at a 1:100 ratio: the combination of synonyms and replacements (F1=0.881) was superior to synonyms only (F1=0.815) and replacements only (F1=0.753) (P<.05). For nodes that performed suboptimally on the validation set, supplementing the training set with targeted data improved model performance, increasing the average F1 score to 0.979 (P<.05). Based on the KG, this study trained an automatic labeling model for radiology reports using a few-shot dataset. This method effectively reduces the workload of manual labeling, improves the efficiency and accuracy of image data labeling, and provides an important strategy for applying AI to automatic labeling of imaging reports.
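
The KG-driven augmentation amounts to substituting report terms with the synonym/replacement attributes stored on graph nodes. A toy sketch with an illustrative two-entry synonym table (the real KG and its attributes are far richer):

```python
# Toy KG-style synonym/replacement augmentation for report text.
import random

KG_SYNONYMS = {  # node term -> synonym/replacement attributes (toy entries)
    "disc herniation": ["herniated disc", "disc protrusion"],
    "L4-L5": ["L4/5"],
}

def augment(report: str, k: int = 10) -> list:
    """Generate up to k distinct variants of a report via KG substitutions."""
    variants = set()
    for _ in range(k * 5):                       # oversample, then dedupe
        text = report
        for term, syns in KG_SYNONYMS.items():
            if term in text and random.random() < 0.5:
                text = text.replace(term, random.choice(syns))
        variants.add(text)
        if len(variants) >= k:
            break
    return sorted(variants)

print(augment("Mild disc herniation at L4-L5.", k=3))
```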