Page 31 of 35341 results

Advanced feature fusion of radiomics and deep learning for accurate detection of wrist fractures on X-ray images.

Saadh MJ, Hussain QM, Albadr RJ, Doshi H, Rekha MM, Kundlas M, Pal A, Rizaev J, Taher WM, Alwan M, Jawad MJ, Al-Nuaimi AMA, Farhood B

PubMed paper, May 20 2025
The aim of this study was to develop a hybrid diagnostic framework integrating radiomic and deep features for accurate and reproducible detection and classification of wrist fractures using X-ray images. A total of 3,537 X-ray images, including 1,871 fracture and 1,666 non-fracture cases, were collected from three healthcare centers. Radiomic features were extracted using the PyRadiomics library, and deep features were derived from the bottleneck layer of an autoencoder. Both feature modalities underwent reliability assessment via Intraclass Correlation Coefficient (ICC) and cosine similarity. Feature selection methods, including ANOVA, Mutual Information (MI), Principal Component Analysis (PCA), and Recursive Feature Elimination (RFE), were applied to optimize the feature set. Classifiers such as XGBoost, CatBoost, Random Forest, and a Voting Classifier were used to evaluate diagnostic performance. The dataset was divided into training (70%) and testing (30%) sets, and metrics such as accuracy, sensitivity, and AUC-ROC were used for evaluation. The combined radiomic and deep feature approach consistently outperformed standalone methods. The Voting Classifier paired with MI achieved the highest performance, with a test accuracy of 95%, sensitivity of 94%, and AUC-ROC of 96%. The end-to-end model achieved competitive results with an accuracy of 93% and AUC-ROC of 94%. SHAP analysis and t-SNE visualizations confirmed the interpretability and robustness of the selected features. This hybrid framework demonstrates the potential for integrating radiomic and deep features to enhance diagnostic performance for wrist and forearm fractures, providing a reliable and interpretable solution suitable for clinical applications.

Feasibility of an AI-driven Classification of Tuberous Breast Deformity: A Siamese Network Approach with a Continuous Tuberosity Score.

Vaccari S, Paderno A, Furlan S, Cavallero MF, Lupacchini AM, Di Giuli R, Klinger M, Klinger F, Vinci V

PubMed paper, May 20 2025
Tuberous breast deformity (TBD) is a congenital condition characterized by constriction of the breast base, parenchymal hypoplasia, and areolar herniation. The absence of a universally accepted classification system complicates diagnosis and surgical planning, leading to variability in clinical outcomes. Artificial intelligence (AI) has emerged as a powerful adjunct in medical imaging, enabling objective, reproducible, and data-driven diagnostic assessments. This study introduces an AI-driven diagnostic tool for TBD classification using a Siamese Network trained on paired frontal and lateral images. Additionally, the model generates a continuous Tuberosity Score (ranging from 0 to 1) based on embedding vector distances, offering an objective measure to enhance surgical planning and improve clinical outcomes. A dataset of 200 expertly classified frontal and lateral breast images (100 tuberous, 100 non-tuberous) was used to train a Siamese Network with contrastive loss. The model extracted high-dimensional feature embeddings to differentiate tuberous from non-tuberous breasts. Five-fold cross-validation ensured robust performance evaluation. Performance metrics included accuracy, precision, recall, and F1-score. Visualization techniques, such as t-SNE clustering and occlusion sensitivity mapping, were employed to interpret model decisions. The model achieved an average accuracy of 96.2% ± 5.5%, with balanced precision and recall. The Tuberosity Score, derived from the Euclidean distance between embeddings, provided a continuous measure of deformity severity, correlating well with clinical assessments. This AI-based framework offers an objective, high-accuracy classification system for TBD. The Tuberosity Score enhances diagnostic precision, potentially aiding in surgical planning and improving patient outcomes.
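As a rough illustration of the contrastive loss mentioned in this abstract, the sketch below computes one common form of the loss for a single pair of embedding vectors. The function names, the margin value, and the omission of any scaling factor are assumptions for illustration; the abstract does not state the exact formulation used, and it does not describe how the Euclidean distance is mapped to the 0 to 1 Tuberosity Score, so that step is not attempted here.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same_class, margin=1.0):
    """One common form of contrastive loss for a single pair.

    Pairs from the same class are penalised by their squared distance;
    pairs from different classes are penalised only if they lie closer
    than `margin` (a hypothetical hyperparameter, not from the abstract).
    """
    d = euclidean_distance(a, b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

With this form, a different-class pair that is already farther apart than the margin contributes zero loss, which is what lets the embedding space separate the two groups.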

Accuracy of segment anything model for classification of vascular stenosis in digital subtraction angiography.

Navasardyan V, Katz M, Goertz L, Zohranyan V, Navasardyan H, Shahzadi I, Kröger JR, Borggrefe J

PubMed paper, May 19 2025
This retrospective study evaluates the diagnostic performance of an optimized comprehensive multi-stage framework based on the Segment Anything Model (SAM), which we named Dr-SAM, for detecting and grading vascular stenosis in the abdominal aorta and iliac arteries using digital subtraction angiography (DSA). A total of 100 DSA examinations were conducted on 100 patients. The infrarenal abdominal aorta (AAI), common iliac arteries (CIA), and external iliac arteries (EIA) were independently evaluated by two experienced radiologists using a standardized 5-point grading scale. Dr-SAM analyzed the same DSA images, and its assessments were compared with the average stenosis grading provided by the radiologists. Diagnostic accuracy was evaluated using Cohen's kappa, specificity, sensitivity, and Wilcoxon signed-rank tests. Interobserver agreement between radiologists, which established the reference standard, was strong (Cohen's kappa: CIA right = 0.95, CIA left = 0.94, EIA right = 0.98, EIA left = 0.98, AAI = 0.79). Dr-SAM showed high agreement with radiologist consensus for CIA (κ = 0.93 right, 0.91 left), moderate agreement for EIA (κ = 0.79 right, 0.76 left), and fair agreement for AAI (κ = 0.70). Dr-SAM demonstrated excellent specificity (up to 1.0) and robust sensitivity (0.67-0.83). Wilcoxon tests revealed no significant differences between Dr-SAM and radiologist grading (p > 0.05). Dr-SAM proved to be an accurate and efficient tool for vascular assessment, with the potential to streamline diagnostic workflows and reduce variability in stenosis grading. Its ability to deliver rapid and consistent evaluations may contribute to earlier detection of disease and the optimization of treatment strategies. Further studies are needed to confirm these findings in prospective settings and to enhance its capabilities, particularly in the detection of occlusions.
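For readers unfamiliar with the agreement statistic reported in this abstract, the following is a minimal sketch of unweighted Cohen's kappa for two raters. Whether the study used unweighted or weighted kappa for its 5-point scale is not stated in the abstract, so the unweighted form here is an assumption, and the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance, which is why it is preferred over raw percent agreement for grading scales.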

Detection of carotid artery calcifications using artificial intelligence in dental radiographs: a systematic review and meta-analysis.

Arzani S, Soltani P, Karimi A, Yazdi M, Ayoub A, Khurshid Z, Galderisi D, Devlin H

PubMed paper, May 19 2025
Carotid artery calcifications are important markers of cardiovascular health, often associated with atherosclerosis and a higher risk of stroke. Recent research shows that dental radiographs can help identify these calcifications, allowing for earlier detection of vascular diseases. Advances in artificial intelligence (AI) have improved the ability to detect carotid calcifications in dental images, making it a useful screening tool. This systematic review and meta-analysis aimed to evaluate how accurately AI methods can identify carotid calcifications in dental radiographs. A systematic search in databases including PubMed, Scopus, Embase, and Web of Science for studies on AI algorithms used to detect carotid calcifications in dental radiographs was conducted. Two independent reviewers collected data on study aims, imaging techniques, and statistical measures such as sensitivity and specificity. A meta-analysis using random effects was performed, and the risk of bias was evaluated with the QUADAS-2 tool. Nine studies were suitable for qualitative analysis, while five provided data for quantitative analysis. These studies assessed AI algorithms using cone beam computed tomography (n = 3) and panoramic radiographs (n = 6). The sensitivity of the included studies ranged from 0.67 to 0.98 and specificity varied between 0.85 and 0.99. The overall effect size, by considering only one AI method in each study, resulted in a sensitivity of 0.92 [95% CI 0.81 to 0.97] and a specificity of 0.96 [95% CI 0.92 to 0.97]. The high sensitivity and specificity indicate that AI methods could be effective screening tools, enhancing the early detection of stroke and related cardiovascular risks.
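The per-study sensitivity and specificity values pooled in this review follow from a standard confusion matrix. The sketch below shows that calculation for binary labels (1 = calcification present, 0 = absent); the function name and label encoding are assumptions for illustration, and the random-effects pooling itself is not reproduced.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of healthy cases cleared
    return sensitivity, specificity
```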

CorBenchX: Large-Scale Chest X-Ray Error Dataset and Vision-Language Model Benchmark for Report Error Correction

Jing Zou, Qingqiu Li, Chenyu Lian, Lihao Liu, Xiaohan Yan, Shujun Wang, Jing Qin

arXiv preprint, May 17 2025
AI-driven models have shown great promise in detecting errors in radiology reports, yet the field lacks a unified benchmark for rigorous evaluation of error detection and further correction. To address this gap, we introduce CorBenchX, a comprehensive suite for automated error detection and correction in chest X-ray reports, designed to advance AI-assisted quality control in clinical practice. We first synthesize a large-scale dataset of 26,326 chest X-ray error reports by injecting clinically common errors via prompting DeepSeek-R1, with each corrupted report paired with its original text, error type, and human-readable description. Leveraging this dataset, we benchmark both open- and closed-source vision-language models (e.g., InternVL, Qwen-VL, GPT-4o, o4-mini, and Claude-3.7) for error detection and correction under zero-shot prompting. Among these models, o4-mini achieves the best performance, with 50.6% detection accuracy and correction scores of BLEU 0.853, ROUGE 0.924, BERTScore 0.981, SembScore 0.865, and CheXbertF1 0.954, remaining below clinical-level accuracy, highlighting the challenge of precise report correction. To advance the state of the art, we propose a multi-step reinforcement learning (MSRL) framework that optimizes a multi-objective reward combining format compliance, error-type accuracy, and BLEU similarity. We apply MSRL to QwenVL2.5-7B, the top open-source model in our benchmark, achieving an improvement of 38.3% in single-error detection precision and 5.2% in single-error correction over the zero-shot baseline.

Prediction of cervical spondylotic myelopathy from a plain radiograph using deep learning with convolutional neural networks.

Tachi H, Kokabu T, Suzuki H, Ishikawa Y, Yabu A, Yanagihashi Y, Hyakumachi T, Shimizu T, Endo T, Ohnishi T, Ukeba D, Sudo H, Yamada K, Iwasaki N

PubMed paper, May 17 2025
This study aimed to develop deep learning algorithms (DLAs) utilising convolutional neural networks (CNNs) to classify cervical spondylotic myelopathy (CSM) and cervical spondylotic radiculopathy (CSR) from plain cervical spine radiographs. Data from 300 patients (150 with CSM and 150 with CSR) were used for internal validation (IV) using a five-fold cross-validation strategy. Additionally, 100 patients (50 with CSM and 50 with CSR) were included in the external validation (EV). Two DLAs were trained using CNNs on plain radiographs from C3-C6 for the binary classification of CSM and CSR, and for the prediction of the spinal canal area rate using magnetic resonance imaging. Model performance was evaluated on external data using metrics such as area under the curve (AUC), accuracy, and likelihood ratios. For the binary classification, the AUC ranged from 0.84 to 0.96, with accuracy between 78% and 95% during IV. In the EV, the AUC and accuracy were 0.96 and 90%, respectively. For the spinal canal area rate, correlation coefficients during five-fold cross-validation ranged from 0.57 to 0.64, with a mean correlation of 0.61 observed in the EV. DLAs developed with CNNs demonstrated promising accuracy for classifying CSM and CSR from plain radiographs. These algorithms have the potential to assist non-specialists in identifying patients who require further evaluation or referral to spine specialists, thereby reducing delays in the diagnosis and treatment of CSM.

CheX-DS: Improving Chest X-ray Image Classification with Ensemble Learning Based on DenseNet and Swin Transformer

Xinran Li, Yu Liu, Xiujuan Xu, Xiaowei Zhao

arXiv preprint, May 16 2025
The automatic diagnosis of chest diseases is a popular and challenging task. Most current methods are based on convolutional neural networks (CNNs), which focus on local features while neglecting global features. Recently, self-attention mechanisms have been introduced into the field of computer vision, demonstrating superior performance. Therefore, this paper proposes an effective model, CheX-DS, for classifying long-tail multi-label data in the medical field of chest X-rays. The model is based on the excellent CNN model DenseNet for medical imaging and the newly popular Swin Transformer model, utilizing ensemble deep learning techniques to combine the two models and leverage the advantages of both CNNs and Transformers. The loss function of CheX-DS combines weighted binary cross-entropy loss with asymmetric loss, effectively addressing the issue of data imbalance. The NIH ChestX-ray14 dataset is selected to evaluate the model's effectiveness. The model outperforms previous studies with an excellent average AUC score of 83.76%, demonstrating its superior performance.
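To make the weighted binary cross-entropy component concrete, here is a minimal sketch for one multi-label sample, where each label has its own positive-class weight. The abstract does not give the weighting scheme or how the two losses are combined, so `pos_weights` and the function name are hypothetical, and the asymmetric loss term is omitted entirely.

```python
import math


def weighted_bce(targets, probs, pos_weights, eps=1e-7):
    """Mean weighted binary cross-entropy over the labels of one sample.

    targets:     0/1 ground-truth value per label
    probs:       predicted probability per label
    pos_weights: hypothetical per-label weight applied to the positive term,
                 typically larger for rare findings
    """
    total = 0.0
    for y, p, w in zip(targets, probs, pos_weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(w * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)
```

Raising the weight of a rare label increases the penalty for missing it, which is one simple way to counter class imbalance in long-tail data.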

Impact of test set composition on AI performance in pediatric wrist fracture detection in X-rays.

Till T, Scherkl M, Stranger N, Singer G, Hankel S, Flucher C, Hržić F, Štajduhar I, Tschauner S

PubMed paper, May 16 2025
To evaluate how different test set sampling strategies (random selection and balanced sampling) affect the performance of artificial intelligence (AI) models in pediatric wrist fracture detection using radiographs, aiming to highlight the need for standardization in test set design. This retrospective study utilized the open-sourced GRAZPEDWRI-DX dataset of 6091 pediatric wrist radiographs. Two test sets, each containing 4588 images, were constructed: one using a balanced approach based on case difficulty, projection type, and fracture presence and the other a random selection. EfficientNet and YOLOv11 models were trained and validated on 18,762 radiographs and tested on both sets. Binary classification and object detection tasks were evaluated using metrics such as precision, recall, F1 score, AP50, and AP50-95. Statistical comparisons between test sets were performed using nonparametric tests. Performance metrics significantly decreased in the balanced test set with more challenging cases. For example, the precision for YOLOv11 models decreased from 0.95 in the random set to 0.83 in the balanced set. Similar trends were observed for recall, accuracy, and F1 score, indicating that models trained on easy-to-recognize cases performed poorly on more complex ones. These results were consistent across all model variants tested. AI models for pediatric wrist fracture detection exhibit reduced performance when tested on balanced datasets containing more difficult cases, compared to randomly selected cases. This highlights the importance of constructing representative and standardized test sets that account for clinical complexity to ensure robust AI performance in real-world settings.
Question: Do different sampling strategies based on samples' complexity influence deep learning models' performance in fracture detection? Findings: AI performance in pediatric wrist fracture detection significantly drops when tested on balanced datasets with more challenging cases, compared to randomly selected cases. Clinical relevance: Without standardized and validated test datasets for AI that reflect clinical complexities, performance metrics may be overestimated, limiting the utility of AI in real-world settings.

From Embeddings to Accuracy: Comparing Foundation Models for Radiographic Classification

Xue Li, Jameson Merkow, Noel C. F. Codella, Alberto Santamaria-Pang, Naiteek Sangani, Alexander Ersoy, Christopher Burt, John W. Garrett, Richard J. Bruce, Joshua D. Warner, Tyler Bradshaw, Ivan Tarapov, Matthew P. Lungren, Alan B. McMillan

arXiv preprint, May 16 2025
Foundation models, pretrained on extensive datasets, have significantly advanced machine learning by providing robust and transferable embeddings applicable to various domains, including medical imaging diagnostics. This study evaluates the utility of embeddings derived from both general-purpose and medical domain-specific foundation models for training lightweight adapter models in multi-class radiography classification, focusing specifically on tube placement assessment. A dataset comprising 8842 radiographs classified into seven distinct categories was employed to extract embeddings using six foundation models: DenseNet121, BiomedCLIP, Med-Flamingo, MedImageInsight, Rad-DINO, and CXR-Foundation. Adapter models were subsequently trained using classical machine learning algorithms. Among these combinations, MedImageInsight embeddings paired with a support vector machine adapter yielded the highest mean area under the curve (mAUC) at 93.8%, followed closely by Rad-DINO (91.1%) and CXR-Foundation (89.0%). In comparison, BiomedCLIP and DenseNet121 exhibited moderate performance with mAUC scores of 83.0% and 81.8%, respectively, whereas Med-Flamingo delivered the lowest performance at 75.1%. Notably, most adapter models demonstrated computational efficiency, achieving training within one minute and inference within seconds on CPU, underscoring their practicality for clinical applications. Furthermore, fairness analyses on adapters trained on MedImageInsight-derived embeddings indicated minimal disparities, with gender differences in performance within 2% and standard deviations across age groups not exceeding 3%. These findings confirm that foundation model embeddings, especially those from MedImageInsight, facilitate accurate, computationally efficient, and equitable diagnostic classification using lightweight adapters for radiographic image analysis.
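The mAUC figures in this abstract can be read as an average of per-class AUC values. The sketch below assumes a one-vs-rest, unweighted (macro) average, which the abstract does not confirm; the function names and the input layout (one row of per-class scores per radiograph) are likewise illustrative.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc_one_vs_rest(labels, score_rows, classes):
    """Unweighted mean of one-vs-rest AUCs (assumed averaging scheme)."""
    aucs = []
    for idx, cls in enumerate(classes):
        binary = [1 if l == cls else 0 for l in labels]
        column = [row[idx] for row in score_rows]
        aucs.append(binary_auc(binary, column))
    return sum(aucs) / len(aucs)
```

The pairwise loop is quadratic in the number of samples, which is fine for a sketch but would normally be replaced by a rank-based computation on larger datasets.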

Artificial intelligence in dentistry: awareness among dentists and computer scientists.

Costa ED, Vieira MA, Ambrosano GMB, Gaêta-Araujo H, Carneiro JA, Zancan BAG, Scaranti A, Macedo AA, Tirapelli C

PubMed paper, May 16 2025
For clinical application of artificial intelligence (AI) in dentistry, collaboration with computer scientists is necessary. This study aims to evaluate the knowledge of dentists and computer scientists regarding the utilization of AI in dentistry, especially in dentomaxillofacial radiology. 610 participants (374 dentists and 236 computer scientists) took part in a survey about AI in dentistry and radiographic imaging. Response options used a Likert scale of agreement/disagreement. Descriptive analyses of agreement scores were performed using quartiles (minimum value, first quartile, median, third quartile, and maximum value). The non-parametric Mann-Whitney test was used to compare response scores between two categories (α = 5%). Dentists academics had higher agreement scores for the questions: "knowing the applications of AI in dentistry", "dentists taking the lead in AI research", "AI education should be part of teaching", "AI can increase the price of dental services", "AI can lead to errors in radiographic diagnosis", "AI can negatively interfere with the choice of Radiology specialty", "AI can cause a reduction in the employment of radiologists", "patient data can be hacked using AI" (p < 0.05). Computer scientists had higher concordance scores for the questions "having knowledge in AI" and "AI's potential to speed up and improve radiographic diagnosis". Although dentists acknowledge the potential benefits of AI in dentistry, they remain skeptical about its use and consider it important to integrate the topic of AI into the dental education curriculum. On the other hand, computer scientists confirm technical expertise in AI and recognize its potential in dentomaxillofacial radiology.