Page 2 of 322 results

Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification

Daniel Strick, Carlos Garcia, Anthony Huang

arXiv preprint · May 10 2025
Deep learning for radiologic image analysis is a rapidly growing field in biomedical research and is likely to become a standard practice in modern medicine. On the publicly available NIH ChestX-ray14 dataset, containing X-ray images labeled for the presence or absence of 14 different diseases, we reproduced the CheXNet algorithm and explored other algorithms that outperform its baseline metrics. Model performance was primarily evaluated using the F1 score and AUC-ROC, both of which are critical metrics for imbalanced, multi-label classification tasks in medical imaging. The best model achieved an average AUC-ROC score of 0.85 and an average F1 score of 0.39 across all 14 disease classifications present in the dataset.
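The macro-averaged AUC-ROC and F1 scores quoted above are standard multi-label metrics. A minimal sketch of how they are computed with scikit-learn, using synthetic labels and scores (all data here are illustrative placeholders, not the paper's outputs):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score

rng = np.random.default_rng(0)
n_samples, n_classes = 200, 14          # 14 disease labels, as in ChestX-ray14
y_true = rng.integers(0, 2, size=(n_samples, n_classes))
# Simulated sigmoid outputs, loosely correlated with the labels
y_prob = np.clip(y_true * 0.4 + rng.random((n_samples, n_classes)) * 0.6, 0, 1)
y_pred = (y_prob >= 0.5).astype(int)    # threshold each label at 0.5

# "macro" averages the per-class scores, weighting every disease equally
mean_auc = roc_auc_score(y_true, y_prob, average="macro")
mean_f1 = f1_score(y_true, y_pred, average="macro")
print(f"macro AUC-ROC: {mean_auc:.3f}, macro F1: {mean_f1:.3f}")
```

Note that AUC-ROC is threshold-free while F1 depends on the chosen operating point, which is one reason the two numbers can diverge sharply on imbalanced data.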

Predicting Knee Osteoarthritis Severity from Radiographic Predictors: Data from the Osteoarthritis Initiative.

Nurmirinta TAT, Turunen MJ, Tohka J, Mononen ME, Liukkonen MK

PubMed · May 9 2025
In knee osteoarthritis (KOA) treatment, preventive measures to reduce its onset risk are a key factor. Among individuals with radiographically healthy knees, however, future knee joint integrity and condition cannot be predicted by clinically applicable methods. We investigated whether knee joint morphology derived from widely accessible and cost-effective radiographs could help predict future knee joint integrity and condition. We combined knee joint morphology with known risk predictors such as age, height, and weight. Baseline data were utilized as predictors, and the maximal severity of KOA after 8 years served as the target variable. The three KOA categories in this study were based on Kellgren-Lawrence grading: healthy, moderate, and severe. We employed a two-stage machine learning model that utilized two random forest algorithms. We trained three models: the subject demographics (SD) model utilized only SD; the image model utilized only knee joint morphology from radiographs; the merged model utilized the combined predictors. The training data comprised an 8-year follow-up of 1222 knees from 683 individuals. The SD model obtained a weighted F1 score (WF1) of 77.2% and a balanced accuracy (BA) of 65.6%. The image model's performance metrics were lowest, with a WF1 of 76.5% and a BA of 63.8%. The top-performing merged model achieved a WF1 score of 78.3% and a BA of 68.2%. Our two-stage prediction model provided improved results based on performance metrics, suggesting potential for application in clinical settings.
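A two-stage design of this kind typically routes each sample through a first classifier (healthy vs any KOA) and, only when KOA is predicted, through a second one that grades severity. A minimal sketch with two scikit-learn random forests on synthetic data (feature layout, sample counts, and the exact staging are assumptions, not the study's implementation):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.random((300, 10))             # stand-in for morphology + demographics
y = rng.integers(0, 3, size=300)      # 0 = healthy, 1 = moderate, 2 = severe

# Stage 1: healthy vs any KOA (binary target derived from the 3-class label)
stage1 = RandomForestClassifier(n_estimators=100, random_state=0)
stage1.fit(X, (y > 0).astype(int))

# Stage 2: moderate vs severe, trained only on the KOA knees
koa = y > 0
stage2 = RandomForestClassifier(n_estimators=100, random_state=0)
stage2.fit(X[koa], y[koa])

def predict(X_new):
    """Route each sample through both stages."""
    out = np.zeros(len(X_new), dtype=int)       # default: healthy
    has_koa = stage1.predict(X_new) == 1
    if has_koa.any():
        out[has_koa] = stage2.predict(X_new[has_koa])
    return out

preds = predict(X[:20])
print(preds)
```

Splitting the task this way lets each forest specialize, which is one plausible reason the merged two-stage model outperformed the single-source baselines.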

Towards Better Cephalometric Landmark Detection with Diffusion Data Generation

Dongqian Guo, Wencheng Han, Pang Lyu, Yuxi Zhou, Jianbing Shen

arXiv preprint · May 9 2025
Cephalometric landmark detection is essential for orthodontic diagnostics and treatment planning. Nevertheless, the scarcity of samples in data collection and the extensive effort required for manual annotation have significantly impeded the availability of diverse datasets. This limitation has restricted the effectiveness of deep learning-based detection methods, particularly those based on large-scale vision models. To address these challenges, we have developed an innovative data generation method capable of producing diverse cephalometric X-ray images along with corresponding annotations without human intervention. To achieve this, our approach initiates by constructing new cephalometric landmark annotations using anatomical priors. Then, we employ a diffusion-based generator to create realistic X-ray images that correspond closely with these annotations. To achieve precise control in producing samples with different attributes, we introduce a novel prompt cephalometric X-ray image dataset. This dataset includes real cephalometric X-ray images and detailed medical text prompts describing the images. By leveraging these detailed prompts, our method improves the generation process to control different styles and attributes. Facilitated by the large, diverse generated data, we introduce large-scale vision detection models into the cephalometric landmark detection task to improve accuracy. Experimental results demonstrate that training with the generated data substantially enhances the performance. Compared to methods without using the generated data, our approach improves the Success Detection Rate (SDR) by 6.5%, attaining a notable 82.2%. All code and data are available at: https://um-lab.github.io/cepha-generation
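The Success Detection Rate (SDR) reported above is simply the fraction of predicted landmarks that fall within a tolerance radius of the ground truth. A short numpy sketch (the 2 mm tolerance and the pixel spacing are assumptions; 2 mm is a common cephalometric threshold, but the paper's exact settings are not given here):

```python
import numpy as np

def success_detection_rate(pred, gt, tol_mm=2.0, mm_per_px=0.1):
    """Fraction of predicted landmarks within tol_mm of ground truth."""
    dists_mm = np.linalg.norm(pred - gt, axis=-1) * mm_per_px
    return float((dists_mm <= tol_mm).mean())

gt = np.array([[100.0, 120.0], [50.0, 60.0], [200.0, 180.0]])   # pixel coords
pred = gt + np.array([[5.0, 0.0], [30.0, 0.0], [0.0, 10.0]])    # offsets in px
sdr = success_detection_rate(pred, gt)
print(sdr)   # 2 of 3 landmarks within 2 mm
```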

APD-FFNet: A Novel Explainable Deep Feature Fusion Network for Automated Periodontitis Diagnosis on Dental Panoramic Radiography.

Resul ES, Senirkentli GB, Bostanci E, Oduncuoglu BF

PubMed · May 9 2025
This study introduces APD-FFNet, a novel, explainable deep learning architecture for automated periodontitis diagnosis using panoramic radiographs. A total of 337 panoramic radiographs, annotated by a periodontist, served as the dataset. APD-FFNet combines custom convolutional and transformer-based layers within a deep feature fusion framework that captures both local and global contextual features. Performance was evaluated using accuracy, the F1 score, the area under the receiver operating characteristic curve, the Jaccard similarity coefficient, and the Matthews correlation coefficient. McNemar's test confirmed statistical significance, and SHapley Additive exPlanations provided interpretability insights. APD-FFNet achieved 94% accuracy, a 93.88% F1 score, 93.47% area under the receiver operating characteristic curve, 88.47% Jaccard similarity coefficient, and 88.46% Matthews correlation coefficient, surpassing comparable approaches. McNemar's test validated these findings (p < 0.05). Explanations generated by SHapley Additive exPlanations highlighted important regions in each radiograph, supporting clinical applicability. By merging convolutional and transformer-based layers, APD-FFNet establishes a new benchmark in automated, interpretable periodontitis diagnosis, with low hyperparameter sensitivity facilitating its integration into regular dental practice. Its adaptable design suggests broader relevance to other medical imaging domains. This is the first feature fusion method specifically devised for periodontitis diagnosis, supported by an expert-curated dataset and advanced explainable artificial intelligence. Its robust accuracy, low hyperparameter sensitivity, and transparent outputs set a new standard for automated periodontal analysis.
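McNemar's test, used above to establish significance, compares two classifiers on the same cases using only the discordant pairs (cases one model gets right and the other wrong). A self-contained sketch of the continuity-corrected chi-square form with 1 degree of freedom; the counts are hypothetical, not the study's:

```python
import math

def mcnemar_p(b, c):
    """Continuity-corrected McNemar chi-square test (1 df).
    b, c: discordant counts (model A right / B wrong, and vice versa)."""
    if b + c == 0:
        return 1.0
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: p = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical discordant counts between two models on the same radiographs
p = mcnemar_p(b=25, c=8)
print(f"p = {p:.4f}")
```

When the two discordant counts are balanced the test correctly returns a large p-value, since neither model systematically beats the other.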

Deep learning approach based on a patch residual for pediatric supracondylar subtle fracture detection.

Ye Q, Wang Z, Lou Y, Yang Y, Hou J, Liu Z, Liu W, Li J

PubMed · May 8 2025
Supracondylar humerus fractures in children are among the most common elbow fractures in pediatrics. However, their diagnosis can be particularly challenging due to the anatomical characteristics and imaging features of the pediatric skeleton. In recent years, convolutional neural networks (CNNs) have achieved notable success in medical image analysis, though their performance typically relies on large-scale, high-quality labeled datasets. Unfortunately, labeled samples for pediatric supracondylar fractures are scarce and difficult to obtain. To address this issue, this paper introduces a deep learning-based multi-scale patch residual network (MPR) for the automatic detection and localization of subtle pediatric supracondylar fractures. The MPR framework combines a CNN for automatic feature extraction with a multi-scale generative adversarial network to model skeletal integrity using healthy samples. By leveraging healthy images to learn the normal skeletal distribution, the approach reduces the dependency on labeled fracture data and effectively addresses the challenges posed by limited pediatric datasets. Datasets from two different hospitals were used, with data augmentation techniques applied during both training and validation. On an independent test set, the proposed model achieves an accuracy of 90.5%, with 89% sensitivity, 92% specificity, and an F1 score of 0.906, outperforming the diagnostic accuracy of emergency medicine physicians and approaching that of pediatric radiologists. Furthermore, the model demonstrates a fast inference speed of 1.1 s per image, underscoring its substantial potential for clinical application.
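The accuracy, sensitivity, specificity, and F1 figures above all derive from a single confusion matrix. A small sketch of the arithmetic, with hypothetical counts chosen so the metrics land near the values quoted (these are not the study's actual counts):

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard detection metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)          # recall on fracture cases
    specificity = tn / (tn + fp)          # recall on healthy cases
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, f1

# Hypothetical test-set counts, for illustration only
acc, sens, spec, f1 = binary_metrics(tp=89, fp=8, tn=92, fn=11)
print(f"acc={acc:.3f} sens={sens:.3f} spec={spec:.3f} F1={f1:.3f}")
```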

Machine learning-based approaches for distinguishing viral and bacterial pneumonia in paediatrics: A scoping review.

Rickard D, Kabir MA, Homaira N

PubMed · May 8 2025
Pneumonia is the leading cause of hospitalisation and mortality among children under five, particularly in low-resource settings. Accurate differentiation between viral and bacterial pneumonia is essential for guiding appropriate treatment, yet it remains challenging due to overlapping clinical and radiographic features. Advances in machine learning (ML), particularly deep learning (DL), have shown promise in classifying pneumonia using chest X-ray (CXR) images. This scoping review summarises the evidence on ML techniques for classifying viral and bacterial pneumonia using CXR images in paediatric patients. This scoping review was conducted following the Joanna Briggs Institute methodology and the PRISMA-ScR guidelines. A comprehensive search was performed in PubMed, Embase, and Scopus to identify studies involving children (0-18 years) with pneumonia diagnosed through CXR, using ML models for binary or multiclass classification. Data extraction included ML models, dataset characteristics, and performance metrics. A total of 35 studies, published between 2018 and 2025, were included in this review. Of these, 31 studies used the publicly available Kermany dataset, raising concerns about overfitting and limited generalisability to broader, real-world clinical populations. Most studies (n=33) used convolutional neural networks (CNNs) for pneumonia classification. While many models demonstrated promising performance, significant variability was observed due to differences in methodologies, dataset sizes, and validation strategies, complicating direct comparisons. For binary classification (viral vs bacterial pneumonia), a median accuracy of 92.3% (range: 80.8% to 97.9%) was reported. For multiclass classification (healthy, viral pneumonia, and bacterial pneumonia), the median accuracy was 91.8% (range: 76.8% to 99.7%). 
Current evidence is constrained by a predominant reliance on a single dataset and variability in methodologies, which limit the generalisability and clinical applicability of findings. To address these limitations, future research should focus on developing diverse and representative datasets while adhering to standardised reporting guidelines. Such efforts are essential to improve the reliability, reproducibility, and translational potential of machine learning models in clinical settings.

An automated hip fracture detection, classification system on pelvic radiographs and comparison with 35 clinicians.

Yilmaz A, Gem K, Kalebasi M, Varol R, Gencoglan ZO, Samoylenko Y, Tosyali HK, Okcu G, Uvet H

PubMed · May 8 2025
Accurate diagnosis of orthopedic injuries, especially pelvic and hip fractures, is vital in trauma management. While pelvic radiographs (PXRs) are widely used, misdiagnosis is common. This study proposes an automated system that uses convolutional neural networks (CNNs) to detect potential fracture areas and predict fracture conditions, aiming to outperform traditional object detection-based systems. We developed two deep learning models for hip fracture detection and prediction, trained on PXRs from three hospitals. The first model utilized automated hip area detection, cropping, and classification of the resulting patches. The images were preprocessed using the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm. The YOLOv5 architecture was employed for the object detection model, while three different pre-trained deep neural network (DNN) architectures were used for classification, applying transfer learning. Their performance was evaluated on a test dataset, and compared with 35 clinicians. YOLOv5 achieved a 92.66% accuracy on regular images and 88.89% on CLAHE-enhanced images. The classifier models, MobileNetV2, Xception, and InceptionResNetV2, achieved accuracies between 94.66% and 97.67%. In contrast, the clinicians demonstrated a mean accuracy of 84.53% and longer prediction durations. The DNN models showed significantly better accuracy and speed compared to human evaluators (p < 0.0005, p < 0.01). These DNN models highlight promising utility in trauma diagnosis due to their high accuracy and speed. Integrating such systems into clinical practices may enhance the diagnostic efficiency of PXRs.
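CLAHE, the preprocessing step named above, is adaptive histogram equalization applied per tile with a clip limit (the usual implementation is OpenCV's `cv2.createCLAHE`). As a dependency-free illustration of the underlying idea, here is plain global histogram equalization in numpy; this is a deliberate simplification, since CLAHE additionally tiles the image and clips the histogram before equalizing:

```python
import numpy as np

def equalize_hist(img):
    """Global histogram equalization of an 8-bit image (simplified CLAHE:
    no tiling, no clip limit)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    # Map each grey level through the normalized cumulative distribution
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[img]

rng = np.random.default_rng(2)
img = rng.integers(100, 156, size=(64, 64), dtype=np.uint8)  # low-contrast patch
eq = equalize_hist(img)
print(img.min(), img.max(), "->", eq.min(), eq.max())
```

The equalized image stretches the narrow 100-155 intensity band to the full 0-255 range, which is the contrast boost that helps downstream fracture detectors.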

Chest X-Ray Visual Saliency Modeling: Eye-Tracking Dataset and Saliency Prediction Model.

Lou J, Wang H, Wu X, Ng JCH, White R, Thakoor KA, Corcoran P, Chen Y, Liu H

PubMed · May 8 2025
Radiologists' eye movements during medical image interpretation reflect their perceptual-cognitive processes of diagnostic decisions. The eye movement data can be modeled to represent clinically relevant regions in a medical image and potentially integrated into an artificial intelligence (AI) system for automatic diagnosis in medical imaging. In this article, we first conduct a large-scale eye-tracking study involving 13 radiologists interpreting 191 chest X-ray (CXR) images, establishing a best-of-its-kind CXR visual saliency benchmark. We then perform analysis to quantify the reliability and clinical relevance of saliency maps (SMs) generated for CXR images. We develop CXR image saliency prediction method (CXRSalNet), a novel saliency prediction model that leverages radiologists' gaze information to optimize the use of unlabeled CXR images, enhancing training and mitigating data scarcity. We also demonstrate the application of our CXR saliency model in enhancing the performance of AI-powered diagnostic imaging systems.
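A common way to turn discrete gaze fixations into a continuous saliency map, as in eye-tracking benchmarks like the one described, is to accumulate fixation counts into an image grid, smooth with a Gaussian, and normalize. A minimal sketch (the sigma and the fixation coordinates are illustrative assumptions, not the paper's pipeline):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixations_to_saliency(fixations, shape, sigma=15.0):
    """Accumulate (x, y) fixation points into a map, blur, normalize to sum 1."""
    sal = np.zeros(shape, dtype=float)
    for x, y in fixations:
        sal[y, x] += 1.0                 # row = y, col = x
    sal = gaussian_filter(sal, sigma=sigma)
    return sal / sal.sum()

fix = [(40, 60), (42, 58), (38, 62), (120, 30)]   # (x, y) pixel fixations
sal = fixations_to_saliency(fix, shape=(128, 160))
peak = np.unravel_index(sal.argmax(), sal.shape)
print("peak at (row, col):", peak)
```

The peak lands on the three-fixation cluster rather than the isolated fixation, which is the behavior a clinically relevant saliency ground truth should have.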

Early budget impact analysis of AI to support the review of radiographic examinations for suspected fractures in NHS emergency departments (ED).

Gregory L, Boodhna T, Storey M, Shelmerdine S, Novak A, Lowe D, Harvey H

PubMed · May 7 2025
To develop an early budget impact analysis of, and inform future research on, the national adoption of a commercially available AI application to support clinicians reviewing radiographs for suspected fractures across NHS emergency departments in England. A decision tree framework was coded to assess a change in outcomes for suspected fractures in adults when AI fracture detection was integrated into clinical workflow over a 1-year time horizon. Standard of care was the comparator scenario, and the ground truth reference cases were characterised by radiology report findings. The effect of AI on assisting ED clinicians when detecting fractures was sourced from US literature. Data on resource use conditioned on the correct identification of a fracture in the ED was extracted from a London NHS trust. Sensitivity analysis was conducted to account for the influence of parameter uncertainty on results. In one year, an estimated 658,564 radiographs were performed in emergency departments across England for suspected wrist, ankle or hip fractures. The number of patients returning to the ED with a missed fracture was reduced by 21,674 cases, along with a reduction of 20,916 unnecessary referrals to fracture clinics. The cost of current practice was estimated at £66,646,542, versus £63,012,150 with the integration of AI, generating an overall return on investment of £3,634,392 to the NHS. The adoption of AI in EDs across England has the potential to generate cost savings. However, additional evidence on radiograph review accuracy and subsequent resource use is required to further demonstrate this.
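At its core, a budget impact model like this compares the expected annual cost of two pathways over a decision tree. A toy sketch of that comparison; every number below is a hypothetical placeholder, not one of the study's inputs:

```python
def expected_cost(n_exams, cost_exam, miss_rate, cost_missed,
                  referral_rate, cost_referral):
    """Expected annual pathway cost: exams + missed-fracture ED returns
    + fracture-clinic referrals (hypothetical cost structure)."""
    return (n_exams * cost_exam
            + n_exams * miss_rate * cost_missed
            + n_exams * referral_rate * cost_referral)

N = 650_000   # placeholder annual volume of suspected-fracture radiographs

soc = expected_cost(N, cost_exam=30.0, miss_rate=0.05, cost_missed=250.0,
                    referral_rate=0.20, cost_referral=120.0)
ai = expected_cost(N, cost_exam=32.0,   # +£2 per exam for the AI licence
                   miss_rate=0.02, cost_missed=250.0,
                   referral_rate=0.17, cost_referral=120.0)

savings = soc - ai
print(f"estimated annual saving: £{savings:,.0f}")
```

The structure makes the study's conclusion concrete: the per-exam cost of AI is recouped only if the reductions in missed fractures and unnecessary referrals are large enough, which is exactly the evidence the authors flag as still needed.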

OA-HybridCNN (OHC): An advanced deep learning fusion model for enhanced diagnostic accuracy in knee osteoarthritis imaging.

Liao Y, Yang G, Pan W, Lu Y

PubMed · Jan 1 2025
Knee osteoarthritis (KOA) is a leading cause of disability globally. Early and accurate diagnosis is paramount in preventing its progression and improving patients' quality of life. However, the inconsistency in radiologists' expertise and the onset of visual fatigue during prolonged image analysis often compromise diagnostic accuracy, highlighting the need for automated diagnostic solutions. In this study, we present an advanced deep learning model, OA-HybridCNN (OHC), which integrates ResNet and DenseNet architectures. This integration effectively addresses the gradient vanishing issue in DenseNet and augments prediction accuracy. To evaluate its performance, we conducted a thorough comparison with other deep learning models using five-fold cross-validation and external tests. The OHC model outperformed its counterparts across all performance metrics. In external testing, OHC exhibited an accuracy of 91.77%, precision of 92.34%, and recall of 91.36%. During the five-fold cross-validation, its average AUC and ACC were 86.34% and 87.42%, respectively. Deep learning, particularly exemplified by the OHC model, has greatly improved the efficiency and accuracy of KOA imaging diagnosis. The adoption of such technologies not only alleviates the burden on radiologists but also significantly enhances diagnostic precision.
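The five-fold cross-validation protocol used to evaluate OHC can be sketched in a few lines of scikit-learn. Synthetic features and a plain logistic regression stand in for the image data and the deep model; only the evaluation scheme itself is the point here:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for extracted image features; a logistic regression
# stands in for the deep model purely to illustrate the protocol.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="accuracy")
print(f"fold accuracies: {np.round(scores, 3)}; mean = {scores.mean():.3f}")
```

Stratified folds keep the class balance constant across splits, which matters when, as in KOA grading, the classes are uneven.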