Page 23 of 35341 results

A fully open AI foundation model applied to chest radiography.

Ma D, Pang J, Gotway MB, Liang J

PubMed · Jun 11, 2025
Chest radiography frequently serves as baseline imaging for most lung diseases<sup>1</sup>. Deep learning has great potential for automating the interpretation of chest radiography<sup>2</sup>. However, existing chest radiographic deep learning models are limited in diagnostic scope, generalizability, adaptability, robustness and extensibility. To overcome these limitations, we have developed Ark<sup>+</sup>, a foundation model applied to chest radiography and pretrained by cyclically accruing and reusing the knowledge from heterogeneous expert labels in numerous datasets. Ark<sup>+</sup> excels in diagnosing thoracic diseases. It expands the diagnostic scope and addresses potential misdiagnosis. It can adapt to evolving diagnostic needs and respond to novel diseases. It can learn rare conditions from a few samples and transfer to new diagnostic settings without training. It tolerates data biases and long-tailed distributions, and it supports federated learning to preserve privacy. All code and pretrained models have been released, so that Ark<sup>+</sup> is open for fine-tuning, local adaptation and improvement. It is extensible to several modalities. Thus, it is a foundation model for medical imaging. The exceptional capabilities of Ark<sup>+</sup> stem from our insight: aggregating various datasets diversifies the patient populations and accrues knowledge from many experts to yield unprecedented performance while reducing annotation costs<sup>3</sup>. The development of Ark<sup>+</sup> reveals that open models trained by accruing and reusing knowledge from heterogeneous expert annotations with a multitude of public (big or small) datasets can surpass the performance of proprietary models trained on large data.
We hope that our findings will inspire more researchers to share code and datasets or federate privacy-preserving data to create open foundation models with diverse, global expertise and patient populations, thus accelerating open science and democratizing AI for medicine.

Comparative accuracy of two commercial AI algorithms for musculoskeletal trauma detection in emergency radiographs.

Huhtanen JT, Nyman M, Blanco Sequeiros R, Koskinen SK, Pudas TK, Kajander S, Niemi P, Aronen HJ, Hirvonen J

PubMed · Jun 9, 2025
Missed fractures are the primary cause of interpretation errors in emergency radiology, and artificial intelligence has recently shown great promise in radiograph interpretation. This study compared the diagnostic performance of two AI algorithms, BoneView and RBfracture, in detecting traumatic abnormalities (fractures and dislocations) in MSK radiographs. AI algorithms analyzed 998 radiographs (585 normal, 413 abnormal), against the consensus of two MSK specialists. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and interobserver agreement (Cohen's Kappa) were calculated. 95% confidence intervals (CI) assessed robustness, and McNemar's tests compared sensitivity and specificity between the AI algorithms. BoneView demonstrated a sensitivity of 0.893 (95% CI: 0.860-0.920), specificity of 0.885 (95% CI: 0.857-0.909), PPV of 0.846, NPV of 0.922, and accuracy of 0.889. RBfracture demonstrated a sensitivity of 0.872 (95% CI: 0.836-0.901), specificity of 0.892 (95% CI: 0.865-0.915), PPV of 0.851, NPV of 0.908, and accuracy of 0.884. No statistically significant differences were found in sensitivity (p = 0.151) or specificity (p = 0.708). Kappa was 0.81 (95% CI: 0.77-0.84), indicating almost perfect agreement between the two AI algorithms. Performance was similar in adults and children. Both AI algorithms struggled more with subtle abnormalities, which constituted 66% and 70% of false negatives but only 20% and 18% of true positives for the two AI algorithms, respectively (p < 0.001). BoneView and RBfracture exhibited high diagnostic performance and almost perfect agreement, with consistent results across adults and children, highlighting the potential of AI in emergency radiograph interpretation.
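The diagnostic metrics and interobserver agreement reported above follow standard definitions. The sketch below shows how they are computed from raw counts; the example numbers are illustrative, not the study's data.

```python
# Standard diagnostic metrics from a 2x2 confusion matrix, plus Cohen's
# kappa for agreement between two raters. Counts here are illustrative.

def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, NPV and accuracy from raw counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

def cohens_kappa(a, b):
    """Cohen's kappa for two raters' binary labels (1 = abnormal)."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    pa = sum(a) / n                             # rater A positive rate
    pb = sum(b) / n                             # rater B positive rate
    pe = pa * pb + (1 - pa) * (1 - pb)          # chance agreement
    return (po - pe) / (1 - pe)
```

A kappa of 0.81, as reported for the two algorithms, sits at the lower bound of the conventional "almost perfect" band (0.81 to 1.00).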

Improving Patient Communication by Simplifying AI-Generated Dental Radiology Reports With ChatGPT: Comparative Study.

Stephan D, Bertsch AS, Schumacher S, Puladi B, Burwinkel M, Al-Nawas B, Kämmerer PW, Thiem DG

PubMed · Jun 9, 2025
Medical reports, particularly radiology findings, are often written for professional communication, making them difficult for patients to understand. This communication barrier can reduce patient engagement and lead to misinterpretation. Artificial intelligence (AI), especially large language models such as ChatGPT, offers new opportunities for simplifying medical documentation to improve patient comprehension. We aimed to evaluate whether AI-generated radiology reports simplified by ChatGPT improve patient understanding, readability, and communication quality compared to original AI-generated reports. In total, 3 versions of radiology reports were created using ChatGPT: an original AI-generated version (text 1), a patient-friendly, simplified version (text 2), and a further simplified and accessibility-optimized version (text 3). A total of 300 patients (n=100, 33.3% per group), excluding patients with medical education, were randomly assigned to review one text version and complete a standardized questionnaire. Readability was assessed using the Flesch Reading Ease (FRE) score and LIX indices. Both simplified texts showed significantly higher readability scores (text 1: FRE score=51.1; text 2: FRE score=55.0; and text 3: FRE score=56.4; P<.001) and lower LIX scores, indicating enhanced clarity. Text 3 had the shortest sentences, had the fewest long words, and scored best on all patient-rated dimensions. Questionnaire results revealed significantly higher ratings for texts 2 and 3 across clarity (P<.001), tone (P<.001), structure, and patient engagement. For example, patients rated the ability to understand findings without help highest for text 3 (mean 1.5, SD 0.7) and lowest for text 1 (mean 3.1, SD 1.4). Both simplified texts significantly improved patients' ability to prepare for clinical conversations and promoted shared decision-making. AI-generated simplification of radiology reports significantly enhances patient comprehension and engagement. 
These findings highlight the potential of ChatGPT as a tool to improve patient-centered communication. While promising, future research should focus on ensuring clinical accuracy and exploring applications across diverse patient populations to support equitable and effective integration of AI in health care communication.
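The two readability indices used in the study have simple closed forms. The sketch below shows both; note that counting syllables in practice requires language-specific heuristics, and FRE and LIX were originally calibrated for particular languages, so adapted variants may be needed for clinical reports in other languages.

```python
# Flesch Reading Ease (higher = easier) and LIX (lower = easier),
# computed from raw counts. The formulas are the standard ones; real
# pipelines need a tokenizer and a syllable counter on top of this.

def flesch_reading_ease(words, sentences, syllables):
    """FRE = 206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def lix(words, sentences, long_words):
    """LIX = average sentence length + percentage of words over 6 letters."""
    return words / sentences + 100 * long_words / words
```

For example, a text with 100 words, 10 sentences and 150 syllables scores an FRE of about 69.8, comfortably above the scores of roughly 51 to 56 reported for the three report versions.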

Advancing respiratory disease diagnosis: A deep learning and vision transformer-based approach with a novel X-ray dataset.

Alghadhban A, Ramadan RA, Alazmi M

PubMed · Jun 9, 2025
With the increasing prevalence of respiratory diseases such as pneumonia and COVID-19, timely and accurate diagnosis is critical. This paper makes significant contributions to the field of respiratory disease classification by utilizing X-ray images and advanced machine learning techniques such as deep learning (DL) and Vision Transformers (ViT). First, the paper systematically reviews the current diagnostic methodologies, analyzing the recent advancement in DL and ViT techniques through a comprehensive analysis of the review articles published between 2017 and 2024, excluding short reviews and overviews. The review not only analyses the existing knowledge but also identifies the critical gaps in the field as well as the lack of diversity of the comprehensive and diverse datasets for training the machine learning models. To address such limitations, the paper extensively evaluates DL-based models on publicly available datasets, analyzing key performance metrics such as accuracy, precision, recall, and F1-score. Our evaluations reveal that the current datasets are mostly limited to the narrow subsets of pulmonary diseases, which might lead to some challenges, including overfitting, poor generalization, and reduced possibility of using advanced machine learning techniques in real-world applications. For instance, DL and ViT models require extensive data for effective learning. The primary contribution of this paper is not only the review of the most recent articles and surveys of respiratory diseases and DL models, including ViT, but also introduces a novel, diverse dataset comprising 7867 X-ray images from 5263 patients across three local hospitals, covering 49 distinct pulmonary diseases. The dataset is expected to enhance DL and ViT model training and improve the generalization of those models in various real-world medical image scenarios. 
By addressing the data scarcity issue, this paper paves the way for more reliable and robust disease classification, improving clinical decision-making. Additionally, the article highlights the critical challenges that still need to be addressed, such as dataset bias and variations in X-ray image quality, as well as the need for further clinical validation. Furthermore, the study underscores the critical role of DL in medical diagnosis and highlights the necessity of comprehensive, well-annotated datasets to improve model robustness and clinical reliability. Through these contributions, the paper provides a foundation for future research on respiratory disease diagnosis using AI-driven methodologies. Although the paper aims to cover all work published between 2017 and 2024, this research has some limitations: foundational work published before 2017 falls outside the review period, and the rapid development of AI may make earlier methods less relevant.

Sex estimation from the variables of talocrural joint by using machine learning algorithms.

Ray A, Ray G, Kürtül İ, Şenol GT

PubMed · Jun 9, 2025
This study focused on sex determination from variables estimated on X-ray images of the talocrural joint using machine learning (ML) algorithms. The mediolateral diameter of the tibia (TMLD) and fibula (FMLD), the distance between the innermost points of the talocrural joint (DIT), the distance between the outermost points of the talocrural joint (DOT), and the distal articular surface of the tibia (TAS), estimated from X-ray images of 150 women and 150 men, were evaluated with different ML methods. Logistic Regression, Decision Tree, K-Nearest Neighbors, Linear Discriminant Analysis, Naive Bayes and Random Forest classifiers were used as algorithms. The ML models achieved accuracies between 82% and 92%. The highest accuracy was achieved with the Random Forest classifier (RFC). DOT was the variable that contributed most to the model. Except for age and FMLD, the variables were statistically significant with respect to sex differences. The variables of the talocrural joint were classified with high accuracy in terms of sex. In addition, morphometric data about the population were obtained and racial differences were emphasized.

Lack of children in public medical imaging data points to growing age bias in biomedical AI

Hua, S. B. Z., Heller, N., He, P., Towbin, A. J., Chen, I., Lu, A., Erdman, L.

medRxiv preprint · Jun 7, 2025
Artificial intelligence (AI) is rapidly transforming healthcare, but its benefits are not reaching all patients equally. Children remain overlooked, with only 17% of FDA-approved medical AI devices labeled for pediatric use. In this work, we demonstrate that this exclusion may stem from a fundamental data gap. Our systematic review of 181 public medical imaging datasets reveals that children represent just under 1% of available data, while the majority of machine learning imaging conference papers we surveyed utilized publicly available data for methods development. As with other systematic biases in model development, past studies have demonstrated that pediatric representation in training data is essential for model performance in pediatric populations. We add to these findings, showing that adult-trained chest radiograph models exhibit significant age bias when applied to pediatric populations, with higher false positive rates in younger children. This work underscores the urgent need for increased pediatric representation in publicly accessible medical datasets. We provide actionable recommendations for researchers, policymakers, and data curators to address this age equity gap and ensure AI benefits patients of all ages. One- to two-sentence summary: Our analysis reveals a critical healthcare age disparity: children represent less than 1% of public medical imaging datasets. This gap in representation leads to biased predictions across medical image foundation models, with the youngest patients facing the highest risk of misdiagnosis.

UANV: UNet-based attention network for thoracolumbar vertebral compression fracture angle measurement.

Lee Y, Kim J, Lee KC, An S, Cho Y, Ahn KS, Hur JW

PubMed · Jun 6, 2025
Kyphosis is a prevalent spinal condition where the spine curves in the sagittal plane, resulting in spine deformities. Curvature estimation provides a powerful index to assess the deformation severity of scoliosis. In current clinical diagnosis, the standard curvature estimation method for quantitatively assessing the curvature is performed by measuring the vertebral angle, which is the angle between two lines, drawn perpendicular to the upper and lower endplates of the involved vertebra. However, manual Cobb angle measurement requires considerable time and effort, along with associated problems such as interobserver and intraobserver variations. Hence, in this study, we propose UNet-based Attention Network for Thoracolumbar Vertebral Compression Fracture Angle (UANV), a vertebra angle measuring model using lateral spinal X-ray based on a deep convolutional neural network (CNN). Specifically, we considered the detailed shape of each vertebral body with an attention mechanism and then recorded each edge of each vertebra to calculate vertebrae angles.
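Once the endplate landmarks are located, the geometry of the angle measurement is straightforward: the angle between the two perpendiculars described above equals the angle between the endplate lines themselves. The sketch below shows that final step; the landmark coordinates are made up, and this is not the paper's pipeline, only the standard Cobb-style geometry it builds on.

```python
import math

# Hypothetical final step of an automated Cobb-style measurement: given
# the upper and lower endplates as line segments (two landmarks each),
# the vertebral angle is the acute angle between the two lines.

def line_angle(p1, p2):
    """Orientation of the line through p1 and p2, in degrees."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))

def vertebral_angle(upper, lower):
    """Acute angle between the upper and lower endplate lines."""
    d = abs(line_angle(*upper) - line_angle(*lower)) % 180
    return min(d, 180 - d)  # report the acute angle, as in Cobb measurement

# A horizontal upper endplate vs. a lower endplate tilted by 12 degrees:
angle = vertebral_angle(((0, 0), (10, 0)), ((0, 5), (10, 7.1255656167)))
```

Automating only this step still leaves the hard part, landmark localization, to the network, which is where the interobserver variation the abstract mentions originates.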

A Machine Learning Method to Determine Candidates for Total and Unicompartmental Knee Arthroplasty Based on a Voting Mechanism.

Zhang N, Zhang L, Xiao L, Li Z, Hao Z

PubMed · Jun 5, 2025
Knee osteoarthritis (KOA) is a prevalent condition. Accurate selection between total knee arthroplasty (TKA) and unicompartmental knee arthroplasty (UKA) is crucial for optimal treatment in patients who have end-stage KOA, particularly for improving clinical outcomes and reducing healthcare costs. This study proposes a machine learning model based on a voting mechanism to enhance the accuracy of surgical decision-making for KOA patients. Radiographic data were collected from a high-volume joint arthroplasty practice, focusing on anterior-posterior, lateral, and skyline X-ray views. The dataset included 277 TKA and 293 UKA cases, each labeled through intraoperative observations (indicating whether TKA or UKA was the appropriate choice). A five-fold cross-validation approach was used for training and validation. In the proposed method, three base models were first trained independently on single-view images, and a voting mechanism was implemented to aggregate model outputs. The performance of the proposed method was evaluated by using metrics such as accuracy and the area under the receiver operating characteristic curve (AUC). The proposed method achieved an accuracy of 94.2% and an AUC of 0.98, demonstrating superior performance compared to existing models. The voting mechanism enabled base models to effectively utilize the detailed features from all three X-ray views, leading to enhanced predictive accuracy and model interpretability. This study provides a high-accuracy method for surgical decision-making between TKA and UKA for KOA patients, requiring only standard X-rays and offering potential for clinical application in automated referrals and preoperative planning.
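The aggregation step the abstract describes can be as simple as a majority vote over the three per-view predictions. The sketch below assumes hard (label) voting over three binary classifiers; the function name and label strings are illustrative, and the paper's exact aggregation rule (hard vs. soft voting) is not specified in the abstract.

```python
from collections import Counter

# Minimal hard-voting sketch: three per-view models (AP, lateral,
# skyline) each predict "TKA" or "UKA"; the majority label wins.
# With three voters and two labels, a tie cannot occur.

def vote(predictions):
    """Return the most common label among the per-view predictions."""
    return Counter(predictions).most_common(1)[0][0]

decision = vote(["TKA", "UKA", "TKA"])  # two of three views favour TKA
```

Soft voting, averaging each model's predicted probabilities before thresholding, is a common alternative when the base models output calibrated scores.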

Artificial intelligence-based detection of dens invaginatus in panoramic radiographs.

Sarı AH, Sarı H, Magat G

PubMed · Jun 5, 2025
The aim of this study was to automatically detect teeth with dens invaginatus (DI) in panoramic radiographs using deep learning algorithms and to compare the success of the algorithms. For this purpose, 400 panoramic radiographs with DI were collected from the faculty database and separated into 60% training, 20% validation and 20% test images. The training and validation images were labeled by oral, dental and maxillofacial radiologists and augmented with various augmentation methods; the trained models were then evaluated on the images allocated for the test phase, and the results were assessed according to performance measures including precision, sensitivity, F1 score and mean detection time. According to the test results, YOLOv8 achieved a precision, sensitivity and F1 score of 0.904 and was the fastest detection model, with an average detection time of 0.041 s. The Faster R-CNN model achieved 0.912 precision, 0.904 sensitivity and 0.907 F1 score, with an average detection time of 0.1 s. The YOLOv9 algorithm showed the most successful performance with 0.946 precision, 0.930 sensitivity and 0.937 F1 score, and the average detection speed per image was 0.158 s. According to the results obtained, all models achieved over 90% success. YOLOv8 was relatively more successful in detection speed and YOLOv9 in the other performance criteria. Faster R-CNN ranked second in all criteria.

Development of a deep learning model for measuring sagittal parameters on cervical spine X-ray.

Wang S, Li K, Zhang S, Zhang D, Hao Y, Zhou Y, Wang C, Zhao H, Ma Y, Zhao D, Chen J, Li X, Wang H, Li Z, Shi J, Wang X

PubMed · Jun 5, 2025
To develop a deep learning model to automatically measure the curvature-related sagittal parameters on cervical spinal X-ray images. This retrospective study collected a total of 700 lateral cervical spine X-ray images from three hospitals, consisting of 500 training images, 100 internal test images, and 100 external test images. Six parameters and 34 landmarks were measured and labeled by two doctors and averaged as the gold standard. A convolutional neural network (CNN) model was built by training on 500 images and testing on 200 images. Statistical analysis was used to evaluate labeling differences and model performance. The percentages of the difference in distance between landmarks within 4 mm were 96.90% (Dr. A vs. Dr. B), 98.47% (Dr. A vs. model), and 97.31% (Dr. B vs. model); within 3 mm they were 94.88% (Dr. A vs. Dr. B), 96.43% (Dr. A vs. model), and 94.16% (Dr. B vs. model). The mean difference of the algorithmic model in labeling landmarks was 1.17 ± 1.14 mm. The mean absolute error (MAE) of the algorithmic model for the Borden method, Cervical curvature index (CCI), Vertebral centroid measurement cervical lordosis (CCL), C<sub>0</sub>-C<sub>7</sub> Cobb, C<sub>1</sub>-C<sub>7</sub> Cobb, and C<sub>2</sub>-C<sub>7</sub> Cobb in the test sets was 1.67 mm, 2.01%, 3.22°, 2.37°, 2.49°, and 2.81°, respectively; the symmetric mean absolute percentage error (SMAPE) was 20.06%, 21.68%, 20.02%, 6.68%, 5.28%, and 20.46%, respectively. Also, the algorithmic model of the six cervical sagittal parameters is in good agreement with the gold standard (intraclass correlation coefficient 0.983; p < 0.001). Our deep learning algorithmic model had high accuracy in recognizing the landmarks of the cervical spine and automatically measuring cervical spine-related parameters, which can help radiologists improve their diagnostic efficiency.
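The two error metrics reported for the cervical parameters are easy to state precisely. The sketch below uses the common symmetric form of SMAPE with the mean of |actual| and |predicted| in the denominator; several SMAPE variants exist, and the abstract does not specify which one the authors computed.

```python
# MAE and SMAPE between model measurements and the gold standard.
# This SMAPE variant divides by the mean of the absolute values; other
# definitions (e.g. dividing by the sum) differ by a factor of two.

def mae(y_true, y_pred):
    """Mean absolute error, in the parameter's own units (mm or degrees)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent."""
    return 100 / len(y_true) * sum(
        abs(t - p) / ((abs(t) + abs(p)) / 2) for t, p in zip(y_true, y_pred)
    )
```

Because SMAPE divides by the magnitude of the measurement, parameters that are typically small in absolute terms (such as curvature indices near zero in straight spines) can show large SMAPE values even when the MAE is clinically acceptable, which may explain the spread between roughly 5% and 21% across the six parameters.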
