Latest Papers on Radiology AI. Tags: Musculoskeletal, Order: Best Match, Limit: 10.

Artificial intelligence algorithm improves radiologists' bone age assessment accuracy.

Chang TY, Chou TY, Jen IA, Yuh YS

•papers•May 15 2025

Artificial intelligence (AI) algorithms can provide rapid and precise radiographic bone age (BA) assessment. This study assessed the effects of an AI algorithm on the BA assessment performance of radiologists and evaluated how automation bias could affect radiologists. In this prospective randomized crossover study, six radiologists with varying levels of experience (senior, mid-level, and junior) assessed cases from a test set of 200 standard BA radiographs. The test set was equally divided into two subsets: datasets A and B. Each radiologist assessed BA independently without AI assistance (A- B-) and with AI assistance (A+ B+). We used the mean of assessments made by two experts as the ground truth for accuracy assessment; subsequently, we calculated the mean absolute difference (MAD) between the radiologists' BA predictions and ground-truth BA and evaluated the proportion of estimates for which the absolute difference exceeded one year. Additionally, we compared the radiologists' performance under conditions of early AI assistance with their performance under conditions of delayed AI assistance; the radiologists were allowed to reject AI interpretations. The overall accuracy of senior, mid-level, and junior radiologists improved significantly with AI assistance compared with no AI assistance (MAD without vs. with AI: 0.74 vs. 0.46 years, p < 0.001; proportion of assessments for which the absolute difference exceeded 1 year: 24.0% vs. 8.4%, p < 0.001). The proportion of improved BA predictions with AI assistance (16.8%) was significantly higher than that of less accurate predictions with AI assistance (2.3%; p < 0.001). No consistent timing effect was observed between conditions of early and delayed AI assistance. Most disagreements between radiologists and AI occurred over images for patients aged ≤8 years. Senior radiologists had more disagreements than other radiologists. The AI algorithm improved the BA assessment accuracy of radiologists with varying experience levels.
Less experienced radiologists were more prone to automation bias.
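
The two accuracy measures described in this abstract can be illustrated with a short sketch. It assumes bone ages are expressed in years and that the ground truth is the per-case mean of two expert readings, as the abstract states; the function names and the sample values are hypothetical and are not taken from the study.

```python
from statistics import mean


def ground_truth(expert_a, expert_b):
    """Average two expert readings per case (ground-truth definition from the abstract)."""
    return [(a + b) / 2 for a, b in zip(expert_a, expert_b)]


def mean_absolute_difference(predicted, truth):
    """Mean of |prediction - ground truth| over all cases, in years."""
    return mean(abs(p - t) for p, t in zip(predicted, truth))


def proportion_exceeding(predicted, truth, threshold=1.0):
    """Share of cases whose absolute difference is larger than `threshold` years."""
    n_over = sum(abs(p - t) > threshold for p, t in zip(predicted, truth))
    return n_over / len(truth)


# Hypothetical readings for four radiographs (years); not study data.
expert_a = [5.0, 8.0, 12.0, 14.0]
expert_b = [5.5, 8.0, 11.0, 14.5]
reader = [5.0, 9.5, 11.5, 14.0]

truth = ground_truth(expert_a, expert_b)
mad = mean_absolute_difference(reader, truth)
over_one_year = proportion_exceeding(reader, truth)
print(f"MAD = {mad:.2f} years, share above 1 year = {over_one_year:.0%}")
```

Computing both measures separately for the unassisted and AI-assisted readings of the same cases would give the kind of paired comparison the abstract reports.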

X-Ray Classification Musculoskeletal Prospective Clinical Pilot Academic Lab

Segmentation of the thoracolumbar fascia in ultrasound imaging: a deep learning approach.

Bonaldi L, Pirri C, Giordani F, Fontanella CG, Stecco C, Uccheddu F

•papers•May 15 2025

Only in recent years has it been demonstrated that the thoracolumbar fascia is involved in low back pain (LBP), thus highlighting its implications for treatments. Furthermore, ultrasound examination is an easily accessible and non-invasive way to investigate the fascia in real time; to be reliable, however, it must overcome the challenges related to the configuration of the machine and the experience of the operator. Therefore, the lack of a clear understanding of the fascial system, combined with the limitations related to the setting of the ultrasound acquisition, has generated a gap that makes effective evaluation of the fascia difficult during clinical routine. The aim of the present work is to fill this gap by investigating the effectiveness of using a deep learning approach to segment the thoracolumbar fascia from ultrasound imaging. A total of 538 ultrasound images of the thoracolumbar fascia of LBP subjects were used to train and test a deep learning network. An additional test set (so-called Test set 2) was collected from another center, operator, machine manufacturer, patient cohort, and protocol to improve the generalizability of the study. A U-Net-based architecture was able to segment these structures with a final training accuracy of 0.99 and a validation accuracy of 0.91. The accuracy of the prediction computed on a test set (87 images not included in the training set) reached 0.94, with a mean intersection over union index of 0.82 and a Dice score of 0.76. These latter metrics were outperformed by those in Test set 2. The validity of the predictions was also verified and confirmed by two expert clinicians. Automatic identification of the thoracolumbar fascia has shown promising results for thoroughly investigating its alteration and targeting a personalized rehabilitation intervention based on each patient-specific scenario.
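
For readers unfamiliar with the overlap metrics quoted in this abstract, the following is a minimal sketch of how intersection over union and the Dice score are commonly computed for a pair of binary masks. The masks and the function name are hypothetical; how the authors averaged the metrics over images or classes is not stated in the abstract and is not modelled here.

```python
def overlap_metrics(pred, target):
    """Return (dice, iou) for two binary masks given as nested lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(1 for a in p if a)
    target_area = sum(1 for b in t if b)
    union = pred_area + target_area - intersection

    if union == 0:
        # Both masks empty: treated here as perfect agreement (a convention, not a rule).
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Hypothetical 2x3 masks: 1 marks pixels labelled as fascia.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]

dice, iou = overlap_metrics(pred, target)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```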

Ultrasound Segmentation Musculoskeletal Retrospective Clinical In Silico Academic Lab

Large language models for efficient whole-organ MRI score-based reports and categorization in knee osteoarthritis.

Xie Y, Hu Z, Tao H, Hu Y, Liang H, Lu X, Wang L, Li X, Chen S

•papers•May 14 2025

To evaluate the performance of large language models (LLMs) in automatically generating whole-organ MRI score (WORMS)-based structured MRI reports and predicting osteoarthritis (OA) severity for the knee. A total of 160 consecutive patients suspected of OA were included. Knee MRI reports were reviewed by three radiologists to establish the WORMS reference standard for 39 key features. GPT-4o and GPT-4o mini were prompted using in-context knowledge (ICK) and chain-of-thought (COT) to generate WORMS-based structured reports from original reports and to automatically predict the OA severity. Four orthopedic surgeons reviewed original and LLM-generated reports to conduct pairwise preference and difficulty tests, and their review times were recorded. GPT-4o demonstrated perfect performance in extracting the laterality of the knee (accuracy = 100%). GPT-4o outperformed GPT-4o mini in generating WORMS reports (accuracy: 93.9% vs 76.2%, respectively). GPT-4o achieved higher recall (87.3% vs 46.7%, p < 0.001) while maintaining higher precision than GPT-4o mini (94.2% vs 71.2%, p < 0.001). For predicting OA severity, GPT-4o outperformed GPT-4o mini across all prompt strategies (best accuracy: 98.1% vs 68.7%). Surgeons found it easier to extract information from LLM-generated reports and preferred them over the original reports (both p < 0.001) while spending less time on each report (51.27 ± 9.41 vs 87.42 ± 20.26 s, p < 0.001). GPT-4o generated expert multi-feature, WORMS-based reports from original free-text knee MRI reports. GPT-4o with COT achieved high accuracy in categorizing OA severity. Surgeons reported greater preference and higher efficiency when using LLM-generated reports. The perfect performance of generating WORMS-based reports and the high efficiency and ease of use suggest that integrating LLMs into clinical workflows could greatly enhance productivity and alleviate the documentation burden faced by clinicians in knee OA.
GPT-4o successfully generated WORMS-based knee MRI reports. GPT-4o with COT prompting achieved impressive accuracy in categorizing knee OA severity. Greater preference and higher efficiency were reported for LLM-generated reports.
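
The precision and recall figures above can be read as follows. This is a minimal sketch that treats each extracted item as a present/absent finding compared against a reference standard; that simplification, the function name, and the finding labels are assumptions for illustration, since the abstract does not describe how the 39 WORMS features were scored.

```python
def precision_recall(predicted, reference):
    """Precision and recall of a set of predicted findings against a reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical finding labels; not the study's feature list.
reference = {"cartilage_loss_medial", "osteophyte_lateral",
             "meniscal_tear_medial", "effusion"}
predicted = {"cartilage_loss_medial", "osteophyte_lateral",
             "bone_marrow_lesion"}

precision, recall = precision_recall(predicted, reference)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In this reading, high precision means few reported findings are wrong, while high recall means few reference findings are missed.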

MRI LLM Radiology Report Musculoskeletal Retrospective Clinical In Silico Academic Lab GenAI

Synthetic Data-Enhanced Classification of Prevalent Osteoporotic Fractures Using Dual-Energy X-Ray Absorptiometry-Based Geometric and Material Parameters.

Quagliato L, Seo J, Hong J, Lee T, Chung YS

•papers•May 14 2025

Bone fracture risk assessment for osteoporotic patients is essential for implementing early countermeasures and preventing discomfort and hospitalization. Current methodologies, such as the Fracture Risk Assessment Tool (FRAX), provide a risk assessment over a 5- to 10-year period rather than evaluating the bone's current health status. The database was collected by Ajou University Medical Center from 2017 to 2021. It included 9,260 patients, aged 55 to 99, comprising 242 femur fracture (FX) cases and 9,018 non-fracture (NFX) cases. To model the association of the bone's current health status with prevalent FXs, three prediction algorithms, namely extreme gradient boosting (XGB), support vector machine, and multilayer perceptron, were trained using two-dimensional dual-energy X-ray absorptiometry (2D-DXA) analysis results and subsequently benchmarked. The XGB classifier, which proved most effective, was then further refined using synthetic data generated by the adaptive synthetic oversampler to balance the FX and NFX classes and enhance boundary sharpness for better classification accuracy. The XGB model trained on raw data demonstrated good prediction capabilities, with an area under the curve (AUC) of 0.78 and an F1 score of 0.71 on test cases. The inclusion of synthetic data improved classification accuracy in terms of both specificity and sensitivity, resulting in an AUC of 0.99 and an F1 score of 0.98. The proposed methodology demonstrates that current bone health can be assessed through post-processed results from 2D-DXA analysis. Moreover, it was also shown that synthetic data can help stabilize uneven databases by balancing majority and minority classes, thereby significantly improving classification performance.
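
To make the idea of balancing classes with synthetic samples concrete, here is a simplified sketch. It is not the adaptive synthetic oversampler used in the study: it only shows the basic interpolation step shared by SMOTE-style methods, whereas the adaptive variant additionally generates more samples around minority cases that have many majority-class neighbours. The function name, feature values, and class sizes are hypothetical.

```python
import math
import random


def oversample_minority(minority, n_new, seed=0):
    """Create n_new synthetic points by interpolating between a randomly chosen
    minority sample and its nearest minority-class neighbour."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbour = min(
            (m for m in minority if m is not base),
            key=lambda m: math.dist(base, m),
        )
        t = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature samples; not study data.
fx = [(0.60, 2.1), (0.62, 2.0), (0.58, 2.3)]
nfx = [(0.90, 3.0), (0.95, 3.2), (0.88, 2.9),
       (0.92, 3.1), (0.97, 3.3), (0.91, 3.0)]

fx_balanced = fx + oversample_minority(fx, len(nfx) - len(fx))
print(len(fx_balanced), len(nfx))
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class.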

X-Ray Classification Musculoskeletal Retrospective Clinical In Silico Academic Lab GenAI

Clinical utility of ultrasound and MRI in rheumatoid arthritis: An expert review.

Kellner DA, Morris NT, Lee SM, Baker JF, Chu P, Ranganath VK, Kaeley GS, Yang HH

•papers•May 14 2025

Musculoskeletal ultrasound (MSUS) and magnetic resonance imaging (MRI) are advanced imaging techniques that are increasingly important in the diagnosis and management of rheumatoid arthritis (RA) and have significantly enhanced the rheumatologist's ability to assess RA disease activity and progression. This review serves as a five-year update to our previous publication on the contemporary role of imaging in RA, emphasizing the continued importance of MSUS and MRI in clinical practice and their expanding utility. The review examines the role of MSUS in diagnosing RA, differentiating RA from mimickers, scoring systems and quality control measures, novel longitudinal approaches to disease monitoring, and patient populations that may benefit most from MSUS. It also examines the role of MRI in diagnosing pre-clinical and early RA, disease activity monitoring, research and clinical trials, and development of alternative scoring approaches utilizing artificial intelligence. Finally, the role of MRI in RA diagnosis and management is summarized, and selected practice points offer key tips for integrating MSUS and MRI into clinical practice.

Mixed Modality Classification Musculoskeletal Review Concept Academic Lab

Total radius BMD correlates with the hip and lumbar spine BMD among post-menopausal patients with fragility wrist fracture in a machine learning model.

Ruotsalainen T, Panfilov E, Thevenot J, Tiulpin A, Saarakkala S, Niinimäki J, Lehenkari P, Valkealahti M

•papers•May 14 2025

Osteoporosis screening should be systematic in the group of over 50-year-old females with a radius fracture. We tested a phantom combined with a machine learning model and studied osteoporosis-related variables. This machine learning model for screening osteoporosis using plain radiographs requires further investigation in larger cohorts to assess its potential as a replacement for DXA measurements in settings where DXA is not available. The main purpose of this study was to improve osteoporosis screening, especially in post-menopausal patients with fragility wrist fractures. The secondary objective was to increase understanding of the connection between osteoporosis and aging, as well as other risk factors. We collected data on 83 females > 50 years old with a distal radius fracture treated at Oulu University Hospital in 2019-2020. The data included basic patient information, WHO FRAX tool, blood tests, X-ray imaging of the fractured wrist, and DXA scanning of the non-fractured forearm, both hips, and the lumbar spine. Machine learning was used in combination with a custom phantom. Eighty-five percent of the study population had osteopenia or osteoporosis. Only 28.4% of patients had increased bone resorption activity measured by ICTP values. Total radius BMD correlated with other osteoporosis-related variables (age r = -0.494, BMI r = 0.273, FRAX osteoporotic fracture risk r = -0.419, FRAX hip fracture risk r = -0.433, hip BMD r = 0.435, and lumbar spine BMD r = 0.645), but the ultra distal (UD) radius BMD did not. Our custom phantom combined with a machine learning model showed potential for screening osteoporosis, with the class-wise accuracies for "Osteoporotic vs. osteopenic & normal bone" of 76% and 75%, respectively. We suggest osteoporosis screening for all females over 50 years old with wrist fractures. We found that the total radius BMD correlates with the central BMD.
Due to the limited sample size in the phantom and machine learning parts of the study, further research is needed to make a clinically useful tool for screening osteoporosis.

X-Ray Classification Musculoskeletal Retrospective Clinical Prototype Academic Lab

Rethinking femoral neck anteversion assessment: a novel automated 3D CT method compared to traditional manual techniques.

Xiao H, Yibulayimu S, Zhao C, Sang Y, Chen Y, Ge Y, Sun Q, Ming Y, Bei M, Zhu G, Song Y, Wang Y, Wu X

•papers•May 13 2025

To evaluate the accuracy and reliability of a novel automated 3D CT-based method for measuring femoral neck anteversion (FNA) compared to three traditional manual methods. A total of 126 femurs from 63 full-length CT scans (35 men and 28 women; average age: 52.0 ± 14.7 years) were analyzed. The automated method used a deep learning network for femur segmentation, landmark identification, and anteversion calculation, with results generated based on two axes: Auto_GT (using the greater trochanter-to-intercondylar notch center axis) and Auto_P (using the piriformis fossa-to-intercondylar notch center axis). These results were validated through manual landmark annotation. The same dataset was assessed using three conventional manual methods: Murphy, Reikeras, and Lee methods. Intra- and inter-observer reliability were assessed using intraclass correlation coefficients (ICCs), and pairwise comparisons analyzed correlations and differences between methods. The automated methods produced consistent FNA measurements (Auto_GT: 17.59 ± 9.16° vs. Auto_P: 17.37 ± 9.17° on the right; 15.08 ± 9.88° vs. 14.84 ± 9.90° on the left). Intra-observer ICCs ranged from 0.864 to 0.961, and inter-observer ICCs between Auto_GT and the manual methods were high, except for the Lee method. No significant differences were observed between the two automated methods or between the automated and manual verification methods. Moreover, strong correlations (R > 0.9, p < 0.001) were found between Auto_GT and the manual methods. The novel automated 3D CT-based method demonstrates strong reproducibility and reliability for measuring femoral neck anteversion, with performance comparable to traditional manual techniques. These results indicate its potential utility for preoperative planning, postoperative evaluation, and computer-assisted orthopedic procedures.

CT Segmentation Musculoskeletal Retrospective Clinical In Silico Academic Lab

Individual thigh muscle and proximal femoral features predict displacement in femoral neck Fractures: An AI-driven CT analysis.

Yoo JI, Kim HS, Kim DY, Byun DW, Ha YC, Lee YK

•papers•May 13 2025

Hip fractures, particularly among the elderly, impose a significant public health burden due to increased morbidity and mortality. Femoral neck fractures, commonly resulting from low-energy falls, can lead to severe complications such as avascular necrosis, and often necessitate total hip arthroplasty. This study harnesses AI to enhance musculoskeletal assessments by performing automatic muscle segmentation on whole thigh CT scans and detailed cortical measurements using the StradView program. The primary aim is to improve the prediction and prevention of severe femoral neck fractures, ultimately supporting more effective rehabilitation and treatment strategies. This study measured anatomical features from whole thigh CT scans of 60 femoral neck fracture patients. An AI-driven individual muscle segmentation model (Dice score of 0.84) segmented 27 muscles in the thigh region to calculate muscle volumes. Proximal femoral bone parameters were measured using StradView, including average cortical thickness, inner density, and FWHM at four regions. Correlation analysis evaluated relationships between muscle features, cortical parameters, and fracture displacement. Machine learning models (Random Forest, SVM, and Multi-layer Perceptron) predicted displacement using these variables. Correlation analysis showed significant associations between femoral neck displacement and trabecular density at the femoral neck/intertrochanter, as well as volumes of specific thigh muscles such as the Tensor fasciae latae. Machine learning models using a combined feature set of thigh muscle volumes and proximal femoral parameters performed best in predicting displacement, with the Random Forest model achieving an F1 score of 0.91 and the SVM model 0.93. Decreased volumes of the Tensor fasciae latae, Rectus femoris, and Semimembranosus muscles, coupled with reduced trabecular density at the femoral neck and intertrochanter, were significantly associated with increased fracture displacement.
Notably, our SVM model, integrating both muscle and femoral features, achieved the highest predictive performance. These findings underscore the critical importance of muscle strength and bone density in rehabilitation planning and highlight the potential of AI-driven predictive models for improving clinical outcomes in femoral neck fractures.

CT Segmentation Musculoskeletal Retrospective Clinical In Silico Academic Lab

The automatic pelvic screw corridor planning for intact pelvises based on deep learning deformable registration.

Ju F, Chai X, Zhao J, Dong M

•papers•May 13 2025

Percutaneous screw fixation in pelvic trauma surgery is an extremely challenging operation that typically requires a trial-and-error insertion process under the guidance of continuous intraoperative X-ray. This process can be simplified by utilizing surgical navigation systems. Understanding the complexity of the intraosseous pelvic corridor is essential for establishing the optimal screw corridor, which further facilitates preoperative planning and intraoperative application. Traditional screw corridor search algorithms necessitate traversing the entrance and exit areas of the screw and calculating the distance from the corridor axis to the bone surface to ascertain the location of the screw. This process is computationally complex, and manual measurement by the physician is time-consuming, labor-intensive, and experience-dependent. In this study, we propose an automated planning algorithm for pelvic screw corridors based on deep learning deformable registration technology, which can efficiently and accurately identify the optimal screw corridors. Compared to traditional methods, the innovations of this study include: (1) the introduction of corridor safety range constraints on screw positioning, which enhances search efficiency; (2) the application of deep learning deformable registration to facilitate the automatic annotation of the screw entrance and exit areas, as well as the safety range of the corridor; and (3) the development of a highly efficient algorithm for optimal corridor searching, quickly determining the corridor without traversing the entrance and exit areas and enhancing efficiency via a vector-based diameter calculation method. The whole framework of the algorithm consists of three key components: an atlas generation module, deformable registration, and an optimal corridor searching strategy.
In the experiments, we tested the performance of the proposed algorithm on 198 intact pelvises for calculating the optimal corridors for anterior column and S1 sacroiliac screws. The results show that the new algorithm can increase the corridor diameter by 2.1%-3.3% compared to manual measurements, while significantly reducing the average time from 1038 s and 3398 s to 18.9 s and 26.7 s for the anterior column corridor and S1 sacroiliac corridor, respectively, compared to the traditional screw searching algorithm. This demonstrates the advantages of the algorithm in terms of efficiency and accuracy. However, the current method is validated only on intact pelvises; further research is required for pelvic fracture scenarios.
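
The geometric step mentioned in this abstract, calculating the distance from the corridor axis to the bone surface, can be sketched with basic vector arithmetic. This is only an illustration under stated assumptions: the axis is treated as an infinite straight line through an entry and an exit point, the bone surface is a small list of 3D points, and coordinates are in millimetres. It is not the authors' vector-based diameter method, whose details the abstract does not give, and all names and values are hypothetical.

```python
import math


def point_to_axis_distance(point, entry, exit_):
    """Perpendicular distance from `point` to the line through `entry` and `exit_`."""
    axis = [e - s for s, e in zip(entry, exit_)]
    rel = [p - s for s, p in zip(entry, point)]
    # |axis x rel| / |axis| is the distance from the point to the line.
    cx = axis[1] * rel[2] - axis[2] * rel[1]
    cy = axis[2] * rel[0] - axis[0] * rel[2]
    cz = axis[0] * rel[1] - axis[1] * rel[0]
    axis_len = math.sqrt(sum(a * a for a in axis))
    if axis_len == 0:
        raise ValueError("entry and exit points must differ")
    return math.sqrt(cx * cx + cy * cy + cz * cz) / axis_len


def corridor_diameter(surface_points, entry, exit_):
    """Largest cylinder diameter around the axis that touches no surface point."""
    return 2 * min(point_to_axis_distance(p, entry, exit_) for p in surface_points)


# Hypothetical axis along z and three bone-surface points (mm).
entry, exit_ = (0.0, 0.0, 0.0), (0.0, 0.0, 100.0)
surface = [(4.0, 0.0, 10.0), (0.0, 5.0, 50.0), (-3.0, 4.0, 90.0)]

diameter = corridor_diameter(surface, entry, exit_)
print(f"corridor diameter = {diameter:.1f} mm")
```

A corridor search would evaluate this quantity for many candidate axes and keep the one with the largest diameter; restricting the candidates, as the safety range constraints in the abstract do, reduces how many axes need to be checked.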

CT Registration Musculoskeletal Retrospective Clinical In Silico Academic Lab

A comparison of performance of DeepSeek-R1 model-generated responses to musculoskeletal radiology queries against ChatGPT-4 and ChatGPT-4o - A feasibility study.

Uldin H, Saran S, Gandikota G, Iyengar KP, Vaishya R, Parmar Y, Rasul F, Botchu R

•papers•May 12 2025

Artificial Intelligence (AI) has transformed society, and chatbots using Large Language Models (LLMs) are playing an increasing role in scientific research. This study aims to assess and compare the efficacy of the newer DeepSeek-R1 model and the ChatGPT-4 and ChatGPT-4o models in answering scientific questions about recent research. We compared output generated from ChatGPT-4, ChatGPT-4o, and DeepSeek-R1 in response to ten standardized questions in the setting of musculoskeletal (MSK) radiology. These were independently analyzed by one MSK radiologist and one final-year MSK radiology trainee and graded using a Likert scale from 1 to 5 (1 being inaccurate to 5 being accurate). Five DeepSeek-R1 answers were significantly inaccurate and provided fictitious references only on prompting. All ChatGPT-4 and ChatGPT-4o answers were well-written with good content, the latter including useful and comprehensive references. ChatGPT-4o generates structured research answers to questions on recent MSK radiology research with useful references in all our cases, enabling reliable usage. DeepSeek-R1, on the other hand, generates articles that may appear authentic to the unsuspecting eye but contain a higher amount of falsified and inaccurate information in the current version. Further iterations may improve these accuracies.

Mixed Modality LLM Radiology Report Musculoskeletal Retrospective Clinical In Silico Academic Lab GenAI Reproducibility
