Page 29 of 33328 results

Update on the detection of frailty in older adults: a multicenter cohort machine learning-based study protocol.

Fernández-Carnero S, Martínez-Pozas O, Pecos-Martín D, Pardo-Gómez A, Cuenca-Zaldívar JN, Sánchez-Romero EA

PubMed · May 21, 2025
This study aims to investigate the relationship between muscle activation variables assessed via ultrasound and the comprehensive assessment of geriatric patients, as well as to analyze ultrasound images to determine their correlation with morbidity and mortality factors in frail patients. The present cohort study will be conducted in 500 older adults diagnosed with frailty. The multicenter study will be conducted across day care centers and nursing homes. This will be achieved through the evaluation of frail older adults via instrumental and functional tests, along with specific ultrasound images to study sarcopenia and nutrition, followed by a detailed analysis of the correlation between all collected variables. This study aims to investigate the correlation between ultrasound-assessed muscle activation variables and the overall health of geriatric patients. It addresses the limitations of previous research by including a large sample size of 500 patients and measuring various muscle parameters beyond thickness. Additionally, it aims to analyze ultrasound images to identify markers associated with a higher risk of complications in frail patients. The study involves frail older adults undergoing functional tests and specific ultrasound examinations. A comprehensive analysis of functional, ultrasound, and nutritional variables will be conducted to understand their correlation with overall health and risk of complications in frail older patients. The study was approved by the Research Ethics Committee of the Hospital Universitario Puerta de Hierro, Madrid, Spain (Act nº 18/2023). In addition, the study was registered at https://clinicaltrials.gov/ (NCT06218121).

Artificial Intelligence and Musculoskeletal Surgical Applications.

Oettl FC, Zsidai B, Oeding JF, Samuelsson K

PubMed · May 20, 2025
Artificial intelligence (AI) has emerged as a transformative force in orthopedic surgery. Potentially encompassing pre-, intra-, and postoperative processes, it can process complex medical imaging, provide real-time surgical guidance, and analyze large datasets for outcome prediction and optimization. AI has shown improvements in surgical precision, efficiency, and patient outcomes across orthopedic subspecialties, and large language models and agentic AI systems are expanding AI utility beyond surgical applications into areas such as clinical documentation, patient education, and autonomous decision support. The successful implementation of AI in orthopedic surgery requires careful attention to validation, regulatory compliance, and healthcare system integration. As these technologies continue to advance, maintaining the balance between innovation and patient safety remains crucial, with the ultimate goal of achieving more personalized, efficient, and equitable healthcare delivery while preserving the essential role of human clinical judgment. This review examines the current landscape and future trajectory of AI applications in orthopedic surgery, highlighting both technological advances and their clinical impact. Studies have suggested that AI-assisted procedures achieve higher accuracy and better functional outcomes compared to conventional methods, while reducing operative times and complications. However, these technologies are designed to augment rather than replace clinical expertise, serving as sophisticated tools to enhance surgeons' capabilities and improve patient care.

Advanced feature fusion of radiomics and deep learning for accurate detection of wrist fractures on X-ray images.

Saadh MJ, Hussain QM, Albadr RJ, Doshi H, Rekha MM, Kundlas M, Pal A, Rizaev J, Taher WM, Alwan M, Jawad MJ, Al-Nuaimi AMA, Farhood B

PubMed · May 20, 2025
The aim of this study was to develop a hybrid diagnostic framework integrating radiomic and deep features for accurate and reproducible detection and classification of wrist fractures using X-ray images. A total of 3,537 X-ray images, including 1,871 fracture and 1,666 non-fracture cases, were collected from three healthcare centers. Radiomic features were extracted using the PyRadiomics library, and deep features were derived from the bottleneck layer of an autoencoder. Both feature modalities underwent reliability assessment via Intraclass Correlation Coefficient (ICC) and cosine similarity. Feature selection methods, including ANOVA, Mutual Information (MI), Principal Component Analysis (PCA), and Recursive Feature Elimination (RFE), were applied to optimize the feature set. Classifiers such as XGBoost, CatBoost, Random Forest, and a Voting Classifier were used to evaluate diagnostic performance. The dataset was divided into training (70%) and testing (30%) sets, and metrics such as accuracy, sensitivity, and AUC-ROC were used for evaluation. The combined radiomic and deep feature approach consistently outperformed standalone methods. The Voting Classifier paired with MI achieved the highest performance, with a test accuracy of 95%, sensitivity of 94%, and AUC-ROC of 96%. The end-to-end model achieved competitive results with an accuracy of 93% and AUC-ROC of 94%. SHAP analysis and t-SNE visualizations confirmed the interpretability and robustness of the selected features. This hybrid framework demonstrates the potential for integrating radiomic and deep features to enhance diagnostic performance for wrist and forearm fractures, providing a reliable and interpretable solution suitable for clinical applications.
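The abstract above reports screening both feature modalities for reliability via ICC and cosine similarity before fusion. As a minimal sketch (not the authors' code; the threshold of 0.9 and the feature-major data layout are illustrative assumptions), a cosine-similarity reliability screen over repeated feature extractions could look like this:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def reliable_features(run1, run2, threshold=0.9):
    """Keep indices of features that agree across two extraction runs.

    run1, run2: feature-major lists -- run1[j] holds feature j's values
    over all samples from the first extraction (e.g., before and after
    an image perturbation). Features whose vectors diverge are dropped.
    """
    return [j for j, (f1, f2) in enumerate(zip(run1, run2))
            if cosine_similarity(f1, f2) >= threshold]
```

In the study's pipeline, only features passing such a screen (plus an ICC check) would be forwarded to ANOVA/MI/PCA/RFE selection and the downstream classifiers.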

Development and validation of ultrasound-based radiomics deep learning model to identify bone erosion in rheumatoid arthritis.

Yan L, Xu J, Ye X, Lin M, Gong Y, Fang Y, Chen S

PubMed · May 19, 2025
To develop and validate a deep learning radiomics fusion model (DLR) based on ultrasound (US) images to identify bone erosion in rheumatoid arthritis (RA) patients. A total of 432 patients with RA at two institutions were enrolled. Three hundred twelve patients from center 1 were randomly divided into a training set (N = 218) and an internal test set (N = 94) in a 7:3 ratio; meanwhile, 124 patients from center 2 served as an external test set. Radiomics (Rad) and deep learning (DL) features were extracted based on hand-crafted radiomics and deep transfer learning networks. Least absolute shrinkage and selection operator regression was employed to establish the DLR fusion feature set from the Rad and DL features. Subsequently, 10 machine learning algorithms were used to construct models, and the final optimal model was selected. The performance of the models was evaluated using receiver operating characteristic (ROC) and decision curve analysis (DCA). The diagnostic efficacy of sonographers was compared with and without the assistance of the optimal model. Logistic regression (LR) was chosen as the optimal algorithm for model construction on account of its superior performance (Rad/DL/DLR: area under the curve [AUC] = 0.906/0.974/0.979) in the training set. In the internal test set, DLR_LR as the final model had the highest AUC (AUC = 0.966), which was also validated in the external test set (AUC = 0.932). With the aid of the DLR_LR model, the overall performance of both junior and senior sonographers improved significantly (P < 0.05), and there was no significant difference between the junior sonographer with DLR_LR model assistance and the senior sonographer without assistance (P > 0.05). The DLR model based on US images is the best performer and is expected to become an important tool for identifying bone erosion in RA patients. Key Points • DLR model based on US images is the best performer in identifying BE in RA patients. • DLR model may assist the sonographers to improve the accuracy of BE evaluations.
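The AUC values that drive model selection here (and in several of the other abstracts on this page) can be computed without any curve plotting via the Mann–Whitney U statistic. A minimal sketch, not tied to any one study's code (tie handling is simplified for brevity):

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 0/1 ground truth; scores: model outputs, higher = more positive.
    Average ranks for tied scores are omitted for simplicity.
    """
    pairs = sorted(zip(scores, labels))          # ascending by score
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # 1-based ranks of the positive cases in the sorted order
    rank_sum = sum(i + 1 for i, (_, y) in enumerate(pairs) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

An AUC of 0.966, as reported for DLR_LR in the internal test set, means a randomly chosen erosion-positive joint outscores a randomly chosen negative one 96.6% of the time.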

An overview of artificial intelligence and machine learning in shoulder surgery.

Cho SH, Kim YS

PubMed · May 19, 2025
Machine learning (ML), a subset of artificial intelligence (AI), utilizes advanced algorithms to learn patterns from data, enabling accurate predictions and decision-making without explicit programming. In orthopedic surgery, ML is transforming clinical practice, particularly in shoulder arthroplasty and rotator cuff tears (RCTs) management. This review explores the fundamental paradigms of ML, including supervised, unsupervised, and reinforcement learning, alongside key algorithms such as XGBoost, neural networks, and generative adversarial networks. In shoulder arthroplasty, ML accurately predicts postoperative outcomes, complications, and implant selection, facilitating personalized surgical planning and cost optimization. Predictive models, including ensemble learning methods, achieve over 90% accuracy in forecasting complications, while neural networks enhance surgical precision through AI-assisted navigation. In RCTs treatment, ML enhances diagnostic accuracy using deep learning models on magnetic resonance imaging and ultrasound, achieving area under the curve values exceeding 0.90. ML models also predict tear reparability with 85% accuracy and postoperative functional outcomes, including range of motion and patient-reported outcomes. Despite remarkable advancements, challenges such as data variability, model interpretability, and integration into clinical workflows persist. Future directions involve federated learning for robust model generalization and explainable AI to enhance transparency. ML continues to revolutionize orthopedic care by providing data-driven, personalized treatment strategies and optimizing surgical outcomes.

Improving Deep Learning-Based Grading of Partial-thickness Supraspinatus Tendon Tears with Guided Diffusion Augmentation.

Ni M, Jiesisibieke D, Zhao Y, Wang Q, Gao L, Tian C, Yuan H

PubMed · May 19, 2025
To develop and validate a deep learning system with guided diffusion-based data augmentation for grading partial-thickness supraspinatus tendon (SST) tears and to compare its performance with experienced radiologists, including external validation. This retrospective study included 1150 patients with arthroscopically confirmed SST tears, divided into a training set (741 patients), validation set (185 patients), and internal test set (185 patients). An independent external test set of 224 patients was used for generalizability assessment. To address data imbalance, MRI images were augmented using a guided diffusion model. A ResNet-34 model was employed for Ellman grading of bursal-sided and articular-sided partial-thickness tears across different MRI sequences (oblique coronal [OCOR], oblique sagittal [OSAG], and combined OCOR+OSAG). Performance was evaluated using AUC and precision-recall curves, and compared to three experienced musculoskeletal (MSK) radiologists. The DeLong test was used to compare performance across different sequence combinations. A total of 26,020 OCOR images and 26,356 OSAG images were generated using the guided diffusion model. For bursal-sided partial-thickness tears in the internal dataset, the model achieved AUCs of 0.99, 0.98, and 0.97 for OCOR, OSAG, and combined sequences, respectively, while for articular-sided tears, AUCs were 0.99, 0.99, and 0.99. The DeLong test showed no significant differences among sequence combinations (P=0.17, 0.14, 0.07). In the external dataset, the combined-sequence model achieved AUCs of 0.99, 0.97, and 0.97 for bursal-sided tears and 0.99, 0.95, and 0.95 for articular-sided tears. Radiologists demonstrated an ICC of 0.99, but their grading performance was significantly lower than the ResNet-34 model (P<0.001). The deep learning system improved grading consistency and significantly reduced evaluation time, while guided diffusion augmentation enhanced model robustness. 
The proposed deep learning system provides a reliable and efficient method for grading partial-thickness SST tears, achieving radiologist-level accuracy with greater consistency and faster evaluation speed.
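The study above uses guided diffusion specifically to address class imbalance across Ellman grades. The diffusion model itself is beyond a short sketch, but the balancing bookkeeping it serves is simple; here is an illustrative helper (names and the match-the-largest-class heuristic are assumptions, not the paper's protocol):

```python
def augmentation_targets(class_counts):
    """How many synthetic images to generate per class so every class
    reaches the size of the largest one (a simple balancing heuristic).

    class_counts: dict mapping class label -> number of real images.
    Returns dict mapping class label -> number of images to synthesize.
    """
    target = max(class_counts.values())
    return {label: target - n for label, n in class_counts.items()}
```

A guided diffusion model would then be sampled the requested number of times per under-represented grade before training the ResNet-34 classifier.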

Prediction of cervical spondylotic myelopathy from a plain radiograph using deep learning with convolutional neural networks.

Tachi H, Kokabu T, Suzuki H, Ishikawa Y, Yabu A, Yanagihashi Y, Hyakumachi T, Shimizu T, Endo T, Ohnishi T, Ukeba D, Sudo H, Yamada K, Iwasaki N

PubMed · May 17, 2025
This study aimed to develop deep learning algorithms (DLAs) utilising convolutional neural networks (CNNs) to classify cervical spondylotic myelopathy (CSM) and cervical spondylotic radiculopathy (CSR) from plain cervical spine radiographs. Data from 300 patients (150 with CSM and 150 with CSR) were used for internal validation (IV) using a five-fold cross-validation strategy. Additionally, 100 patients (50 with CSM and 50 with CSR) were included in the external validation (EV). Two DLAs were trained using CNNs on plain radiographs from C3-C6 for the binary classification of CSM and CSR, and for the prediction of the spinal canal area rate using magnetic resonance imaging. Model performance was evaluated on external data using metrics such as area under the curve (AUC), accuracy, and likelihood ratios. For the binary classification, the AUC ranged from 0.84 to 0.96, with accuracy between 78% and 95% during IV. In the EV, the AUC and accuracy were 0.96 and 90%, respectively. For the spinal canal area rate, correlation coefficients during five-fold cross-validation ranged from 0.57 to 0.64, with a mean correlation of 0.61 observed in the EV. DLAs developed with CNNs demonstrated promising accuracy for classifying CSM and CSR from plain radiographs. These algorithms have the potential to assist non-specialists in identifying patients who require further evaluation or referral to spine specialists, thereby reducing delays in the diagnosis and treatment of CSM.
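Among the metrics listed above, likelihood ratios are the least commonly reported for deep learning classifiers, but they follow directly from sensitivity and specificity. A minimal sketch of the standard definitions (the example numbers are illustrative, not the study's):

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios for a binary test.

    LR+ = sensitivity / (1 - specificity): how much a positive result
          raises the odds of disease.
    LR- = (1 - sensitivity) / specificity: how much a negative result
          lowers them.
    """
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg
```

A classifier with, say, 90% sensitivity and 50% specificity yields LR+ = 1.8 and LR- = 0.2, so its negative calls are far more informative than its positive ones; reporting both values alongside AUC makes that asymmetry visible.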

AI in motion: the impact of data augmentation strategies on mitigating MRI motion artifacts.

Westfechtel SD, Kußmann K, Aßmann C, Huppertz MS, Siepmann RM, Lemainque T, Winter VR, Barabasch A, Kuhl CK, Truhn D, Nebelung S

PubMed · May 17, 2025
Artifacts in clinical MRI can compromise the performance of AI models. This study evaluates how different data augmentation strategies affect an AI model's segmentation performance under variable artifact severity. We used an AI model based on the nnU-Net architecture to automatically quantify lower limb alignment using axial T2-weighted MR images. Three versions of the AI model were trained with different augmentation strategies: (1) no augmentation ("baseline"), (2) standard nnU-Net augmentations ("default"), and (3) "default" plus augmentations that emulate MR artifacts ("MRI-specific"). Model performance was tested on 600 MR image stacks (right and left; hip, knee, and ankle) from 20 healthy participants (mean age, 23 ± 3 years, 17 men), each imaged five times under standardized motion to induce artifacts. Two radiologists graded each stack's artifact severity as none, mild, moderate, or severe, and manually measured torsional angles. Segmentation quality was assessed using the Dice similarity coefficient (DSC), while torsional angles were compared between manual and automatic measurements using mean absolute deviation (MAD), intraclass correlation coefficient (ICC), and Pearson's correlation coefficient (r). Statistical analysis included parametric tests and a linear mixed-effects model. MRI-specific augmentation resulted in slightly (yet not significantly) better performance than the default strategy. Segmentation quality decreased with increasing artifact severity, which was partially mitigated by default and MRI-specific augmentations (e.g., severe artifacts, proximal femur: DSC<sub>baseline</sub> = 0.58 ± 0.22; DSC<sub>default</sub> = 0.72 ± 0.22; DSC<sub>MRI-specific</sub> = 0.79 ± 0.14 [p < 0.001]).
These augmentations also maintained precise torsional angle measurements (e.g., severe artifacts, femoral torsion: MAD<sub>baseline</sub> = 20.6 ± 23.5°; MAD<sub>default</sub> = 7.0 ± 13.0°; MAD<sub>MRI-specific</sub> = 5.7 ± 9.5° [p < 0.001]; ICC<sub>baseline</sub> = -0.10 [p = 0.63; 95% CI: -0.61 to 0.47]; ICC<sub>default</sub> = 0.38 [p = 0.08; -0.17 to 0.76]; ICC<sub>MRI-specific</sub> = 0.86 [p < 0.001; 0.62 to 0.95]; r<sub>baseline</sub> = 0.58 [p < 0.001; 0.44 to 0.69]; r<sub>default</sub> = 0.68 [p < 0.001; 0.56 to 0.77]; r<sub>MRI-specific</sub> = 0.86 [p < 0.001; 0.81 to 0.9]). Motion artifacts negatively impact AI models, but general-purpose augmentations enhance robustness effectively. MRI-specific augmentations offer minimal additional benefit. Question Motion artifacts negatively impact the performance of diagnostic AI models for MRI, but mitigation methods remain largely unexplored. Findings Domain-specific augmentation during training can improve the robustness and performance of a model for quantifying lower limb alignment in the presence of severe artifacts. Clinical relevance Excellent robustness and accuracy are crucial for deploying diagnostic AI models in clinical practice. Including domain knowledge in model training can benefit clinical adoption.
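Two of the agreement metrics reported above, MAD and Pearson's r, are short enough to sketch directly. This is an illustrative implementation of the textbook definitions, not the study's analysis code:

```python
import math

def mad(manual, auto):
    """Mean absolute deviation between paired angle measurements (degrees)."""
    return sum(abs(m - a) for m, a in zip(manual, auto)) / len(manual)

def pearson_r(x, y):
    """Pearson's correlation coefficient between two paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Note the two metrics answer different questions: MAD measures the typical size of the manual-vs-automatic error in degrees, while r only measures how linearly related the two sets of measurements are, which is why the study reports both (plus ICC for absolute agreement).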

High-Performance Prompting for LLM Extraction of Compression Fracture Findings from Radiology Reports.

Kanani MM, Monawer A, Brown L, King WE, Miller ZD, Venugopal N, Heagerty PJ, Jarvik JG, Cohen T, Cross NM

PubMed · May 16, 2025
Extracting information from radiology reports can provide critical data to empower many radiology workflows. For spinal compression fractures, these data can facilitate evidence-based care for at-risk populations. Manual extraction from free-text reports is laborious and error-prone. Large language models (LLMs) have shown promise; however, fine-tuning strategies to optimize performance in specific tasks can be resource intensive. A variety of prompting strategies have achieved similar results with fewer demands. Our study pioneers the use of Meta's Llama 3.1, together with prompt-based strategies, for automated extraction of compression fractures from free-text radiology reports, outputting structured data without model training. We tested performance on a time-based sample of CT exams covering the spine from 2/20/2024 to 2/22/2024 acquired across our healthcare enterprise (637 anonymized reports, age 18-102, 47% female). Ground truth annotations were manually generated and compared against the performance of three models (Llama 3.1 70B, Llama 3.1 8B, and Vicuna 13B) with nine different prompting configurations for a total of 27 model/prompt experiments. The highest F1 score (0.91) was achieved by the 70B Llama 3.1 model when provided with a radiologist-written background, with similar results when the background was written by a separate LLM (0.86). The addition of few-shot examples to these prompts had variable impact on F1 measurements (0.89, 0.84 respectively). Comparable ROC-AUC and PR-AUC performance was observed. Our work demonstrated that an open-weights LLM excelled at extracting compression fracture findings from free-text radiology reports using prompt-based techniques without requiring extensive manually labeled examples for model training.
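The prompting configurations described above combine a task background with optional few-shot examples ahead of the report to be analyzed. As a minimal sketch of that assembly step (the section labels and JSON output convention are illustrative assumptions, not the study's actual templates):

```python
def build_prompt(background, examples, report):
    """Assemble a structured-extraction prompt for an LLM.

    background: task description (e.g., radiologist-written context).
    examples:   list of (report_text, expected_json) few-shot pairs;
                pass [] for a zero-shot prompt.
    report:     the free-text report to extract findings from.
    """
    parts = [background.strip()]
    for example_report, expected_json in examples:
        parts.append(f"Report:\n{example_report}\nFindings (JSON):\n{expected_json}")
    # The final section is left open for the model to complete.
    parts.append(f"Report:\n{report}\nFindings (JSON):")
    return "\n\n".join(parts)
```

Varying the `background` source (radiologist vs. LLM-written) and the length of `examples` reproduces the kind of 3×3 prompt grid that, crossed with three models, yields the 27 experiments reported.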

Automated CT segmentation for lower extremity tissues in lymphedema evaluation using deep learning.

Na S, Choi SJ, Ko Y, Urooj B, Huh J, Cha S, Jung C, Cheon H, Jeon JY, Kim KW

PubMed · May 16, 2025
Clinical assessment of lymphedema, particularly for lymphedema severity and fluid-fibrotic lesions, remains challenging with traditional methods. We aimed to develop and validate a deep learning segmentation tool for automated tissue component analysis in lower extremity CT scans. For the development datasets, lower extremity CT venography scans were collected in 118 patients with gynecologic cancers for algorithm training. Reference standards were created by segmentation of fat, muscle, and fluid-fibrotic tissue components using 3D Slicer. A deep learning model based on the Unet++ architecture with an EfficientNet-B7 encoder was developed and trained. Segmentation accuracy of the deep learning model was validated in an internal validation set (n = 10) and an external validation set (n = 10) using the Dice similarity coefficient (DSC) and volumetric similarity (VS). A graphical user interface (GUI) tool was developed for the visualization of the segmentation results. Our deep learning algorithm achieved high segmentation accuracy. Mean DSCs for each component and all components ranged from 0.945 to 0.999 in the internal validation set and 0.946 to 0.999 in the external validation set. Similar performance was observed in the VS, with mean VSs for all components ranging from 0.97 to 0.999. In volumetric analysis, mean volumes of the entire leg and each component did not differ significantly between reference standard and deep learning measurements (p > 0.05). Our GUI displays lymphedema mapping, highlighting segmented fat, muscle, and fluid-fibrotic components in the entire leg. Our deep learning algorithm provides an automated segmentation tool enabling accurate segmentation, volume measurement of tissue components, and lymphedema mapping. Question Clinical assessment of lymphedema remains challenging, particularly for tissue segmentation and quantitative severity evaluation.
Findings A deep learning algorithm achieved DSCs > 0.95 and VS > 0.97 for fat, muscle, and fluid-fibrotic components in internal and external validation datasets. Clinical relevance The developed deep learning tool accurately segments and quantifies lower extremity tissue components on CT scans, enabling automated lymphedema evaluation and mapping with high segmentation accuracy.
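The two validation metrics used throughout this abstract, DSC and VS, are easy to state precisely. A minimal sketch over voxel-index sets (an illustrative formulation; production pipelines operate on label arrays, but the definitions are identical):

```python
def dice(a, b):
    """Dice similarity coefficient: 2|A∩B| / (|A| + |B|).

    a, b: iterables of voxel indices labeled as a tissue component by
    the reference standard and the model, respectively.
    """
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b))

def volumetric_similarity(a, b):
    """VS = 1 - ||A| - |B|| / (|A| + |B|): agreement of volumes only,
    ignoring where the voxels actually are."""
    a, b = set(a), set(b)
    return 1 - abs(len(a) - len(b)) / (len(a) + len(b))
```

The distinction matters for the volumetric analysis reported above: two masks can have perfect VS (equal volumes) while overlapping poorly, so a high DSC is the stronger claim and VS is the one that underwrites the volume comparisons.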
