Latest Papers on Radiology AI. Tags: Musculoskeletal, Order: Best Match, Limit: 10.

Advances and challenges in AI-assisted MRI for lumbar disc degeneration detection and classification.

Zhao P, Zhu S

•papers•Jul 25 2025

Intervertebral disc degeneration (IDD) is a major contributor to chronic low back pain. Magnetic resonance imaging (MRI) serves as the gold standard for IDD assessment, yet manual grading is often subjective and inconsistent. With advances in artificial intelligence (AI), particularly deep learning, automated detection and classification of IDD from MRI has become increasingly feasible. This narrative review aims to provide a comprehensive overview of AI applications, especially machine learning and deep learning techniques, for MRI-based detection and grading of lumbar disc degeneration, highlighting their clinical value, current limitations, and future directions. Relevant studies were reviewed and summarized based on thematic structure. The review covers classical methods (e.g., support vector machines), deep learning models (e.g., CNNs, SpineNet, ResNet, U-Net), and hybrid approaches incorporating transformers and multitask learning. Technical details, model architectures, performance metrics, and representative datasets were synthesized and discussed. AI systems have demonstrated promising performance in automatic IDD grading, in some cases matching or surpassing expert radiologists. CNN-based models showed high accuracy and reproducibility, while hybrid models further enhanced segmentation and classification tasks. However, challenges remain in generalizability, data imbalance, interpretability, and regulatory integration. Tools such as Grad-CAM and SHAP improve model transparency, while methods like few-shot learning and data augmentation can alleviate data limitations. AI-assisted analysis of MRI for lumbar disc degeneration offers significant potential to enhance diagnostic efficiency and consistency. While current models are encouraging, real-world clinical implementation requires further advancements in interpretability, data diversity, ethical standards, and large-scale validation.

MRI Classification Musculoskeletal Review In Silico

Could a New Method of Acromiohumeral Distance Measurement Emerge? Artificial Intelligence vs. Physician.

Dede BT, Çakar İ, Oğuz M, Alyanak B, Bağcıer F

•papers•Jul 25 2025

The aim of this study was to evaluate the reliability of ChatGPT-4 measurement of acromiohumeral distance (AHD), a popular assessment in patients with shoulder pain. In this retrospective study, 71 registered shoulder magnetic resonance imaging (MRI) scans were included. AHD measurements were performed on a coronal oblique T1 sequence with a clear view of the acromion and humerus. Measurements were performed by an experienced radiologist twice at 3-day intervals and by ChatGPT-4 twice at 3-day intervals in different sessions. The first, second, and mean values of AHD measured by the physician were 7.6 ± 1.7, 7.5 ± 1.6, and 7.6 ± 1.7, respectively. The first, second, and mean values measured by ChatGPT-4 were 6.7 ± 0.8, 7.3 ± 1.1, and 7.1 ± 0.8, respectively. There was a significant difference between the physician and ChatGPT-4 for the first and mean measurements (p < 0.0001 and p = 0.009, respectively). However, there was no significant difference between the second measurements (p = 0.220). Intrarater reliability for the physician was excellent (ICC = 0.99); intrarater reliability for ChatGPT-4 was poor (ICC = 0.41). Interrater reliability was poor (ICC = 0.45). In conclusion, this study demonstrated that the reliability of ChatGPT-4 in AHD measurements is inferior to that of an experienced radiologist. This study may help improve the possible future contribution of large language models to medical science.

MRI Detection Musculoskeletal Retrospective Clinical In Silico Academic Lab GenAI
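The intraclass correlation coefficients quoted in the acromiohumeral distance abstract above can be illustrated with a small sketch. The abstract does not state which ICC form was used, so the code below assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement); the function name `icc_2_1` and the example values are hypothetical and are not data from the study.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` holds one row per subject and one column per rater (or session).
    The choice of ICC(2,1) is an assumption; the abstract does not name the form.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects and 2 raters, with no missing values")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical paired measurements (column 0 = reader A, column 1 = reader B).
example = [[7.6, 6.7], [8.1, 7.0], [6.2, 6.9], [9.0, 7.2], [5.9, 6.4]]
print(round(icc_2_1(example), 3))
```

An ICC near 1 means most of the variance comes from real differences between subjects; values around 0.4 to 0.5, as reported for ChatGPT-4, mean that rater or session disagreement accounts for a large share of it.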

MSA-Net: a multi-scale and adversarial learning network for segmenting bone metastases in low-resolution SPECT imaging.

Wu Y, Lin Q, He Y, Zeng X, Cao Y, Man Z, Liu C, Hao Y, Cai Z, Ji J, Huang X

•papers•Jul 24 2025

Single-photon emission computed tomography (SPECT) plays a crucial role in detecting bone metastases from lung cancer. However, its low spatial resolution and lesion similarity to benign structures present significant challenges for accurate segmentation, especially for lesions of varying sizes. We propose a deep learning-based segmentation framework that integrates conditional adversarial learning with a multi-scale feature extraction generator. The generator employs cascade dilated convolutions, multi-scale modules, and deep supervision, while the discriminator utilizes multi-scale L1 loss computed on image-mask pairs to guide segmentation learning. The proposed model was evaluated on a dataset of 286 clinically annotated SPECT scintigrams. It achieved a Dice Similarity Coefficient (DSC) of 0.6671, precision of 0.7228, and recall of 0.6196, outperforming both classical and recent adversarial segmentation models in multi-scale lesion detection, especially for small and clustered lesions. Our results demonstrate that the integration of multi-scale feature learning with adversarial supervision significantly improves the segmentation of bone metastasis in SPECT imaging. This approach shows potential for clinical decision support in the management of lung cancer.

SPECT Segmentation Musculoskeletal Retrospective Clinical In Silico
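For readers less familiar with the overlap metrics reported for MSA-Net above, the following is a minimal sketch of how Dice, precision, and recall are computed from a predicted and a reference binary mask. The function name `overlap_metrics` and the toy masks are hypothetical. When all three are computed from the same pooled pixel counts, Dice equals the harmonic mean of precision and recall; the harmonic mean of the reported 0.7228 and 0.6196 is about 0.667, close to the reported DSC, although the abstract does not say how the metrics were aggregated.

```python
def overlap_metrics(pred, truth):
    """Return (dice, precision, recall) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # lesion predicted and present
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # lesion predicted, absent
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # lesion missed
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return dice, precision, recall


# Hypothetical flattened masks (1 = lesion pixel); not data from the study.
truth = [0, 1, 1, 1, 1, 0, 0, 0]
pred = [0, 1, 1, 0, 0, 1, 0, 0]
dice, precision, recall = overlap_metrics(pred, truth)

# On pooled counts, Dice is the harmonic mean of precision and recall.
harmonic = 2 * precision * recall / (precision + recall)
print(dice, precision, recall, harmonic)
```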

Deep learning reconstruction of zero echo time magnetic resonance imaging: diagnostic performance in axial spondyloarthritis.

Yi J, Hahn S, Lee HJ, Lee S, Park S, Lee J, de Arcos J, Fung M

•papers•Jul 24 2025

To compare the diagnostic performance of deep learning reconstruction (DLR) of zero echo time (ZTE) MRI for structural lesions in patients with axial spondyloarthritis, against T1WI and ZTE MRI without DLR, using CT as the reference standard. From February 2021 to December 2022, 26 patients (52 sacroiliac joints (SIJ) and 104 quadrants) underwent SIJ MRIs. Three readers assessed overall image quality and structural conspicuity, scoring SIJs for structural lesions on T1WI, ZTE, and ZTE DLR 50%, 75%, and 100%, respectively. Diagnostic performance was evaluated using CT as the reference standard, and inter-reader agreement was assessed using weighted kappa. ZTE DLR 100% showed the highest image quality scores for readers 1 and 2, and the best structural conspicuity scores for all three readers. In readers 2 and 3, ZTE DLR 75% showed the best diagnostic performance for bone sclerosis, outperforming T1WI and ZTE (all p < 0.05). In all readers, ZTE DLR 100% showed superior diagnostic performance for bone erosion compared to T1WI and ZTE (all p < 0.01). For bone sclerosis, ZTE DLR 50% showed the highest kappa coefficients between readers 1 and 2 and between readers 1 and 3. For bone erosion, ZTE DLR 100% showed the highest kappa coefficients between readers. ZTE MRI with DLR outperformed T1WI and ZTE MRI without DLR in diagnosing bone sclerosis and erosion of the SIJ, while offering similar subjective image quality and structural conspicuity. Question: With zero echo time (ZTE) alone, small structural lesions, such as bone sclerosis and erosion, are challenging to confirm in axial spondyloarthritis. Findings: ZTE deep learning reconstruction (DLR) showed higher diagnostic performance for detecting bone sclerosis and erosion, compared with T1WI and ZTE. Clinical relevance: Applying DLR to ZTE enhances diagnostic capability for detecting bone sclerosis and erosion in the sacroiliac joint, aiding in the early diagnosis of axial spondyloarthritis.

MRI Reconstruction Musculoskeletal Retrospective Clinical In Silico Academic Lab

Analyzing pediatric forearm X-rays for fracture analysis using machine learning.

Lam V, Parida A, Dance S, Tabaie S, Cleary K, Anwar SM

•papers•Jul 24 2025

Forearm fractures constitute a significant proportion of emergency department presentations in the pediatric population. The treatment goal is to restore length and alignment between the distal and proximal bone fragments. While immobilization through splinting or casting is enough for non-displaced and minimally displaced fractures, moderately or severely displaced fractures often require reduction for realignment. However, appropriate treatment in current practice is challenging because the resources required for specialized pediatric care are often lacking, leading to delayed and unnecessary transfers between medical centers, which can create treatment complications and burdens. The purpose of this study is to build a machine learning model for analyzing forearm fractures to assist clinical centers that lack surgical expertise in pediatric orthopedics. X-ray scans from 1250 children were curated, preprocessed, and manually annotated at our clinical center. Several machine learning models were fine-tuned using a pretraining strategy leveraging a self-supervised learning model with a vision transformer backbone. We further employed strategies to identify the most important region related to fractures within the forearm X-ray. The model performance was evaluated with and without region of interest (ROI) detection to find an optimal model for forearm fracture analyses. Our proposed strategy leverages self-supervised pretraining (without labels) followed by supervised fine-tuning (with labels). The fine-tuned model using regions cropped with ROI identification resulted in the highest classification performance with a true-positive rate (TPR) of 0.79, true-negative rate (TNR) of 0.74, AUROC of 0.81, and AUPR of 0.86 when evaluated on the testing data. The results showed the feasibility of using machine learning models in predicting the appropriate treatment for forearm fractures in pediatric cases.
With further improvement, the algorithm could potentially be used as a tool to assist non-specialized orthopedic providers in diagnosing and providing treatment.

X-Ray Classification Musculoskeletal Methodology In Silico Academic Lab

Kissing Spine and Other Imaging Predictors of Postoperative Cement Displacement Following Percutaneous Kyphoplasty: A Machine Learning Approach.

Zhao Y, Bo L, Qian L, Chen X, Wang Y, Cui L, Xin Y, Liu L

•papers•Jul 23 2025

To investigate the risk factors associated with postoperative cement displacement following percutaneous kyphoplasty (PKP) in patients with osteoporotic vertebral compression fractures (OVCF) and to develop predictive models for clinical risk assessment. This retrospective study included 198 patients with OVCF who underwent PKP. Imaging and clinical variables were collected. Multiple machine learning models, including logistic regression, L1- and L2-regularized logistic regression, support vector machine (SVM), decision tree, gradient boosting, and random forest, were developed to predict cement displacement. L1- and L2-regularized logistic regression models identified four key risk factors: kissing spine (L1: 1.11; L2: 0.91), incomplete anterior cortex (L1: -1.60; L2: -1.62), low vertebral body CT value (L1: -2.38; L2: -1.71), and large Cobb change (L1: 0.89; L2: 0.87). The support vector machine (SVM) model achieved the best performance (accuracy: 0.983, precision: 0.875, recall: 1.000, F1-score: 0.933, specificity: 0.981, AUC: 0.997). Other models, including logistic regression, decision tree, gradient boosting, and random forest, also showed high performance but were slightly inferior to SVM. Key predictors of cement displacement were identified, and machine learning models were developed for risk assessment. These findings can assist clinicians in identifying high-risk patients, optimizing treatment strategies, and improving patient outcomes.

CT Classification Musculoskeletal Retrospective Clinical In Silico Academic Lab
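The SVM metrics in the kyphoplasty abstract above are all functions of one confusion matrix, which makes it possible to check whether they are mutually consistent. The sketch below is an inference, not something the abstract reports: the counts (7 true positives, 1 false positive, 52 true negatives, 0 false negatives) are assumed only because they reproduce the rounded values, and multiples of these counts would match equally well. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Assumed counts: chosen because they reproduce the rounded values in the
# abstract. The abstract itself does not report counts or the test-set size.
metrics = classification_metrics(tp=7, fp=1, tn=52, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

If the evaluation set was in fact this small, each reported metric rests on only a handful of positive cases, which is worth keeping in mind when comparing the models.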

Deep learning algorithm for the automatic assessment of axial vertebral rotation in patients with scoliosis using the Nash-Moe method.

Kim JK, Wang MX, Park D, Chang MC

•papers•Jul 22 2025

Accurate assessment of axial vertebral rotation (AVR) is essential for managing idiopathic scoliosis. The Nash-Moe classification method has been extensively used for AVR assessment; however, its subjective nature can lead to measurement variability. Therefore, herein, we propose an automated deep learning (DL) model for AVR assessment based on posteroanterior spinal radiographs. We developed a two-stage DL framework using the MMRotate toolbox and analyzed 1080 posteroanterior spinal radiographs of patients aged 4-18 years. The framework comprises a vertebra detection model (864 training and 216 validation images) and a pedicle detection model (14,608 training and 3652 validation images). We improved the Nash-Moe classification method by implementing a 12-segment division system and width ratio metric for precise pedicle assessment. The vertebra and pedicle detection models achieved mean average precision values of 0.909 and 0.905, respectively. The overall classification accuracy was 0.74, with grade-specific performance between 0.70 and 1.00 for precision and 0.33 and 0.93 for recall across Grades 0-3. The proposed DL framework processed complete posteroanterior radiographs in < 5 s per case compared with conventional manual measurements (114 s per radiograph). The best performance was observed in mild to moderate rotation cases, with performance in severe rotation cases limited by insufficient data. The implementation of the DL framework for the automated Nash-Moe classification method exhibited satisfactory accuracy and exceptional efficiency. However, this study is limited by low recall (0.33) for Grade 3 and the inability to classify Grade 4 owing to dataset constraints. Further validation using augmented datasets that include severe rotation cases is necessary.

X-Ray Detection Musculoskeletal Methodology In Silico

The safety and accuracy of radiation-free spinal navigation using a short, scoliosis-specific BoneMRI-protocol, compared to CT.

Lafranca PPG, Rommelspacher Y, Walter SG, Muijs SPJ, van der Velden TA, Shcherbakova YM, Castelein RM, Ito K, Seevinck PR, Schlösser TPC

•papers•Jul 21 2025

Spinal navigation systems require pre- and/or intra-operative 3-D imaging, which exposes young patients to harmful radiation. We assessed a scoliosis-specific MRI-protocol that provides T2-weighted MRI and AI-generated synthetic-CT (sCT) scans, through deep learning algorithms. This study aims to compare MRI-based synthetic-CT spinal navigation to CT for safety and accuracy of pedicle screw planning and placement at thoracic and lumbar levels. Spines of 5 cadavers were scanned with thin-slice CT and the scoliosis-specific MRI-protocol (to create sCT). Preoperatively, screw trajectories were planned on both CT and sCT. Subsequently, four spine surgeons performed surface-matched, navigated placement of 2.5 mm k-wires in all pedicles from T3 to L5. Randomization for CT/sCT, surgeon and side was performed (1:1 ratio). On postoperative CT-scans, virtual screws were simulated over k-wires. Maximum angulation, distance between planned and postoperative screw positions and medial breach rate (Gertzbein-Robbins classification) were assessed. 140 k-wires were inserted, 3 were excluded. There were no pedicle breaches > 2 mm. Of sCT-guided screws, 59 were grade A and 10 grade B. For the CT-guided screws, 47 were grade A and 21 grade B (p = 0.022). Average distance (± SD) between intraoperative and postoperative screw positions was 2.3 ± 1.5 mm in sCT-guided screws, and 2.4 ± 1.8 mm for CT (p = 0.78), average maximum angulation (± SD) was 3.8 ± 2.5° for sCT and 3.9 ± 2.9° for CT (p = 0.75). MRI-based, AI-generated synthetic-CT spinal navigation allows for safe and accurate planning and placement of thoracic and lumbar pedicle screws in a cadaveric model, without significant differences in distance and angulation between planned and postoperative screw positions compared to CT.

Mixed Modality Image Synthesis Musculoskeletal Retrospective Clinical Phantom/Animal Academic Lab Reproducibility

Trueness of artificial intelligence-based, manual, and global thresholding segmentation protocols for human mandibles.

Hernandez AKT, Dutra V, Chu TG, Yang CC, Lin WS

•papers•Jul 21 2025

To compare the trueness of artificial intelligence (AI)-based, manual, and global segmentation protocols by superimposing the resulting segmented 3D models onto reference gold standard surface scan models. Twelve dry human mandibles were used. A cone beam computed tomography (CBCT) scanner was used to scan the mandibles, and the acquired digital imaging and communications in medicine (DICOM) files were segmented using three protocols: global thresholding, manual, and AI-based segmentation (Diagnocat; Diagnocat, San Francisco, CA). The segmented files were exported as study 3D models. A structured light surface scanner (GoSCAN Spark; Creaform 3D, Levis, Canada) was used to scan all mandibles, and the resulting reference 3D models were exported. The study 3D models were compared with the respective reference 3D models by using a mesh comparison software (Geomagic Design X; 3D Systems Inc, Rock Hill, SC). Root mean square (RMS) error values were recorded to measure the magnitude of deviation (trueness), and color maps were obtained to visualize the differences. Comparisons of the trueness of three segmentation methods for differences in RMS were made using repeated measures analysis of variance (ANOVA). A two-sided 5% significance level was used for all tests in the software program. AI-based segmentations had significantly higher RMS values than manual segmentations for the entire mandible (p < 0.001), alveolar process (p < 0.001), and body of the mandible (p < 0.001). AI-based segmentations had significantly lower RMS values than manual segmentations for the condyles (p = 0.018) and ramus (p = 0.013). No significant differences were found between the AI-based and manual segmentations for the coronoid process (p = 0.275), symphysis (p = 0.346), and angle of the mandible (p = 0.344). 
Global thresholding had significantly higher RMS values than manual segmentations for the alveolus (p < 0.001), angle of the mandible (p < 0.001), body of the mandible (p < 0.001), condyles (p < 0.001), coronoid (p = 0.002), entire mandible (p < 0.001), ramus (p < 0.001), and symphysis (p < 0.001). Global thresholding had significantly higher RMS values than AI-based segmentation for the alveolar process (p = 0.002), angle of the mandible (p < 0.001), body of the mandible (p < 0.001), condyles (p < 0.001), coronoid (p = 0.017), mandible (p < 0.001), ramus (p < 0.001), and symphysis (p < 0.001). AI-based segmentations produced lower RMS values, indicating truer 3D models, compared to global thresholding, and showed no significant differences in some areas compared to manual segmentation. Thus, AI-based segmentation offers a level of segmentation trueness acceptable for use as an alternative to manual or global thresholding segmentation protocols.

CT Segmentation Musculoskeletal Retrospective Clinical In Silico Startup

Automated extraction of vertebral bone mineral density from imaging with various scan parameters: a cadaver study with correlation to quantitative computed tomography.

Ramschütz C, Kloth C, Vogele D, Baum T, Rühling S, Beer M, Jansen JU, Schlager B, Wilke HJ, Kirschke JS, Sollmann N

•papers•Jul 21 2025

To investigate lumbar vertebral volumetric bone mineral density (vBMD) from ex vivo opportunistic multi-detector computed tomography (MDCT) scans using different protocols, and compare it to dedicated quantitative CT (QCT) values from the same specimens. Cadavers from two female donors (ages 62 and 68 years) were scanned (L1-L5) using six different MDCT protocols and one dedicated QCT scan. Opportunistic vBMD was extracted using an artificial intelligence-based algorithm. The vBMD measurements from the six MDCT protocols, which varied in peak tube voltage (80-140 kVp), tube load (72-200 mAs), slice thickness (0.75-1 mm), and/or slice increment (0.5-0.75 mm), were compared to those obtained from dedicated QCT. A strong positive correlation was observed between vBMD from opportunistic MDCT and reference QCT (ρ = 0.869, p < 0.01). Agreement between vBMD measurements from MDCT protocols and the QCT reference standard according to the intraclass correlation coefficient (ICC) was 0.992 (95% confidence interval [CI]: 0.982-0.998). Bland-Altman analysis showed biases ranging from -12.66 to 8.00 mg/cm³ across the six MDCT protocols, with all data points falling within the respective limits of agreement (LOA) for both cadavers. Opportunistic vBMD measurements of lumbar vertebrae demonstrated reliable consistency ex vivo across various scan parameters when compared to dedicated QCT.

CT Segmentation Musculoskeletal Retrospective Clinical In Silico Academic Lab
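The Bland-Altman biases and limits of agreement mentioned in the vertebral bone mineral density abstract above follow a simple recipe: take the paired differences, use their mean as the bias, and place the limits at the bias plus or minus 1.96 standard deviations. The sketch below assumes that conventional 1.96 multiplier and the sample standard deviation; the function name `bland_altman` and the vBMD values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(test_values, reference_values):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    Bias is the mean of (test - reference); the limits of agreement are
    bias +/- 1.96 * sample SD of the differences (conventional assumption).
    """
    if len(test_values) != len(reference_values) or len(test_values) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [t - r for t, r in zip(test_values, reference_values)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical vBMD values in mg/cm^3 for five vertebrae (L1-L5).
mdct = [101.0, 99.0, 103.0, 101.0, 96.0]
qct = [100.0, 100.0, 100.0, 100.0, 100.0]
bias, lower, upper = bland_altman(mdct, qct)
print(f"bias={bias:.2f}, LOA=({lower:.2f}, {upper:.2f})")
```

A negative bias means the test protocol reads lower than the reference on average, so a range such as -12.66 to 8.00 mg/cm³ indicates that some protocols underestimated and others overestimated relative to QCT.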
