
Role of Large Language Models for Suggesting Nerve Involvement in Upper Limbs MRI Reports with Muscle Denervation Signs.

Martín-Noguerol T, López-Úbeda P, Luna A, Gómez-Río M, Górriz JM

PubMed · Jun 5 2025
Determining the involvement of specific peripheral nerves (PNs) in the upper limb associated with signs of muscle denervation can be challenging. This study aims to develop, compare, and validate various large language models (LLMs) to automatically identify and establish potential relationships between denervated muscles and their corresponding PNs. We collected 300 retrospective MRI reports in Spanish from upper limb examinations conducted between 2018 and 2024 that showed signs of muscle denervation. An expert radiologist manually annotated these reports based on the affected peripheral nerves (median, ulnar, radial, axillary, and suprascapular). BERT, DistilBERT, mBART, RoBERTa, and Medical-ELECTRA models were fine-tuned and evaluated on the reports. Additionally, an automatic voting system was implemented to consolidate predictions through majority voting. The voting system achieved the highest F1 scores for the median, ulnar, and radial nerves, with scores of 0.88, 1.00, and 0.90, respectively. Medical-ELECTRA also performed well, achieving F1 scores above 0.82 for the axillary and suprascapular nerves. In contrast, mBART demonstrated lower performance, particularly with an F1 score of 0.38 for the median nerve. Our voting system generally outperforms the individually tested LLMs in determining the specific PN likely associated with muscle denervation patterns detected in upper limb MRI reports. This system can thereby assist radiologists by suggesting the implicated PN when generating their radiology reports.
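
The consolidation step described above is, at its core, per-nerve majority voting over the binary predictions of the individual fine-tuned models. Below is a minimal sketch of such a voting scheme; the model names, label set, and prediction format are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

# Hypothetical per-model predictions: for each report, each model outputs the
# set of peripheral nerves it considers involved (illustrative data only).
NERVES = ["median", "ulnar", "radial", "axillary", "suprascapular"]

model_predictions = {
    "bert":            [{"median"}, {"ulnar", "radial"}],
    "distilbert":      [{"median"}, {"radial"}],
    "medical_electra": [{"median", "axillary"}, {"radial"}],
}

def majority_vote(predictions_by_model, n_reports, labels):
    """Keep a label for a report if more than half of the models predicted it."""
    n_models = len(predictions_by_model)
    consolidated = []
    for i in range(n_reports):
        votes = Counter()
        for preds in predictions_by_model.values():
            votes.update(preds[i])
        consolidated.append({lab for lab in labels if votes[lab] > n_models / 2})
    return consolidated

print(majority_vote(model_predictions, n_reports=2, labels=NERVES))
# -> [{'median'}, {'radial'}]
```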

Ensemble of weak spectral total-variation learners: a PET-CT case study.

Rosenberg A, Kennedy J, Keidar Z, Zeevi YY, Gilboa G

PubMed · Jun 5 2025
When solving computer vision problems with machine learning, one often encounters a lack of sufficient training data. To mitigate this, we propose the use of ensembles of weak learners based on spectral total-variation (STV) features (Gilboa G. 2014 A total variation spectral framework for scale and texture analysis. SIAM J. Imaging Sci. 7, 1937-1961. (doi:10.1137/130930704)). The features are related to nonlinear eigenfunctions of the total-variation subgradient and can characterize textures well at various scales. It was shown (Burger M, Gilboa G, Moeller M, Eckardt L, Cremers D. 2016 Spectral decompositions using one-homogeneous functionals. SIAM J. Imaging Sci. 9, 1374-1408. (doi:10.1137/15m1054687)) that, in the one-dimensional case, orthogonal features are generated, whereas in two dimensions the features are empirically lowly correlated. Ensemble learning theory advocates the use of lowly correlated weak learners. We therefore propose designing ensembles from learners based on STV features. To show the effectiveness of this paradigm, we examine a hard real-world medical imaging problem: the predictive value of computed tomography (CT) data for high uptake in positron emission tomography (PET) in patients suspected of skeletal metastases. The database consists of 457 scans with 1524 unique pairs of registered CT and PET slices. Our approach is compared with deep-learning methods and with radiomics features, showing that STV learners perform best (AUC=[Formula: see text]), compared with neural nets (AUC=[Formula: see text]) and radiomics (AUC=[Formula: see text]). We observe that fine STV scales in CT images are especially indicative of high uptake in PET. This article is part of the theme issue 'Partial differential equations in data science'.
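
As a rough illustration of the ensembling idea (not the authors' pipeline), the sketch below trains a bagged ensemble of shallow decision-tree learners on precomputed feature vectors and reports AUC. Computing the STV features themselves requires a spectral total-variation decomposition of the CT slices and is assumed to have happened upstream; synthetic features stand in here. It assumes scikit-learn ≥ 1.2, where the ensemble argument is named `estimator`.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Placeholder for precomputed spectral total-variation (STV) feature vectors:
# one row per CT slice, one column per STV scale/band (synthetic data here).
rng = np.random.default_rng(0)
X = rng.normal(size=(1524, 40))              # 1524 slice pairs, 40 STV features
y = (X[:, :5].sum(axis=1) > 0).astype(int)   # synthetic "high PET uptake" label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Ensemble of weak (depth-limited) learners, in the spirit of the lowly
# correlated weak learners advocated by ensemble learning theory.
ensemble = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_depth=2),
    n_estimators=200,
    max_features=0.5,     # feature subsampling further decorrelates the learners
    random_state=0,
)
ensemble.fit(X_tr, y_tr)

auc = roc_auc_score(y_te, ensemble.predict_proba(X_te)[:, 1])
print(f"AUC = {auc:.3f}")
```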

UltraBones100k: A reliable automated labeling method and large-scale dataset for ultrasound-based bone surface extraction.

Wu L, Cavalcanti NA, Seibold M, Loggia G, Reissner L, Hein J, Beeler S, Viehöfer A, Wirth S, Calvet L, Fürnstahl P

PubMed · Jun 4 2025
Ultrasound-based bone surface segmentation is crucial in computer-assisted orthopedic surgery. However, ultrasound images have limitations, including a low signal-to-noise ratio, acoustic shadowing, and speckle noise, which make interpretation difficult. Existing deep learning models for bone segmentation rely primarily on costly manual labeling by experts, limiting dataset size and model generalizability. Additionally, the complexity of ultrasound physics and acoustic shadowing makes the images difficult for humans to interpret, leading to incomplete labels in low-intensity and anechoic regions and limiting model performance. To advance the state of the art in ultrasound bone segmentation and establish effective model benchmarks, larger and higher-quality datasets are needed. We propose a methodology for collecting ex-vivo ultrasound datasets with automatically generated bone labels, including anechoic regions. The proposed labels are derived by accurately superimposing tracked bone Computed Tomography (CT) models onto the tracked ultrasound images. These initial labels are then refined to account for ultrasound physics. To clinically evaluate the proposed method, an expert physician from our university hospital specializing in orthopedic sonography assessed the quality of the generated bone labels. A neural network for bone segmentation was trained on the collected dataset, and its predictions were compared to expert manual labels, evaluating accuracy, completeness, and F1 score. We collected UltraBones100k, the largest known dataset comprising 100k ex-vivo ultrasound images of human lower limbs with bone annotations, specifically targeting the fibula, tibia, and foot bones. A Wilcoxon signed-rank test with Bonferroni correction confirmed that the bone alignment after our optimization pipeline significantly improved the quality of bone labeling (p<0.001). The model trained on UltraBones100k consistently outperforms manual labeling in all metrics, particularly in low-intensity regions (at a distance threshold of 0.5 mm: 320% improvement in completeness, 27.4% improvement in accuracy, and 197% improvement in F1 score). CONCLUSION: This work promises to facilitate research and clinical translation of ultrasound imaging in computer-assisted interventions, particularly for applications such as 2D bone segmentation, 3D bone surface reconstruction, and multi-modality bone registration.
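
The completeness, accuracy, and F1 figures at a fixed distance threshold can be read as point-to-surface matching: a predicted surface point counts as correct if it lies within the threshold of the label, and a label point counts as recovered if a prediction lies within the threshold of it. The sketch below illustrates that idea with nearest-neighbour distances on synthetic 2D points; it is a simplified stand-in for the paper's evaluation, not the released code.

```python
import numpy as np
from scipy.spatial import cKDTree

def surface_metrics(pred_pts, label_pts, threshold_mm=0.5):
    """Precision ('accuracy'), recall ('completeness'), and F1 of predicted
    bone-surface points against labelled points at a distance threshold."""
    d_pred_to_label, _ = cKDTree(label_pts).query(pred_pts)   # for precision
    d_label_to_pred, _ = cKDTree(pred_pts).query(label_pts)   # for recall
    precision = np.mean(d_pred_to_label <= threshold_mm)
    recall = np.mean(d_label_to_pred <= threshold_mm)
    f1 = 2 * precision * recall / (precision + recall + 1e-12)
    return precision, recall, f1

# Tiny synthetic example: predicted points jittered around a labelled line.
label = np.stack([np.linspace(0, 50, 200), np.zeros(200)], axis=1)
pred = label + np.random.default_rng(1).normal(scale=0.2, size=label.shape)
print(surface_metrics(pred, label))
```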

A first-of-its-kind two-body statistical shape model of the arthropathic shoulder: enhancing biomechanics and surgical planning.

Blackman J, Giles JW

PubMed · Jun 3 2025
Statistical Shape Models are machine learning tools in computational orthopedics that enable the study of anatomical variability and the creation of synthetic models for pathogenetic analysis and surgical planning. Current models of the glenohumeral joint either describe individual bones or are limited to non-pathologic datasets, failing to capture coupled shape variation in arthropathic anatomy. We aimed to develop a novel combined scapula-proximal-humerus model applicable to clinical populations. Preoperative computed tomography scans from 45 Reverse Total Shoulder Arthroplasty patients were used to generate three-dimensional models of the scapula and proximal humerus. Correspondence point clouds were combined into a two-body shape model using Principal Component Analysis. Individual scapula-only and proximal-humerus-only shape models were also created for comparison. The models were validated using compactness, specificity, generalization ability, and leave-one-out cross-validation. The modes of variation for each model were also compared. The combined model was described using eigenvector decomposition into single body models. The models were further compared in their ability to predict the shape of one body when given the shape of its counterpart, and the generation of diverse realistic synthetic pairs de novo. The scapula and proximal-humerus models performed comparably to previous studies with median average leave-one-out cross-validation errors of 1.08 mm (IQR: 0.359 mm), and 0.521 mm (IQR: 0.111 mm); the combined model was similar with median error of 1.13 mm (IQR: 0.239 mm). The combined model described coupled variations between the shapes equalling 43.2% of their individual variabilities, including the relationship between glenoid and humeral head erosions. The combined model outperformed the individual models generatively with reduced missing shape prediction bias (> 10%) and uniformly diverse shape plausibility (uniformity p-value < .001 vs. .59). This study developed the first two-body scapulohumeral shape model that captures coupled variations in arthropathic shoulder anatomy and the first proximal-humeral statistical model constructed using a clinical dataset. While single-body models are effective for descriptive tasks, combined models excel in generating joint-level anatomy. This model can be used to augment computational analyses of synthetic populations investigating shoulder biomechanics and surgical planning.
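
Conceptually, a two-body statistical shape model is a principal component analysis over the concatenated correspondence points of both bones, from which new plausible shape pairs can be generated by sampling mode weights. The sketch below shows only that construction on synthetic point clouds with scikit-learn's PCA; correspondence establishment, alignment, and the validation metrics reported in the study are omitted.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_subjects, n_scap_pts, n_hum_pts = 45, 1000, 800

# Synthetic correspondence point clouds (subjects x points x 3), already aligned.
scapulae = rng.normal(size=(n_subjects, n_scap_pts, 3))
humeri = rng.normal(size=(n_subjects, n_hum_pts, 3))

# Concatenate both bones per subject and flatten to one shape vector each.
shapes = np.concatenate([scapulae.reshape(n_subjects, -1),
                         humeri.reshape(n_subjects, -1)], axis=1)

pca = PCA(n_components=0.95)          # keep modes explaining 95% of variance
weights = pca.fit_transform(shapes)   # per-subject mode weights

# Generate a synthetic scapula-humerus pair by sampling mode weights.
sampled = rng.normal(scale=np.sqrt(pca.explained_variance_))
new_shape = pca.mean_ + pca.components_.T @ sampled
new_scapula = new_shape[: n_scap_pts * 3].reshape(n_scap_pts, 3)
new_humerus = new_shape[n_scap_pts * 3:].reshape(n_hum_pts, 3)
print(new_scapula.shape, new_humerus.shape)
```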

Automated Classification of Cervical Spinal Stenosis using Deep Learning on CT Scans.

Zhang YL, Huang JW, Li KY, Li HL, Lin XX, Ye HB, Chen YH, Tian NF

PubMed · Jun 3 2025
Retrospective study. To develop and validate a computed tomography-based deep learning (DL) model for diagnosing cervical spinal stenosis (CSS). Although magnetic resonance imaging (MRI) is widely used for diagnosing CSS, its inherent limitations, including prolonged scanning time, limited availability in resource-constrained settings, and contraindications for patients with metallic implants, make computed tomography (CT) a critical alternative in specific clinical scenarios. The development of CT-based DL models for CSS detection holds promise for transcending the diagnostic limitations of conventional CT imaging, thereby serving as an intelligent auxiliary tool to optimize healthcare resource allocation. Paired CT/MRI images were collected. CT images were divided into training, validation, and test sets in an 8:1:1 ratio. The two-stage model architecture comprised (1) a Faster R-CNN-based detection model for localization, annotation, and extraction of regions of interest (ROI), and (2) a comparison of 16 convolutional neural network (CNN) models for stenosis classification to select the best-performing model. The evaluation metrics included accuracy, F1-score, and Cohen's κ coefficient, with comparisons made against diagnostic results from physicians with varying years of experience. In the multiclass classification task, four high-performing models (DL1-b0, DL2-121, DL3-101, and DL4-26d) achieved accuracies of 88.74%, 89.40%, 89.40%, and 88.08%, respectively. All models demonstrated >80% consistency with senior physicians and >70% consistency with junior physicians. In the binary classification task, the models achieved accuracies of 94.70%, 96.03%, 96.03%, and 94.70%, respectively. All four models demonstrated consistency rates slightly below 90% with junior physicians. However, when compared with senior physicians, three models (excluding DL4-26d) exhibited consistency rates exceeding 90%. The DL model developed in this study demonstrated high accuracy in CT image analysis of CSS, with diagnostic performance comparable to that of senior physicians.
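
The two-stage design, ROI detection followed by CNN classification, can be outlined with off-the-shelf torchvision components. The snippet below is an illustrative skeleton with an assumed three-grade output head and untrained classification weights; it is not the authors' model, and the detector weights are the generic COCO-pretrained ones that torchvision downloads on first use.

```python
import torch
import torchvision
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Stage 1: Faster R-CNN detector to localize the region of interest (ROI).
detector = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

# Stage 2: a CNN classifier for stenosis grading on the cropped ROI
# (ResNet-18 here with an assumed 3-grade output head).
classifier = torchvision.models.resnet18(weights=None)
classifier.fc = torch.nn.Linear(classifier.fc.in_features, 3)
classifier.eval()

def classify_ct_slice(image: torch.Tensor) -> int:
    """image: CxHxW tensor in [0, 1]. Returns a predicted stenosis grade."""
    with torch.no_grad():
        det = detector([image])[0]
        if len(det["boxes"]) == 0:
            return 0  # assumed convention: no ROI found -> 'no stenosis'
        # Crop the highest-scoring box and resize it for the classifier.
        x1, y1, x2, y2 = det["boxes"][det["scores"].argmax()].int().tolist()
        roi = image[:, y1:y2, x1:x2]
        roi = torch.nn.functional.interpolate(roi.unsqueeze(0), size=(224, 224))
        return classifier(roi).argmax(dim=1).item()

print(classify_ct_slice(torch.rand(3, 512, 512)))
```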

Artificial intelligence vs human expertise: A comparison of plantar fascia thickness measurements through MRI imaging.

Alyanak B, Çakar İ, Dede BT, Yıldızgören MT, Bağcıer F

PubMed · Jun 3 2025
This study aims to evaluate the reliability of plantar fascia thickness measurements performed by ChatGPT-4 using magnetic resonance imaging (MRI) compared to those obtained by an experienced clinician. In this retrospective, single-center study, foot MRI images from the hospital archive were analysed. Plantar fascia thickness was measured under both blinded and non-blinded conditions by an experienced clinician and ChatGPT-4 at two separate time points. Measurement reliability was assessed using the intraclass correlation coefficient (ICC), mean absolute error (MAE), and mean relative error (MRE). A total of 41 participants (32 females, 9 males) were included. The average plantar fascia thickness measured by the clinician was 4.20 ± 0.80 mm and 4.25 ± 0.92 mm under blinded and non-blinded conditions, respectively, while ChatGPT-4's measurements were 6.47 ± 1.30 mm and 6.46 ± 1.31 mm, respectively. Human evaluators demonstrated excellent agreement (ICC = 0.983-0.989), whereas ChatGPT-4 exhibited low reliability (ICC = 0.391-0.432). In thin plantar fascia cases, ChatGPT-4's error rate was higher, with an MAE of 2.70 mm and MRE of 77.17% under blinded conditions, and an MAE of 2.91 mm and MRE of 87.02% under non-blinded conditions. ChatGPT-4 demonstrated lower reliability in plantar fascia thickness measurements compared to an experienced clinician, with increased error rates in thin structures. These findings highlight the limitations of AI-based models in medical image analysis and emphasize the need for further refinement before clinical implementation.
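
The agreement statistics used here (MAE, MRE, and the intraclass correlation coefficient) are straightforward to compute from paired measurements. The sketch below uses made-up thickness values and implements the two-way random-effects, absolute-agreement, single-measure form ICC(2,1), which may differ from the exact ICC variant used in the study.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.
    ratings: (n_subjects, n_raters) array."""
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand_mean = ratings.mean()
    row_means = ratings.mean(axis=1)
    col_means = ratings.mean(axis=0)
    ss_rows = k * np.sum((row_means - grand_mean) ** 2)
    ss_cols = n * np.sum((col_means - grand_mean) ** 2)
    ss_err = np.sum((ratings - grand_mean) ** 2) - ss_rows - ss_cols
    msr, msc = ss_rows / (n - 1), ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Made-up paired plantar fascia thickness measurements (mm): clinician vs. model.
clinician = np.array([3.8, 4.1, 4.5, 5.0, 3.6, 4.9])
model     = np.array([6.1, 6.4, 6.3, 7.0, 6.0, 6.8])

mae = np.mean(np.abs(model - clinician))
mre = np.mean(np.abs(model - clinician) / clinician) * 100
icc = icc_2_1(np.column_stack([clinician, model]))
print(f"MAE = {mae:.2f} mm, MRE = {mre:.1f}%, ICC(2,1) = {icc:.3f}")
```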

Artificial intelligence in bone metastasis analysis: Current advancements, opportunities and challenges.

Afnouch M, Bougourzi F, Gaddour O, Dornaika F, Ahmed AT

PubMed · Jun 3 2025
Artificial intelligence is transforming medical imaging, particularly in the analysis of bone metastases (BM), a serious complication of advanced cancers. Machine learning and deep learning techniques offer new opportunities to improve the detection, recognition, and segmentation of bone metastases. Yet, challenges such as limited data, interpretability, and clinical validation remain. Following PRISMA guidelines, we reviewed artificial intelligence methods and applications for bone metastasis analysis across major imaging modalities, including CT, MRI, PET, SPECT, and bone scintigraphy. The survey covers traditional machine learning models and modern deep learning architectures such as CNNs and transformers. We also examined available datasets and their role in developing artificial intelligence in this field. Artificial intelligence models have achieved strong performance across tasks and modalities, with Convolutional Neural Network (CNN) and Transformer architectures performing particularly well. However, limitations persist, including data imbalance, overfitting risks, and the need for greater transparency. Clinical translation is also challenged by regulatory and validation hurdles. Artificial intelligence holds strong potential to improve BM diagnosis and streamline radiology workflows. To reach clinical maturity, future work must address data diversity, model explainability, and large-scale validation, which are critical steps toward trusted integration into routine oncology care.

Prediction of hip fracture by high-resolution peripheral quantitative computed tomography in older Swedish women.

Jaiswal R, Pivodic A, Zoulakis M, Axelsson KF, Litsne H, Johansson L, Lorentzon M

PubMed · Jun 3 2025
The socioeconomic burden of hip fractures, the most severe osteoporotic fracture outcome, is increasing, and current clinical risk assessment lacks sensitivity. This study aimed to develop a method for improved prediction of hip fracture by incorporating measurements of bone microstructure and composition derived from high-resolution peripheral quantitative computed tomography (HR-pQCT). In a prospective cohort study of 3028 community-dwelling women aged 75-80, all participants answered questionnaires and underwent baseline examinations of anthropometrics and bone by DXA and HR-pQCT. Medical records, a regional x-ray archive, and registers were used to identify incident fractures and death. Prediction models for hip fracture, major osteoporotic fracture (MOF), and any fracture were developed using Cox proportional hazards regression and machine learning algorithms (neural network, random forest, ensemble, and Extreme Gradient Boosting). In the 2856 (94.3%) women with complete HR-pQCT data at 2 tibia sites (distal and ultra-distal), the median follow-up period was 8.0 yr, and 217 hip fractures, 746 MOFs, and 1008 incident fractures of any type occurred. In Cox regression models adjusted for age, BMI, clinical risk factors (CRFs), and FN BMD, the strongest predictors of hip fracture were tibia total volumetric BMD and cortical thickness. The performance of the Cox regression-based prediction models for hip fracture was significantly improved by HR-pQCT (time-dependent area under the receiver operating characteristic curve [AUC] at 5 yr of follow-up, 0.75 [0.64-0.85]), compared to a reference model including CRFs and FN BMD (AUC = 0.71 [0.58-0.81], p < .001) and a Fracture Risk Assessment Tool risk score model (AUC = 0.70 [0.60-0.80], p < .001). The Cox regression model for hip fracture had a significantly higher accuracy than the neural network-based model, the best-performing machine learning algorithm, at clinically relevant sensitivity levels. We conclude that the addition of HR-pQCT parameters improves the prediction of hip fractures in a cohort of older Swedish women.
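
The modelling approach can be approximated with open-source survival tools: fit a Cox proportional hazards model on clinical risk factors plus HR-pQCT parameters and score it with a time-dependent AUC at 5 years of follow-up. The sketch below uses scikit-survival with synthetic data and assumed column names; it is not the study's analysis code.

```python
import numpy as np
import pandas as pd
from sksurv.linear_model import CoxPHSurvivalAnalysis
from sksurv.metrics import cumulative_dynamic_auc
from sksurv.util import Surv

rng = np.random.default_rng(0)
n = 500

# Synthetic covariates: clinical risk factors, FN BMD, and HR-pQCT parameters.
X = pd.DataFrame({
    "age": rng.uniform(75, 80, n),
    "bmi": rng.normal(26, 4, n),
    "fn_bmd": rng.normal(0.7, 0.1, n),
    "tibia_total_vbmd": rng.normal(270, 45, n),
    "cortical_thickness": rng.normal(1.1, 0.25, n),
})

# Synthetic hip-fracture outcomes over an 8-year follow-up.
event = rng.random(n) < 0.08                              # ~8% sustain a fracture
time = np.where(event, rng.uniform(0.5, 8.0, n), 8.0)     # event or censoring time (yr)
y = Surv.from_arrays(event=event, time=time)

train, test = np.arange(n) < 400, np.arange(n) >= 400
cox = CoxPHSurvivalAnalysis().fit(X[train], y[train])

# Time-dependent AUC of the Cox risk score at 5 years of follow-up.
risk = cox.predict(X[test])
auc, _ = cumulative_dynamic_auc(y[train], y[test], risk, times=[5.0])
print(f"time-dependent AUC at 5 yr: {auc[0]:.2f}")
```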

Development and validation of machine learning models for distal instrumentation-related problems in patients with degenerative lumbar scoliosis based on preoperative CT and MRI.

Feng Z, Yang H, Li Z, Zhang X, Hai Y

PubMed · Jun 3 2025
This investigation proposes a machine learning framework leveraging preoperative MRI and CT imaging data to predict postoperative distal instrumentation-related problems (DIP) in degenerative lumbar scoliosis patients undergoing long-segment fusion procedures. We retrospectively analyzed 136 patients, categorized according to whether DIP developed. Preoperative MRI and CT scans provided muscle function and bone density data, including the relative gross cross-sectional area and relative functional cross-sectional area of the multifidus, erector spinae, paraspinal extensor, and psoas major muscles; the gross muscle fat index and functional muscle fat index; and Hounsfield unit values of the lumbosacral region and the lower instrumented vertebra. Predictive factors for DIP were selected through stepwise LASSO regression. Both the selected factors and the full factor set were incorporated into six machine learning algorithms, namely k-nearest neighbors, decision tree, support vector machine, random forest, multilayer perceptron (MLP), and Naïve Bayes, with tenfold cross-validation. Among patients, 16.9% developed DIP, with the functional cross-sectional area of the multifidus and the Hounsfield unit value of the lumbosacral region as significant predictors. The MLP model exhibited superior performance when all predictive factors were input, with an average AUC of 0.98 and a recall of 0.90. We compared various machine learning algorithms and constructed, trained, and validated predictive models based on muscle function and bone density variables obtained from preoperative CT and MRI, which could identify patients at high risk of DIP after long-segment spinal fusion surgery.
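
The pipeline described, LASSO-based predictor selection followed by classifiers evaluated with tenfold cross-validation, can be sketched with scikit-learn. The data, feature count, and outcome below are synthetic placeholders, and only the MLP (one of the six algorithms compared) is shown.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_patients, n_features = 136, 10   # muscle-function and bone-density variables

X = rng.normal(size=(n_patients, n_features))
# Synthetic outcome loosely tied to the first two features (illustrative only).
y = (X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.5, size=n_patients) > 1.0).astype(int)

# LASSO selects predictors (e.g. multifidus rFCSA and lumbosacral HU in the study),
# then an MLP classifier is evaluated with stratified tenfold cross-validation.
pipe = make_pipeline(
    StandardScaler(),
    SelectFromModel(LassoCV(cv=5, random_state=0)),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_validate(pipe, X, y, cv=cv, scoring=["roc_auc", "recall"])
print("AUC:", scores["test_roc_auc"].mean(), "recall:", scores["test_recall"].mean())
```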

Evaluating the Diagnostic Accuracy of ChatGPT-4.0 for Classifying Multimodal Musculoskeletal Masses: A Comparative Study with Human Raters.

Bosbach WA, Schoeni L, Beisbart C, Senge JF, Mitrakovic M, Anderson SE, Achangwa NR, Divjak E, Ivanac G, Grieser T, Weber MA, Maurer MH, Sanal HT, Daneshvar K

PubMed · Jun 3 2025
Novel artificial intelligence tools have the potential to significantly enhance productivity in medicine, while also maintaining or even improving treatment quality. In this study, we aimed to evaluate the current capability of ChatGPT-4.0 to accurately interpret multimodal musculoskeletal tumor cases. We created 25 cases, each containing images from X-ray, computed tomography, magnetic resonance imaging, or scintigraphy. ChatGPT-4.0 was tasked with classifying each case using a six-option, two-choice question, where both a primary and a secondary diagnosis were allowed. For performance evaluation, human raters also assessed the same cases. When only the primary diagnosis was taken into account, the accuracy of human raters was greater than that of ChatGPT-4.0 by a factor of nearly 2 (87% vs. 44%). However, in a setting that also considered secondary diagnoses, the performance gap shrank substantially (accuracy: 94% vs. 71%). Power analysis relying on Cohen's w confirmed the adequacy of the sample size (n = 25). The tested artificial intelligence tool demonstrated lower performance than human raters. Considering factors such as speed, constant availability, and potential future improvements, it appears plausible that artificial intelligence tools could serve as valuable assistance systems for doctors in future clinical settings.
· ChatGPT-4.0 classifies musculoskeletal cases using multimodal imaging inputs.
· Human raters outperform AI in primary diagnosis accuracy by a factor of nearly two.
· Including secondary diagnoses improves AI performance and narrows the gap.
· AI demonstrates potential as an assistive tool in future radiological workflows.
· Power analysis confirms robustness of study findings with the current sample size.
· Bosbach WA, Schoeni L, Beisbart C et al. Evaluating the Diagnostic Accuracy of ChatGPT-4.0 for Classifying Multimodal Musculoskeletal Masses: A Comparative Study with Human Raters. Rofo 2025; DOI 10.1055/a-2594-7085.
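
The power analysis mentioned can be reproduced in outline: compute Cohen's w from observed versus expected proportions and evaluate the achieved power of a chi-square goodness-of-fit test at n = 25. The proportions below are illustrative assumptions, not the study's actual figures.

```python
import numpy as np
from statsmodels.stats.power import GofChisquarePower

# Illustrative observed vs. chance-level expected proportions for a
# six-option, two-choice classification task (not the study's data).
p_observed = np.array([0.87, 0.13])    # e.g. correct vs. incorrect primary diagnosis
p_expected = np.array([1 / 3, 2 / 3])  # chance of covering 1 of 6 options with 2 picks

# Cohen's w effect size for a goodness-of-fit comparison.
w = np.sqrt(np.sum((p_observed - p_expected) ** 2 / p_expected))

power = GofChisquarePower().power(effect_size=w, nobs=25, alpha=0.05, n_bins=2)
print(f"Cohen's w = {w:.2f}, achieved power at n = 25: {power:.2f}")
```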