Page 11 of 33328 results

Leveraging Large Language Models for Accurate AO Fracture Classification from CT Text Reports.

Mergen M, Spitzl D, Ketzer C, Strenzke M, Marka AW, Makowski MR, Bressem KK, Adams LC, Gassert FT

PubMed | papers | Jul 7 2025
Large language models (LLMs) have shown promising potential in analyzing complex textual data, including radiological reports. These models can assist clinicians, particularly those with limited experience, by integrating and presenting diagnostic criteria within radiological classifications. However, before clinical adoption, LLMs must be rigorously validated by medical professionals to ensure accuracy, especially in the context of advanced radiological classification systems. This study evaluates the performance of four LLMs (ChatGPT-4o, AmbossGPT, Claude 3.5 Sonnet, and Gemini 2.0 Flash) in classifying fractures based on the AO classification system using CT reports. A dataset of 292 fictitious physician-generated CT reports, representing 310 fractures, was used to assess the accuracy of each LLM in AO fracture classification retrospectively. Performance was evaluated by comparing the models' classifications to ground truth labels, with accuracy rates analyzed across different fracture types and subtypes. ChatGPT-4o and AmbossGPT achieved the highest overall accuracy (74.6% and 74.3%, respectively), outperforming Claude 3.5 Sonnet (69.5%) and Gemini 2.0 Flash (62.7%). Statistically significant differences were observed in fracture type classification, particularly between ChatGPT-4o and Gemini 2.0 Flash (Δ12%, p < 0.001). While all models demonstrated strong bone recognition rates (90-99%), their accuracy in fracture subtype classification remained lower (71-77%), indicating limitations in nuanced diagnostic categorization. LLMs show potential in assisting radiologists with initial fracture classification, particularly in high-volume or resource-limited settings. However, their performance remains inconsistent for detailed subtype classification, highlighting the need for further refinement and validation before clinical integration in advanced diagnostic workflows.

Artificial Intelligence-Assisted Standard Plane Detection in Hip Ultrasound for Developmental Dysplasia of the Hip: A Novel Real-Time Deep Learning Approach.

Darilmaz MF, Demirel M, Altun HO, Adiyaman MC, Bilgili F, Durmaz H, Sağlam Y

PubMed | papers | Jul 6 2025
Developmental dysplasia of the hip (DDH) includes a range of conditions caused by inadequate hip joint development. Early diagnosis is essential to prevent long-term complications. Ultrasound, particularly the Graf method, is commonly used for DDH screening, but its interpretation is highly operator-dependent and lacks standardization, especially in identifying the correct standard plane. This variability often leads to misdiagnosis, particularly among less experienced users. This study presents AI-SPS, an AI-based instant standard plane detection software for real-time hip ultrasound analysis. Using 2,737 annotated frames, including 1,737 standard and 1,000 non-standard examples extracted from 45 clinical ultrasound videos, we trained and evaluated two object detection models: SSD-MobileNet V2 and YOLOv11n. The software was further validated on an independent set of 934 additional frames (347 standard and 587 non-standard) from the same video sources. YOLOv11n achieved an accuracy of 86.3%, precision of 0.78, recall of 0.88, and F1-score of 0.83, outperforming SSD-MobileNet V2, which reached an accuracy of 75.2%. These results indicate that AI-SPS can detect the standard plane with expert-level performance and improve consistency in DDH screening. By reducing operator variability, the software supports more reliable ultrasound assessments. Integration with live systems and Graf typing may enable a fully automated DDH diagnostic workflow. Level of Evidence: Level III, diagnostic study.
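As a quick check on the metrics above, the reported F1-score can be reproduced from the reported precision and recall, since F1 is their harmonic mean. The following is a minimal Python sketch, not the authors' code; the function name `f1_score` is an assumption made for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for YOLOv11n in the abstract above.
f1 = f1_score(0.78, 0.88)
print(round(f1, 2))  # 0.83
```

The result matches the F1-score of 0.83 stated in the abstract.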

Unveiling knee morphology with SHAP: shaping personalized medicine through explainable AI.

Cansiz B, Arslan S, Gültekin MZ, Serbes G

PubMed | papers | Jul 5 2025
This study aims to enhance personalized medical assessments and the early detection of knee-related pathologies by examining the relationship between knee morphology and demographic factors such as age, gender, and body mass index. Additionally, gender-specific reference values for knee morphological features will be determined using explainable artificial intelligence (XAI). A retrospective analysis was conducted on the MRI data of 500 healthy knees aged 20-40 years. The study included various knee morphological features such as Distal Femoral Width (DFW), Lateral Femoral Condylar Width (LFCW), Intercondylar Femoral Width (IFW), Anterior Cruciate Ligament Width (ACLW), and Anterior Cruciate Ligament Length (ACLL). Machine learning models, including Decision Trees, Random Forests, Light Gradient Boosting, Multilayer Perceptron, and Support Vector Machines, were employed to predict gender based on these features. The SHapley Additive exPlanation (SHAP) method was used to analyze feature importance. The learning models demonstrated high classification performance, with 83.2% (±5.15) for classification of clusters based on morphological features and 88.06% (±4.8) for gender classification. These results validated the strong correlation between knee morphology and gender. The study found that DFW is the most significant feature for gender prediction, with values below the 78-79 mm range indicating females and values above this range indicating males. LFCW, IFW, ACLW, and ACLL also showed significant gender-based differences. The findings establish gender-specific reference values for knee morphological features, highlighting the impact of gender on knee morphology. These reference values can improve the accuracy of diagnoses and treatment plans tailored to each gender, enhancing personalized medical care.

Improving prediction of fragility fractures in postmenopausal women using random forest.

Mateo J, Usategui-Martín R, Torres AM, Campillo-Sánchez F, de Temiño ÁR, Gil J, Martín-Millán M, Hernandez JL, Pérez-Castrillón JL

PubMed | papers | Jul 5 2025
Osteoporosis is a chronic disease characterized by a progressive decline in bone density and quality, leading to increased bone fragility and a higher susceptibility to fractures, even in response to minimal trauma. Osteoporotic fractures represent a major source of morbidity and mortality among postmenopausal women. This condition poses both clinical and societal challenges, as its consequences include a significant reduction in quality of life, prolonged dependency, and a substantial increase in healthcare costs. Therefore, the development of reliable tools for predicting fracture risk is essential for the effective management of affected patients. In this study, we developed a predictive model based on the Random Forest (RF) algorithm for risk stratification of fragility fractures, integrating clinical, demographic, and imaging variables derived from dual-energy X-ray absorptiometry (DXA) and 3D modeling. Two independent cohorts were analyzed: the HURH cohort and the Camargo cohort, enabling both internal and external validation of the model. The results showed that the RF model consistently outperformed other classification algorithms, including k-nearest neighbors (KNN), support vector machines (SVM), decision trees (DT), and Gaussian naive Bayes (GNB), demonstrating high accuracy, sensitivity, specificity, area under the ROC curve (AUC), and Matthews correlation coefficient (MCC). Additionally, variable importance analysis highlighted that previous fracture history, parathyroid hormone (PTH) levels, and lumbar spine T-score, along with other densitometric parameters, were key predictors of fracture risk. These findings suggest that the integration of advanced machine learning techniques with clinical and imaging data can optimize early identification of high-risk patients, enabling personalized preventive strategies and improving the clinical management of osteoporosis.
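For readers less familiar with the Matthews correlation coefficient (MCC) listed among the metrics above, the following minimal Python sketch shows how it is computed from a 2x2 confusion matrix. It is an illustration only: the function name and the example counts are hypothetical and do not come from the study.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Compute MCC from true/false positive and negative counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, chosen only to exercise the formula.
print(matthews_corrcoef(tp=40, tn=45, fp=5, fn=10))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it informative when fracture and non-fracture cases are imbalanced.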

Quantifying features from X-ray images to assess early stage knee osteoarthritis.

Helaly T, Faisal TR, Moni ASB, Naznin M

PubMed | papers | Jul 5 2025
Knee osteoarthritis (KOA) is a progressive degenerative joint disease and a leading cause of disability worldwide. Manual diagnosis of KOA from X-ray images is subjective and prone to inter- and intra-observer variability, making early detection challenging. While deep learning (DL)-based models offer automation, they often require large labeled datasets, lack interpretability, and do not provide quantitative feature measurements. Our study presents an automated KOA severity assessment system that integrates a pretrained DL model with image processing techniques to extract and quantify key KOA imaging biomarkers. The pipeline includes contrast limited adaptive histogram equalization (CLAHE) for contrast enhancement, DexiNed-based edge extraction, and thresholding for noise reduction. We design customized algorithms that automatically detect and quantify joint space narrowing (JSN) and osteophytes from the extracted edges. The proposed model quantitatively assesses JSN and finds the number of intercondylar osteophytes, contributing to severity classification. The system achieves accuracies of 88% for JSN detection, 80% for osteophyte identification, and 73% for KOA classification. Its key strength lies in eliminating the need for any expensive training process and, consequently, the dependency on labeled data except for validation. Additionally, it provides quantitative data that can support classification in other OA grading frameworks.

Ultrasound Imaging and Machine Learning to Detect Missing Hand Motions for Individuals Receiving Targeted Muscle Reinnervation for Nerve-Pain Prevention.

Moukarzel ARE, Fitzgerald J, Battraw M, Pereira C, Li A, Marasco P, Joiner WM, Schofield J

PubMed | papers | Jul 4 2025
Targeted muscle reinnervation (TMR) was initially developed as a technique for bionic prosthetic control but has since become a widely adopted strategy for managing pain and preventing neuroma formation after amputation. This shift in TMR's motivation has influenced surgical approaches in ways that may challenge conventional electromyography (EMG)-based prosthetic control. The primary goal is often simply to reinnervate nerves to accessible muscles. This contrasts with the earlier, more complex TMR surgeries, which optimized EMG signal detection by carefully selecting target muscles near the skin's surface and manipulating residual anatomy to electrically isolate muscle activity. Consequently, modern TMR surgeries can involve less consideration for factors such as the depth of the reinnervated muscles or electrical crosstalk between closely located reinnervated muscles, all of which can impair the effectiveness of conventional prosthetic control systems. We recruited 4 participants with TMR, varying levels of upper limb loss, and diverse sets of reinnervated muscles. Participants attempted to perform movements with their missing hands, and we used sonomyography, a muscle activity measurement technique that employs ultrasound imaging and machine learning, to classify the resulting muscle movements. We found that attempted missing hand movements resulted in unique patterns of deformation in the reinnervated muscles and that, by applying a K-nearest neighbors machine learning algorithm, we could predict 4-10 hand movements for each participant with 83.3-99.4% accuracy. Our findings suggest that, despite the shifting motivations for performing TMR surgery, this new generation of the surgical procedure not only offers prophylactic benefits but also retains promising opportunities for bionic prosthetic control.
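To make the classification step concrete, here is a minimal Python sketch of K-nearest neighbors voting. It is not the authors' pipeline: the abstract does not describe how ultrasound frames are turned into feature vectors, so the two-dimensional features, the movement labels, and the choice of k = 3 below are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features standing in for muscle deformation patterns.
train_X = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (1.0, 0.9), (0.9, 1.0), (1.1, 1.0)]
train_y = ["fist", "fist", "fist", "point", "point", "point"]

print(knn_predict(train_X, train_y, (0.05, 0.05)))  # fist
print(knn_predict(train_X, train_y, (1.0, 1.0)))    # point
```

K-nearest neighbors needs no training phase beyond storing labelled examples, which is one reason it can suit small per-participant datasets like the one described.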

Progression risk of adolescent idiopathic scoliosis based on SHAP-Explained machine learning models: a multicenter retrospective study.

Fang X, Weng T, Zhang Z, Gong W, Zhang Y, Wang M, Wang J, Ding Z, Lai C

PubMed | papers | Jul 4 2025
To develop an interpretable machine learning model, explained using SHAP, based on imaging features of adolescent idiopathic scoliosis extracted by convolutional neural networks (CNNs), in order to predict the risk of curve progression and identify the most accurate predictive model. This study included 233 patients with adolescent idiopathic scoliosis from three medical centers. CNNs were used to extract features from full-spine coronal X-ray images taken at three follow-up points for each patient. Imaging and clinical features from center 1 were analyzed using the Boruta algorithm to identify independent predictors. Data from center 1 were divided into training (80%) and testing (20%) sets, while data from centers 2 and 3 were used as external validation sets. Six machine learning models were constructed. Receiver operating characteristic (ROC) curves were plotted, and model performance was assessed by calculating the area under the curve (AUC), accuracy, sensitivity, and specificity in the training, testing, and external validation sets. The SHAP interpreter was used to analyze the most effective model. The six models yielded AUCs ranging from 0.565 to 0.989, accuracies from 0.600 to 0.968, sensitivities from 0.625 to 1.0, and specificities from 0.571 to 0.974. The XGBoost model achieved the best performance, with an AUC of 0.896 in the external validation set. SHAP analysis identified the change in the main Cobb angle between the second and first follow-ups [Cobb1(2−1)] as the most important predictor, followed by the main Cobb angle at the second follow-up (Cobb1-2) and the change in the secondary Cobb angle [Cobb2(2−1)]. The XGBoost model demonstrated the best predictive performance in the external validation cohort, confirming its preliminary stability and generalizability. SHAP analysis indicated that Cobb1(2−1) was the most important feature for predicting scoliosis progression. 
This model offers a valuable tool for clinical decision-making by enabling early identification of high-risk patients and supporting early intervention strategies through automated feature extraction and interpretable analysis. The online version contains supplementary material available at 10.1186/s12891-025-08841-3.
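The AUC values quoted above can be read as the probability that a randomly chosen progressing case receives a higher predicted risk than a randomly chosen non-progressing case. The following minimal Python sketch computes AUC from that pairwise definition. It is illustrative only: the function name, labels, and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class, 0 for the negative class.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks for four patients.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients; rank-based implementations are preferable for larger datasets.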

Deep learning-driven abbreviated knee MRI protocols: diagnostic accuracy in clinical practice.

Foti G, Spoto F, Spezia A, Romano L, Caia S, Camerani F, Benedetti D, Mignolli T

PubMed | papers | Jul 4 2025
Deep learning (DL) reconstruction shows potential in reducing MRI acquisition times while preserving image quality, but the impact of varying acceleration factors on knee MRI diagnostic accuracy remains undefined. The aim was to evaluate the diagnostic performance of twofold, fourfold, and sixfold DL-accelerated knee MRI protocols versus standard protocols. In this prospective study, 71 consecutive patients underwent knee MRI with standard, DL2, DL4, and DL6 accelerated protocols. Four radiologists assessed ligament tears, meniscal lesions, bone marrow edema, chondropathy, and extensor abnormalities. Sensitivity, specificity, and interobserver agreement were calculated. DL2 and DL4 demonstrated high diagnostic accuracy. For ACL tears, DL2/DL4 achieved 98-100% sensitivity/specificity, while DL6 showed reduced sensitivity (91-96%). In meniscal evaluation, DL2 maintained 96-100% sensitivity and 98-100% specificity; DL4 showed 94-98% sensitivity and 97-99% specificity. DL6 exhibited decreased sensitivity (82-92%) for subtle lesions. Bone marrow edema detection remained excellent across acceleration factors. Interobserver agreement was excellent for DL2/DL4 (W = 0.91-0.97) and good for DL6 (W = 0.78-0.89). DL2 protocols demonstrate performance nearly identical to standard protocols, while DL4 maintains acceptable diagnostic accuracy for most pathologies. DL6 shows reduced sensitivity for subtle abnormalities, particularly among less experienced readers. DL2 and DL4 protocols represent an optimal balance between acquisition time reduction (50-75%) and diagnostic confidence.
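The 50-75% time reduction quoted for DL2 and DL4 is consistent with a simple model in which acquisition time scales inversely with the acceleration factor. The Python sketch below works through that arithmetic. The inverse-scaling assumption and the function name are ours, not stated in the abstract, and real protocols may deviate from it.

```python
def time_reduction(acceleration_factor: float) -> float:
    """Fractional reduction in scan time, assuming time scales as 1/factor."""
    if acceleration_factor <= 0:
        raise ValueError("acceleration factor must be positive")
    return 1 - 1 / acceleration_factor


for factor in (2, 4, 6):
    print(f"DL{factor}: {time_reduction(factor):.0%}")
# DL2: 50%
# DL4: 75%
# DL6: 83%
```

Under this assumption, moving from fourfold to sixfold acceleration saves only about eight further percentage points of scan time, which helps explain why the sensitivity loss reported for DL6 may not be a worthwhile trade.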

Cross-validation of an artificial intelligence tool for fracture classification and localization on conventional radiography in Dutch population.

Ruitenbeek HC, Sahil S, Kumar A, Kushawaha RK, Tanamala S, Sathyamurthy S, Agrawal R, Chattoraj S, Paramasamy J, Bos D, Fahimi R, Oei EHG, Visser JJ

PubMed | papers | Jul 3 2025
The aim of this study is to validate the effectiveness of an AI tool trained on Indian data in a Dutch medical center and to assess its ability to classify and localize fractures. Conventional radiographs acquired between January 2019 and November 2022 were analyzed using a multitask deep neural network. The tool, trained on Indian data, identified and localized fractures in 17 body parts. The reference standard was based on radiology reports resulting from routine clinical workflow and confirmed by an experienced musculoskeletal radiologist. The analysis included both patient-wise and fracture-wise evaluations, employing binary and Intersection over Union (IoU) metrics to assess fracture detection and localization accuracy. In total, 14,311 radiographs (median age, 48 years (range 18-98), 7265 male) were analyzed and categorized by body part: clavicle, shoulder, humerus, elbow, forearm, wrist, hand and finger, pelvis, hip, femur, knee, lower leg, ankle, foot and toe. Of these, 4156/14,311 (29%) had fractures. The AI tool demonstrated overall patient-wise sensitivity, specificity, and AUC of 87.1% (95% CI: 86.1-88.1%), 87.1% (95% CI: 86.4-87.7%), and 0.92 (95% CI: 0.91-0.93), respectively. Fracture detection rate was 60% overall, ranging from 7% for rib fractures to 90% for clavicle fractures. This study validates a fracture detection AI tool, originally trained on Indian data, on a Western-European dataset. While classification performance is robust on real clinical data, fracture-wise analysis reveals variability in localization accuracy, underscoring the need for refinement in fracture localization. AI may help by enabling optimal use of limited resources or personnel. This study evaluates an AI tool designed to aid in detecting fractures, possibly reducing reading time or optimizing radiology workflow by prioritizing fracture-positive cases. Cross-validation on a consecutive Dutch cohort confirms this AI tool's clinical robustness.
The tool detected fractures with 87% sensitivity, 87% specificity, and 0.92 AUC. AI localizes 60% of fractures, the highest for clavicle (90%) and lowest for ribs (7%).
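The Intersection over Union (IoU) metric used for localization compares a predicted region with a reference region. The minimal Python sketch below computes it for axis-aligned bounding boxes. The abstract does not state the box format or the IoU threshold the study used, so the `(x1, y1, x2, y2)` convention and the example boxes are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and reference boxes: overlap 1, union 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

An IoU of 1.0 means the boxes coincide and 0.0 means they do not overlap; a detection is typically counted as correctly localized when its IoU with the reference exceeds a chosen threshold.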