Automated ultrasound system ARTHUR V.2.0 with AI analysis DIANA V.2.0 matches expert rheumatologist in hand joint assessment of rheumatoid arthritis patients.
Authors
Affiliations (6)
- Section of Rheumatology, Department of Medicine, Svendborg Hospital - Odense University Hospital, Svendborg, Denmark.
- Center for Treatment of Rheumatic and Musculoskeletal Diseases (REMEDY), Diakonhjemmet Hospital, Oslo, Norway.
- Center for Rheumatology and Spine Diseases, Rigshospitalet Glostrup, Glostrup, Denmark.
- The Maersk Mc-Kinney Moller Institute, Syddansk Universitet, Odense, Denmark.
- ROPCA, Odense, Denmark.
- Section of Rheumatology, Department of Medicine, Svendborg Hospital - Odense University Hospital, Svendborg, Denmark.
Abstract
To evaluate the agreement and repeatability of an automated robotic ultrasound system (ARTHUR V.2.0) combined with an AI model (DIANA V.2.0) in assessing synovial hypertrophy (SH) and Doppler activity in patients with rheumatoid arthritis (RA), using an expert rheumatologist's assessment as the reference standard.

Thirty RA patients underwent two consecutive ARTHUR V.2.0 scans and a rheumatologist's assessment of 22 hand joints, with the rheumatologist blinded to the automated system's results. Images were scored for SH and Doppler by DIANA V.2.0 using the EULAR-OMERACT scale (0-3). Agreement was evaluated by weighted Cohen's kappa, percent exact agreement (PEA), percent close agreement (PCA) and binary outcomes using Global OMERACT-EULAR Synovitis Scoring (healthy ≤1 vs diseased ≥2). Comparisons included intra-robot repeatability and agreement with the expert rheumatologist and a blinded independent assessor.

ARTHUR successfully scanned 564 of 660 joints, an overall success rate of 85.5%. Intra-robot agreement was PEA 63.0%, PCA 93.0%, binary 90.5% for SH and PEA 74.8%, PCA 93.7%, binary 88.1% for Doppler, with kappa values of 0.54 and 0.49, respectively. Agreement between ARTHUR+DIANA and the rheumatologist was: SH (PEA 57.9%, PCA 92.9%, binary 87.3%, kappa 0.38); Doppler (PEA 77.3%, PCA 94.2%, binary 91.2%, kappa 0.44). Agreement with the independent assessor was: SH (PEA 49.0%, PCA 91.2%, binary 80.0%, kappa 0.39); Doppler (PEA 62.6%, PCA 94.4%, binary 88.1%, kappa 0.48).

ARTHUR V.2.0 and DIANA V.2.0 demonstrated repeatability on par with intra-expert agreement reported in the literature and showed encouraging agreement with human assessors, though further refinement is needed to optimise performance across specific joints.
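For readers unfamiliar with the agreement measures named in the abstract, the following is a minimal Python sketch of how PEA, PCA, binary agreement and weighted Cohen's kappa could be computed from two lists of 0-3 grades. It is illustrative only and is not the authors' analysis code. Several details are assumptions, since the abstract does not state them: "close" agreement is taken to mean a difference of at most one grade, the kappa weights are linear (`power=1`; quadratic weights would be `power=2`), and all function names and the example grades are hypothetical.

```python
from collections import Counter


def percent_exact_agreement(a, b):
    """Percentage of joints where both raters gave the same grade."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def percent_close_agreement(a, b, tolerance=1):
    """Percentage of joints where grades differ by at most `tolerance`.
    ASSUMPTION: tolerance of one grade; the abstract does not define it."""
    return 100.0 * sum(abs(x - y) <= tolerance for x, y in zip(a, b)) / len(a)


def binary_agreement(a, b, threshold=2):
    """Percentage of joints where both raters agree on healthy (<=1)
    versus diseased (>=2), the dichotomy given in the abstract."""
    return 100.0 * sum((x >= threshold) == (y >= threshold)
                       for x, y in zip(a, b)) / len(a)


def weighted_kappa(a, b, n_grades=4, power=1):
    """Weighted Cohen's kappa via disagreement weights |i - j| ** power.
    ASSUMPTION: linear weights (power=1); the weighting scheme used in
    the study is not stated in the abstract."""
    n = len(a)
    max_dist = (n_grades - 1) ** power
    # Mean observed disagreement, scaled to [0, 1].
    observed = sum(abs(x - y) ** power for x, y in zip(a, b)) / (n * max_dist)
    # Disagreement expected by chance from each rater's marginal counts.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(
        count_a[i] * count_b[j] * abs(i - j) ** power
        for i in range(n_grades)
        for j in range(n_grades)
    ) / (n * n * max_dist)
    if expected == 0:
        # Both raters used one identical grade throughout; kappa is undefined.
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1.0 - observed / expected


# Hypothetical grades for eight joints (not study data).
robot = [0, 1, 2, 3, 1, 0, 2, 2]
expert = [0, 1, 1, 3, 2, 0, 2, 0]

pea = percent_exact_agreement(robot, expert)
pca = percent_close_agreement(robot, expert)
binary = binary_agreement(robot, expert)
kappa = weighted_kappa(robot, expert)

# The scan success rate reported in the abstract follows from its own counts.
success_rate = round(100.0 * 564 / 660, 1)
```

Because weighted kappa corrects for chance agreement, it can be modest even when PCA and binary agreement are high, which is the pattern seen in the reported results.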