JADE: Jawbone Lesion Diagnosis and Decision Supporting System
Authors
Affiliations (1)
- Katholieke Universiteit Leuven
Abstract
Objectives: To develop and evaluate JADE, a proof-of-concept retrieval-augmented generation (RAG) diagnostic assistive system designed to enhance large language model (LLM) reasoning for the assessment of jawbone lesions. This study examined whether integrating structured retrieval with GPT-5 improves diagnostic accuracy and stability compared with standalone LLMs.

Methods: JADE was developed as a cloud-based application integrating GPT-5 with a curated oral radiology and pathology database using a hybrid semantic-keyword retrieval strategy. Clinical and radiographic characteristics were imported as a structured query to guide retrieval and support diagnostic reasoning. Performance was compared with standalone GPT-5, Claude Sonnet 4.5, DeepSeek-R1, and Gemini 2.5 Flash across 25 cases. Accuracy was analysed using Cochran's Q test with post-hoc McNemar's tests and Bonferroni correction. Intra-model stability was measured using the majority agreement ratio (MAR), and response time was recorded to assess real-time usability.

Results: JADE showed the highest diagnostic performance, correctly identifying 20 of 25 cases (80%) and outperforming all standalone LLMs. Significant differences were observed across models (Cochran's Q = 33.2, df = 4, p < 0.001), with post-hoc analyses confirming that JADE significantly outperformed GPT-5, Gemini 2.5 Flash, and Claude Sonnet 4.5 (p < 0.01). JADE also exhibited the greatest run-to-run stability (mean MAR = 0.90 ± 0.18). The average prediction time of 6 ± 0.5 seconds supported its feasibility for real-time clinical use.

Conclusions: JADE improved diagnostic accuracy and stability over standalone LLMs, underscoring the value of RAG reasoning in jawbone lesion assessment and its potential for real-time clinical use.
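
The two statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis code. It uses the standard Cochran's Q formula on a cases-by-models table of 0/1 correctness flags, and it assumes that the majority agreement ratio is the share of repeated runs that return the most frequent diagnosis; the abstract does not define MAR, so that reading is an assumption. The function names and toy data are hypothetical. The p-value helper uses the closed-form chi-square tail, which holds only for even degrees of freedom (such as df = 4 for five models).

```python
from collections import Counter
from math import exp, factorial


def cochrans_q(outcomes):
    """Return (Q, df) for a cases x models table of 0/1 correctness flags."""
    k = len(outcomes[0])
    if k < 2 or any(len(row) != k for row in outcomes):
        raise ValueError("need a rectangular table with at least two models")
    col_totals = [sum(row[j] for row in outcomes) for j in range(k)]
    row_totals = [sum(row) for row in outcomes]
    n = sum(row_totals)
    denominator = k * n - sum(r * r for r in row_totals)
    if denominator == 0:
        # Happens when every case is all-correct or all-wrong across models.
        raise ValueError("Q is undefined: no case discriminates between models")
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - n * n)
    return numerator / denominator, k - 1


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


def majority_agreement_ratio(runs):
    """Assumed MAR: fraction of repeated runs matching the most common answer."""
    if not runs:
        raise ValueError("need at least one run")
    return Counter(runs).most_common(1)[0][1] / len(runs)


# Hypothetical toy data: 4 cases (rows) x 3 models (columns), 1 = correct.
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
q, df = cochrans_q(toy)
print(f"Q = {q:.3f}, df = {df}, p = {chi2_sf_even_df(q, df):.4f}")

# Hypothetical repeated runs of one model on one case.
print(majority_agreement_ratio(["ameloblastoma", "ameloblastoma", "odontogenic keratocyst"]))
```

Applying `chi2_sf_even_df(33.2, 4)` to the reported statistic gives a tail probability well below 0.001, consistent with the p-value stated in the Results. A full analysis would more likely use an established statistics package than a hand-rolled formula.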