Dental Odontogenic Lesion CBCT and Histopathology Integrated Dataset for Benchmarking Deep Learning Algorithms.
Authors
Affiliations (7)
- School of Computer Science, The University of Sydney, Sydney, NSW, Australia.
- Department of Oral & Maxillofacial Head Neck Oncology, School & Hospital of Stomatology, Wuhan University, Wuhan, China.
- State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration, Key Laboratory of Oral Biomedicine, Ministry of Education, Hubei Key Laboratory of Stomatology, School & Hospital of Stomatology, Wuhan University, Wuhan, China.
- State Key Laboratory of Oral & Maxillofacial Reconstruction and Regeneration, Key Laboratory of Oral Biomedicine, Ministry of Education, Hubei Key Laboratory of Stomatology, School & Hospital of Stomatology, Wuhan University, Wuhan, China. [email protected].
- Department of Orthodontics, School & Hospital of Stomatology, Wuhan University, Wuhan, China. [email protected].
- Institute of Translational Medicine, Shanghai Jiao Tong University, Shanghai, China.
- School of Computer Science, The University of Sydney, Sydney, NSW, Australia. [email protected].
Abstract
Accurate diagnosis of odontogenic lesions requires pre-operative cone-beam computed tomography (CBCT) and post-operative histopathological confirmation, a workflow that is time-consuming and reliant on clinical expertise. With the rise of artificial intelligence (AI) and deep learning, automated diagnostic solutions have shown great promise. However, progress in deep learning for odontogenic lesions has been hindered by the lack of publicly available paired datasets that combine radiological and histopathological data. To address this gap, we present the Dental Odontogenic Lesion CBCT and Histopathology Integrated Dataset (DOLCHID), comprising 262 paired CBCT scans and H&E-stained histopathology images. The dataset includes four major lesion subtypes: dentigerous cyst (n = 44), radicular cyst (n = 54), odontogenic keratocyst (n = 92), and ameloblastoma (n = 72), each paired with expert-verified CBCT segmentation masks and annotated histopathological regions of interest (ROIs). We also provide technical validations for lesion segmentation, single-modality classification, and multimodal classification, which demonstrate the utility of the dataset. DOLCHID is expected to advance deep learning research in dental imaging by enabling integrative diagnostic modelling that leverages complementary radiological and histopathological information.
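The subtype counts reported above sum to the stated 262 cases and are moderately imbalanced, which matters when benchmarking classifiers. The following is a minimal sketch, not part of the published work, that checks the total and derives per-class fractions and inverse-frequency weights. The weighting formula (total / (number of classes × class count)) is an assumed, commonly used heuristic; the abstract does not say whether the authors applied any class weighting, and the function and variable names are hypothetical.

```python
# Subtype counts exactly as reported in the abstract.
SUBTYPE_COUNTS = {
    "dentigerous cyst": 44,
    "radicular cyst": 54,
    "odontogenic keratocyst": 92,
    "ameloblastoma": 72,
}


def class_fractions(counts):
    """Return each class's share of the total number of cases."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def inverse_frequency_weights(counts):
    """Return weights total / (num_classes * count).

    Assumption: this 'balanced' heuristic is illustrative only;
    it is not stated to be the authors' method.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


total_cases = sum(SUBTYPE_COUNTS.values())
fractions = class_fractions(SUBTYPE_COUNTS)
weights = inverse_frequency_weights(SUBTYPE_COUNTS)

if __name__ == "__main__":
    print(f"total cases: {total_cases}")
    for name in SUBTYPE_COUNTS:
        print(f"{name}: fraction={fractions[name]:.3f}, weight={weights[name]:.3f}")
```

Under this heuristic, the most frequent subtype (odontogenic keratocyst) receives a weight below 1 and the least frequent (dentigerous cyst) a weight above 1, so a weighted loss would count errors on rarer subtypes more heavily.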