An open bone marrow megakaryocyte dataset for automated morphologic studies.
Authors
Affiliations (9)
- School of Software Engineering, Xinjiang University, Urumqi, China.
- Department of Hematology, Xiangya Hospital, Central South University, Changsha, China.
- National Clinical Research Center for Geriatric Diseases, Xiangya Hospital, Changsha, China.
- Hunan Hematology Oncology Clinical Medical Research Center, Changsha, China.
- School of Software Engineering, Xi'an Jiaotong University, Xi'an, China.
- School of Computer Science, Wuhan University, Wuhan, China.
- Department of Hematology, Xiangya Hospital, Central South University, Changsha, China.
- National Clinical Research Center for Geriatric Diseases, Xiangya Hospital, Changsha, China.
- Hunan Hematology Oncology Clinical Medical Research Center, Changsha, China.
Abstract
Precise classification of megakaryocyte subtypes in bone marrow examination is crucial for the diagnosis and study of various hematological disorders, including myelodysplastic syndromes (MDS) and other diseases related to platelet production. Although deep learning (DL) has demonstrated remarkable success in medical image classification, its application to megakaryocyte classification has been hindered by the scarcity of high-quality, openly licensed datasets. To address this gap, we present MK-11, a dataset comprising 7,204 Wright-Giemsa-stained single-cell images across 11 clinically relevant megakaryocyte subtypes. All images were annotated by two experienced hematopathologists following standardized diagnostic criteria, with consensus review to ensure annotation quality. Several state-of-the-art neural networks, including convolutional and transformer-based models, were evaluated on this benchmark to establish strong baseline performance for megakaryocyte classification. To ensure reproducibility, we release all original images, annotations, standardized five-fold cross-validation partitions, and evaluation scripts under open licenses. In conclusion, this work presents the first public dataset for megakaryocyte subtype classification, supporting the development and evaluation of automated morphological assessment methods and serving as a benchmark for future research.
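The five-fold cross-validation partitioning mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' released partitioning script. It assumes a stratified split, in which each fold keeps roughly the same class proportions, and the abstract does not confirm that choice. The function name `stratified_folds`, the toy labels (`subtype_a`, etc.), and the seed are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, class by class.

    Hypothetical helper: stratification is an assumption here,
    not a detail stated in the abstract.
    """
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0  # running offset so leftover samples do not all land in fold 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, idx in enumerate(indices):
            folds[(offset + pos) % k].append(idx)
        offset += len(indices)
    return folds


# Toy labels standing in for per-image subtype annotations (not real data).
labels = ["subtype_a"] * 50 + ["subtype_b"] * 30 + ["subtype_c"] * 20
folds = stratified_folds(labels, k=5, seed=42)

for i, test_idx in enumerate(folds):
    # Each fold serves once as the held-out test set; the rest form the training set.
    train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
    print(f"fold {i}: train={len(train_idx)} test={len(test_idx)}")
```

Fixing the seed and saving the resulting index lists to disk is one way to let different groups evaluate models on identical partitions, which is the purpose of distributing predefined folds with a benchmark.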