Development and validation of an early diagnosis model for severe mycoplasma pneumonia in children based on interpretable machine learning.
Authors
Affiliations (4)
- Department of Laboratory Medicine, Wuhan Children's Hospital (Wuhan Maternal and Child Healthcare Hospital), Tongji Medical College, Huazhong University of Science & Technology, Wuhan, 430016, China.
- Department of Laboratory Medicine, Wuhan Children's Hospital (Wuhan Maternal and Child Healthcare Hospital), Tongji Medical College, Huazhong University of Science & Technology, Wuhan, 430016, China.
- Health Care Department, Wuhan Children's Hospital (Wuhan Maternal and Child Healthcare Hospital), Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430016, China.
- Department of Laboratory Medicine, Wuhan Children's Hospital (Wuhan Maternal and Child Healthcare Hospital), Tongji Medical College, Huazhong University of Science & Technology, Wuhan, 430016, China.
Abstract
Pneumonia is a major threat to the health of children, especially those under the age of five. Mycoplasma pneumoniae infection is a major cause of pediatric pneumonia, and the incidence of severe Mycoplasma pneumoniae pneumonia (SMPP) has increased in recent years. There is therefore an urgent need for an early warning model for SMPP to improve the prognosis of pediatric pneumonia. The study comprised 597 SMPP patients aged between 1 month and 18 years. Clinical variables were selected by Lasso regression analysis, and eight machine learning algorithms were then applied to develop an early warning model. The accuracy of the model was assessed in a validation cohort and a prospective cohort. To facilitate clinical assessment, the study reduced the number of indicators and constructed a simplified, visualized model. The clinical applicability of the model was evaluated by decision curve analysis (DCA) and clinical impact curves (CIC). After variable selection, eight machine learning models were developed using age, sex and 21 serum indicators identified as predictive factors for SMPP. A Light Gradient Boosting Machine (LightGBM) model demonstrated strong performance, achieving an area under the curve (AUC) of 0.92 in prospective validation. SHAP analysis was used to identify the dominant variables, namely serum S100A8/A9, tracheal computed tomography (CT), retinol-binding protein (RBP), platelet large cell ratio (P-LCR) and CD4+CD25+ Treg cell counts, from which a simplified model (SCRPT) was constructed to improve clinical applicability. The SCRPT diagnostic model exhibited favorable diagnostic efficacy (AUC > 0.8). Additionally, the study found that S100A8/A9 outperformed clinical inflammatory markers and could also differentiate the severity of MPP. The SCRPT model, consisting of five dominant variables (S100A8/A9, CT, RBP, P-LCR and Treg cells) selected using eight machine learning algorithms, is expected to serve as a tool for the early diagnosis of SMPP. S100A8/A9 can also be used as a biomarker for effectively differentiating SMPP when medical resources are limited.
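The two evaluation measures named in the abstract, AUC and the net benefit plotted in a decision curve, can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc` and `net_benefit` and the toy labels and scores are hypothetical, and none of the numbers come from the study. AUC is computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, and net benefit at a threshold probability pt follows the standard DCA definition TP/n - (FP/n) * pt / (1 - pt).

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Decision-curve net benefit when cases with score >= threshold are flagged."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data (1 = severe, 0 = not severe); not from the study.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

print(auc(labels, scores))               # 15 of 16 pairs ranked correctly
print(net_benefit(labels, scores, 0.5))  # 3 true positives, 1 false positive
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies.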