Dual-threshold sample selection with latent tendency difference for label-noise-robust pneumoconiosis staging.
Authors
Affiliations (5)
Affiliations (5)
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, China.
- School of Software, North University of China, Taiyuan, China.
- School of Software, Taiyuan University of Technology, Taiyuan, China.
- College of Information, Jinzhong College of Information, Jinzhong, China.
- First Hospital of Shanxi Medical University, Taiyuan, China.
Abstract
BackgroundThe precise pneumoconiosis staging suffers from progressive pair label noise (PPLN) in chest X-ray datasets, because adjacent stages are confused due to unidentifialble and diffuse opacities in the lung fields. As deep neural networks are employed to aid the disease staging, the performance is degraded under such label noise.ObjectiveThis study improves the effectiveness of pneumoconiosis staging by mitigating the impact of PPLN through network architecture refinement and sample selection mechanism adjustment.MethodsWe propose a novel multi-branch architecture that incorporates the dual-threshold sample selection. Several auxiliary branches are integrated in a two-phase module to learn and predict the <i>progressive feature tendency</i>. A novel difference-based metric is introduced to iteratively obtained the instance-specific thresholds as a complementary criterion of dynamic sample selection. All the samples are finally partitioned into <i>clean</i> and <i>hard</i> sets according to dual-threshold criteria and treated differently by loss functions with penalty terms.ResultsCompared with the state-of-the-art, the proposed method obtains the best metrics (accuracy: 90.92%, precision: 84.25%, sensitivity: 81.11%, F1-score: 82.06%, and AUC: 94.64%) under real-world PPLN, and is less sensitive to the rise of synthetic PPLN rate. An ablation study validates the respective contributions of critical modules and demonstrates how variations of essential hyperparameters affect model performance.ConclusionsThe proposed method achieves substantial effectiveness and robustness against PPLN in pneumoconiosis dataset, and can further assist physicians in diagnosing the disease with a higher accuracy and confidence.