Automatic head and neck tumor segmentation through deep learning and Bayesian optimization on three-dimensional medical images.
Authors
Affiliations (3)
- Department of Industrial and Systems Engineering, Mississippi State University, Mississippi State, MS 39762, USA.
- Department of Radiation Oncology, University of Mississippi Medical Center, Jackson, MS 39216, USA.
- Department of Industrial and Systems Engineering, Mississippi State University, Mississippi State, MS 39762, USA. Electronic address: [email protected].
Abstract
Medical imaging provides critical information for the diagnostic and prognostic evaluation of patients, as it can reveal a broad spectrum of pathologies and abnormalities. Clinical practitioners who carry out medical image screening rely primarily on their knowledge and experience for disease diagnosis. Convolutional Neural Networks (CNNs) hold the potential to serve as a formidable decision-support tool in medical image analysis due to their high capacity to extract hierarchical features and to perform classification and segmentation directly from image data. However, CNNs contain a myriad of hyperparameters, and optimizing these hyperparameters poses a major obstacle to their effective implementation. In this work, a two-phase Bayesian Optimization-derived Scheduling (BOS) approach is proposed for hyperparameter optimization in head and neck tumor segmentation. The two-phase design allows rapid convergence in the first training phase and slower, overfitting-free improvement in the final training phase. Furthermore, we found that batch size and learning rate have a significant impact on the training process, but that optimizing them separately can lead to sub-optimal hyperparameter combinations. Therefore, batch size and learning rate are coupled into a single batch-size-to-learning-rate (B2L) ratio so that both can be optimized simultaneously. The optimized hyperparameters were tested with a three-dimensional V-Net model on computed tomography (CT) and positron emission tomography (PET) scans to segment and classify cancerous and noncancerous tissues. The results of 10-fold cross-validation indicate that selecting the optimal B2L ratio for each phase of the training schedule improves overall medical image segmentation performance.
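
The sketch below illustrates, under stated assumptions, the kind of coupling the abstract describes: a single B2L ratio is searched with Gaussian-process Bayesian optimization (here via scikit-optimize), and the learning rate is derived from that ratio and the batch size. This is not the authors' code; the training routine `train_and_validate`, the candidate batch sizes, and the search range are hypothetical placeholders.

```python
# Minimal sketch: tune a coupled batch-size-to-learning-rate (B2L) ratio with
# Gaussian-process Bayesian optimization. Assumes scikit-optimize is installed.
from skopt import gp_minimize
from skopt.space import Real

BATCH_SIZES = [2, 4, 8]  # assumed candidate batch sizes for 3D CT/PET volumes


def train_and_validate(batch_size, learning_rate, epochs):
    """Hypothetical stand-in: train the segmentation model for `epochs`
    and return a loss to minimize (e.g., 1 - validation Dice score)."""
    raise NotImplementedError


def objective(params):
    b2l_ratio = params[0]                    # the single coupled hyperparameter
    batch_size = BATCH_SIZES[1]              # e.g., fix a feasible batch size ...
    learning_rate = batch_size / b2l_ratio   # ... and derive the learning rate from B2L
    return train_and_validate(batch_size, learning_rate, epochs=20)


# Phase 1 could search a range that favors larger learning rates for fast
# convergence; phase 2 would repeat the search over a range favoring smaller
# learning rates for slower, overfitting-free refinement.
result = gp_minimize(objective, [Real(1e2, 1e5, prior="log-uniform")], n_calls=25)
print("Best B2L ratio found:", result.x[0])
```

In this setup, the optimizer proposes only the B2L ratio, so batch size and learning rate move together rather than being tuned in isolation, which is the coupling rationale stated in the abstract.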