
Deep visual detection system for oral squamous cell carcinoma.

January 19, 2026 · PubMed

Authors

Akram K, Aslam M, Waheed T, Ayesha N, Alamri FS, Mirdad AR, Rehman A

Affiliations (4)

  • Department of Computer Science, University of Engineering and Technology, Lahore, 54000, Pakistan.
  • Center of Excellence in Cyber Security (CYBEX), Prince Sultan University, 11586, Riyadh, Saudi Arabia.
  • Department of Mathematical Sciences, College of Science, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, 11671, Riyadh, Saudi Arabia. [email protected].
  • Artificial Intelligence & Data Analytics Lab (AIDA) CCIS, Prince Sultan University, 11586, Riyadh, Saudi Arabia.

Abstract

Oral Squamous Cell Carcinoma (OSCC) is a widespread and aggressive malignancy in which early and accurate detection is essential for improving patient outcomes. Traditional diagnostic methods relying on histopathological examination are often time-consuming, resource-intensive, and susceptible to subjective interpretation. Moreover, inter-observer variability can further compromise diagnostic consistency, leading to delays in timely intervention. In recent years, advances in Artificial Intelligence (AI) and computer-aided diagnostic systems have shown transformative potential in medical imaging, enabling faster, objective, and reproducible detection of complex disease patterns. In particular, deep learning-based models have demonstrated remarkable accuracy in histopathological analysis, making them promising tools for OSCC diagnosis and early clinical decision-making.

This study introduces a Deep Visual Detection System (DVDS) designed to automate OSCC detection from histopathological images. Three convolutional neural network (CNN) models (EfficientNetB3, DenseNet121, and ResNet50) were trained and evaluated on two publicly available datasets: the Kaggle Oral Cancer Detection dataset, containing 5192 images labeled as Normal or OSCC, and the NDB-UFES dataset, comprising 3763 images categorized into OSCC, leukoplakia with dysplasia, and leukoplakia without dysplasia. Data augmentation techniques were employed to mitigate class imbalance and enhance model generalization, while advanced image preprocessing methods and training strategies such as EarlyStopping and ReduceLROnPlateau were applied to ensure stable convergence.

Results: Among the models tested, EfficientNetB3 consistently delivered superior performance across both datasets. On the binary classification task, it achieved a test accuracy of 97.05%, with precision, recall, and F1-score all at 97.05%, specificity of 97.17%, and sensitivity of 96.92%. On the multi-class NDB-UFES dataset, it again outperformed the other models, attaining 97.16% accuracy, matching precision, recall, and F1-score, and specificity of 98.58%. In contrast, DenseNet121 and ResNet50 showed substantially lower accuracy in both experiments. These results highlight the importance of model architecture and preprocessing in medical image classification tasks. The proposed DVDS, built upon EfficientNetB3, demonstrates high reliability and robustness, suggesting strong potential for deployment in clinical settings to aid pathologists in rapid and consistent OSCC diagnosis. This approach could significantly streamline diagnostic workflows and support early intervention strategies, ultimately enhancing patient care.
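The training setup described above can be sketched in Keras. This is an illustrative reconstruction, not the authors' code: the image size, dropout rate, learning rate, augmentation parameters, and callback patience values are all assumptions; only the EfficientNetB3 backbone and the EarlyStopping/ReduceLROnPlateau callbacks are named in the abstract.

```python
# Hypothetical sketch of an EfficientNetB3 classifier for OSCC detection,
# with the EarlyStopping and ReduceLROnPlateau callbacks mentioned in the
# abstract. Hyperparameters are illustrative assumptions, not the paper's.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetB3
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

def build_oscc_model(num_classes: int = 2, img_size: int = 300):
    """EfficientNetB3 backbone plus a small classification head.

    weights=None avoids downloading pretrained weights in this sketch;
    in practice ImageNet initialization would typically be used.
    """
    base = EfficientNetB3(include_top=False, weights=None,
                          input_shape=(img_size, img_size, 3), pooling="avg")
    x = layers.Dropout(0.3)(base.output)  # assumed regularization
    out = layers.Dense(num_classes, activation="softmax")(x)
    model = models.Model(base.input, out)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Simple on-the-fly augmentation to mitigate class imbalance / overfitting
# (the specific transforms are assumptions).
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

# Training-stability callbacks named in the abstract (patience values assumed).
callbacks = [
    EarlyStopping(monitor="val_loss", patience=8, restore_best_weights=True),
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3, min_lr=1e-6),
]
```

For the binary Kaggle task `num_classes=2`; for the three-class NDB-UFES task `num_classes=3`. The callbacks would be passed to `model.fit(..., callbacks=callbacks)`.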

Topics

Journal Article
