Natural language processing evaluation of trends in cervical cancer incidence in radiology reports: A ten-year survey.

Authors

López-Úbeda P,Martín-Noguerol T,Luna A

Affiliations (2)

  • NLP Department, HT Medica, Carmelo Torres nº2, 23007, Jaén, Spain. Electronic address: [email protected].
  • MRI Unit, Radiology Department, HT Medica, Carmelo Torres nº2, 23007, Jaén, Spain.

Abstract

Cervical cancer commonly associated with human papillomavirus (HPV) infection, remains the fourth most common cancer in women globally. This study aims to develop and evaluate a Natural Language Processing (NLP) system to identify and analyze cervical cancer incidence trends from 2013 to 2023 at our institution, focusing on age-specific variations and evaluating the possible impact of HPV vaccination. This retrospective cohort study, we analyzed unstructured radiology reports collected between 2013 and 2023, comprising 433,207 studies involving 250,181 women who underwent CT, MRI, or ultrasound scans of the abdominopelvic region. A rule-based NLP system was developed to extract references to cervical cancer from these reports and validated against a set of 200 manually annotated cases reviewed by an experienced radiologist. The NLP system demonstrated excellent performance, achieving an accuracy of over 99.5 %. This high reliability enabled its application in a large-scale population study. Results show that the women under 30 maintain a consistently low cervical cancer incidence, likely reflecting early HPV vaccination impact. The 30-40 cohorts declined until 2020, followed by a slight increase, while the 40-60 groups exhibited an overall downward trend with fluctuations, suggesting long-term vaccine effects. Incidence in patients over 60 also declined, though with greater variability, possibly due to other risk factors. The developed NLP system effectively identified cervical cancer cases from unstructured radiology reports, facilitating an accurate analysis of the impact of HPV vaccination on cervical cancer prevalence and imaging study requirements. This approach demonstrates the potential of AI and NLP tools in enhancing data accuracy and efficiency in medical epidemiology research. NLP-based approaches can significantly improve the collection and analysis of epidemiological data on cervical cancer, supporting the development of more targeted and personalized prevention strategies-particularly in populations with heterogeneous HPV vaccination coverage.

Topics

Journal Article

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.