Introduction
Cervical cancer has long occupied a singular position in the history of women’s health. Before the modern era of pathology, it was viewed as the quintessential “woman’s cancer,” notable for its frequency and distressing clinical manifestations, including uncontrolled bleeding, pain, and eventual cachexia and death. This association shaped gendered perceptions of disease, with male physicians historically relegated to the care of women in the late stages of illness when interventions were largely palliative.1
The first written description of breast cancer appears in Egyptian papyri dating to 3,000 BC, yet the definition of cancer itself remained rudimentary until the mid-19th century. Throughout that period, cancer was understood as grave and incurable, either a “curse of the gods” or an imbalance of Hippocrates’ humors.2 Virchow’s 1858 publication of Cellular Pathology then laid the groundwork for understanding neoplasia as a cellular process. By the early 20th century, cervical carcinoma was microscopically defined, permitting the emergence of histologic categories such as carcinoma in situ.1,3
The evolution of cervical cancer prevention represents one of the most dramatic public-health successes of the twentieth century. With the advent of exfoliative cytology (the Pap smear), colposcopy, and later, excisional therapy, the incidence and mortality of cervical cancer in high-income countries fell by more than 80% over the past seven decades.4,5 These interventions transformed what had been a leading cause of death among reproductive-aged women into a largely preventable disease. However, global disparities persist: more than 85% of cases now occur in low- and middle-income countries.6,7
This article traces the trajectory of cervical-cancer control from its origins in microscopic observation to its current re-imagining through artificial intelligence (AI). The three pillars of prevention, i.e., cytologic screening, colposcopic diagnosis, and loop electrosurgical excision procedure (LEEP), each illustrate a stage in the convergence of pathology, technology, and translational innovation. The narrative culminates in the emerging AI era, where digital cytology, computer-vision colposcopy, and intelligent, risk-stratified treatment systems are reshaping how clinicians screen, diagnose, and treat cervical disease. Together, these developments point toward a data-driven continuum of care capable of advancing the global elimination targets of the World Health Organization (WHO).
The WHO 2020 Global Strategy to Accelerate the Elimination of Cervical Cancer as a Public Health Problem frames cervical cancer control as a coordinated continuum of evidence-based interventions across the life course.8 Within this framework, prevention and control can be understood as four interlocking layers, each essential to reducing incidence and mortality and explicitly described within the strategy’s intervention pathways.
Primary prevention focuses on human papillomavirus (HPV) vaccination, which WHO identifies as the most effective long-term intervention for reducing cervical cancer risk. Secondary prevention centers on screening for precancerous lesions using cytology, HPV testing, or visual inspection–based methods, coupled with timely treatment of detected disease. Tertiary prevention involves diagnostic evaluation, including colposcopy and histopathologic confirmation, which are required to guide appropriate management. Finally, definitive management encompasses treatment of precancerous lesions through ablative or excisional approaches, including thermal ablation, cryotherapy, and LEEP, as well as management of invasive disease when present.8
WHO emphasizes that these layers must be implemented in an integrated manner, noting that screening without access to diagnosis and treatment is unethical and ineffective, and that strengthening diagnostic and treatment capacity, including pathology and surgical services, is essential for successful prevention programs. This structured, life-course approach underpins the evolution of cervical cancer prevention strategies reviewed in this article, from cytology-based screening to contemporary diagnostic and treatment pathways aligned with global elimination efforts.8
This review synthesizes the evolution of cervical cancer screening, diagnostic evaluation, and selected treatment strategies through the framework of clinical and translational pathology, with primary emphasis on innovations in screening and diagnostic technologies. While excisional therapy such as LEEP remains a critical component of definitive management, the focus of this review centers on how cytologic, colposcopic, and digital diagnostic advances, particularly AI–enabled systems, are transforming the translational interface between pathology and clinical decision-making. By examining the application progress, translational pathways, and implementation challenges of AI across cervical cancer screening and diagnosis, this review aims to clarify its clinical value, regulatory considerations, and future directions within modern pathological practice.
Evolution of cervical cytology: From the Pap smear to AI-assisted digital screening
Origins and early adoption
The conceptual foundation for cytologic screening emerged from advances in pathology during the late 19th and early 20th centuries. Pathologists such as John Williams, Thomas Cullen, and Walter Schiller described pre-invasive lesions of the cervix and proposed the existence of a protracted intra-epithelial phase preceding carcinoma. These insights were the groundwork for George N. Papanicolaou’s revolutionary work. In 1928, Papanicolaou presented “New Cancer Diagnosis” at the Third Race Betterment Conference in Battle Creek, Michigan, after which the significance of his findings remained in obscurity for more than a decade.9 His description of malignant cells identifiable in vaginal fluid introduced a simple, reproducible method for detecting pre-cancerous change. Finally, with a 1941 co-publication with Herbert Traut, the “Pap smear” gained recognition, catalyzing organized screening programs across North America and Europe.3,9
By the 1950s–1960s, the Pap test had become a mainstay of women’s health services. Population-level programs led to dramatic reductions in cervical-cancer mortality, with North American data documenting declines of 70–80% or more.10 The success of cytology established the principle that early cellular changes, rather than symptomatic disease, could guide effective prevention.
Modernization of cytologic methods
Technologic refinement began in the 1980s–1990s with the transition from conventional smears to liquid-based cytology such as ThinPrep® and SurePath®, which improved specimen adequacy, reduced obscuring artifacts, and facilitated automated and computer-assisted slide review.11 Concurrently, the introduction of the Bethesda System in 1988, most recently updated in its third edition in 2014, standardized cervical cytology reporting and established widely adopted diagnostic categories, including negative for intraepithelial lesion or malignancy, atypical squamous cells of undetermined significance (ASC-US), low-grade squamous intraepithelial lesion (LSIL), and high-grade squamous intraepithelial lesion (HSIL).12
The identification of HPV as the causal agent for cervical dysplasia by Harald zur Hausen in 1983 revolutionized screening philosophy.9 HPV DNA testing, first as a reflex test, then a co-test, and now as a primary screening modality, became central to contemporary algorithms. The American Cancer Society and American College of Obstetricians and Gynecologists recommend primary HPV testing every five years for women aged 25–65 years, with cytology or co-testing as acceptable alternatives.13–15
Recent work by the International Papillomavirus Society refines this understanding, emphasizing that disappearance of detectable HPV does not necessarily represent true viral clearance but rather immune control, with potential for redetection during immune suppression.16 This nuanced model of latency informs revised screening intervals and patient counseling worldwide. A visual representation of the evolution of cervical cancer screening can be seen in Figure 1.
AI in cytology
Automation of cytologic evaluation has evolved from rudimentary image-scanning devices to sophisticated neural-network systems. Early commercial tools such as PAPNET, FocalPoint GS, and the ThinPrep Imaging System were slide-based systems that maintained or improved sensitivity, with no adverse impact on specificity, compared with manual primary screening.6 These systems analyze scanned images of slides and use machine-learning algorithms to identify areas of interest for the cytotechnologist or cytopathologist, who then makes the final diagnosis.17 Variations and improvements in these systems have primarily targeted workflow streamlining as a means of improving diagnostic efficiency.4
The latest generation of AI-enhanced whole-slide imaging integrates volumetric scanning and deep learning neural networks to identify abnormal cells across digitized specimens. Platforms such as Hologic Genius Digital Diagnostics, Techcyte SureView, BestCyte, and CytoSiA Pro demonstrate superior performance compared with manual microscopy.6 Multi-center studies have revealed that AI-assisted review can improve HSIL lesion detection while maintaining specificity comparable to expert cytotechnologists.5 Recent work documents significant time savings, increased concordance, and reduced interobserver variability.4,18
In the past few years, many groups have attempted to use deep learning and AI algorithms to make independent diagnoses across all fields of cytopathology. In gynecological cytology alone, investigators have used these algorithms to classify normal versus abnormal cells and nuclei across varying classification systems with great success.19 A striking example is a study that used a hybrid deep feature fusion algorithm to correctly classify slide images from two large public datasets across 2-, 3-, and 5-class classification schemes.20 By validating and testing the model across multiple classification schemes, the authors demonstrate the algorithm’s robustness. Such multi-tiered validation suggests that these hybrid frameworks, with continued training and rigorous validation, will be instrumental in the future of diagnostic pathology.
Several studies have examined binary classification of cells. In these studies, algorithms classify cells as “benign” vs. “malignant” or “normal” vs. “abnormal” and are then assessed for accuracy, sensitivity, and specificity. Often, a receiver operating characteristic (ROC) curve is used to generate an area under the curve (AUC) that summarizes these outcomes.21–24 One study in particular used a new algorithm to classify images from three different datasets: single-cell images, multiple-cell images, and whole-slide images from Pap tests from a local pathology lab.25 In this study, the algorithm distinguished normal from abnormal cells with accuracies of 98.88%, 97.64%, and 96.80% for the three datasets, respectively. The use of these varied data sources likely contributed to these robust results; the algorithm was trained in a stepwise manner to recognize abnormalities, a process mirroring the pedagogical progression of human pathology trainees.
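The relationship between these metrics can be illustrated with a minimal sketch. The labels and scores below are invented toy values, not data from any cited study, and the function names are hypothetical. The AUC is computed through its rank interpretation (the probability that a randomly chosen abnormal case receives a higher score than a randomly chosen normal case), which is equivalent to the area under the ROC curve.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = abnormal, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def binary_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a single score threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # abnormal cases correctly flagged
        "specificity": tn / (tn + fp),  # normal cases correctly cleared
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (abnormal, normal) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values, for illustration only).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.4, 0.6, 0.35, 0.7, 0.8, 0.9]
metrics = binary_metrics(labels, scores)
auc = roc_auc(labels, scores)
print(metrics, auc)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the AUC summarizes ranking performance across all thresholds, which is one reason studies often report both.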
Because classifying cells as “normal” or “abnormal” is already the goal of current commercially available screening systems, the majority of studies reviewed sought to employ AI or deep learning algorithms to make more nuanced and multi-faceted diagnoses.20,23,26–35 One study sought to classify images into seven diagnostic categories: superficial squamous epithelium (normal), intermediate squamous epithelium (normal), columnar epithelial (normal), mild squamous non-keratinizing dysplasia (abnormal), moderate squamous non-keratinizing dysplasia (abnormal), severe squamous non-keratinizing dysplasia (abnormal), and squamous cell carcinoma in situ intermediate (abnormal).36 Using features derived from the whole cell, the algorithm correctly classified 199 of 200 squamous cell carcinoma cases; the single misclassified case was assigned to severe dysplasia.
A separate study explored the utility of a deep learning algorithm designed not for classification but for semantic segmentation, that is, the detection and enhancement of nuclear features within the overlapping cell fields of a Pap smear. When augmented with additional filters, this framework achieved 93% accuracy in segmenting cervical nuclei.37 While this task may appear less direct than diagnostic classification, successful semantic segmentation represents a critical evolution in the field. It moves beyond binary “normal” versus “abnormal” labels, enabling algorithms to recognize the specific morphological nuances essential for automated, high-fidelity diagnostic assessments.
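Segmentation performance can be scored in more than one way, and the source does not specify how the reported accuracy was computed. The sketch below is therefore only an illustration of two common choices, pixel accuracy and the Dice coefficient, applied to an invented 4 × 4 mask; the function names and the masks are assumptions, not material from the cited study.

```python
def flatten(mask):
    """Flatten a 2-D mask (list of rows) into a single list of 0/1 pixel labels."""
    return [value for row in mask for value in row]


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the reference mask."""
    p, t = flatten(pred), flatten(truth)
    return sum(1 for a, b in zip(p, t) if a == b) / len(t)


def dice_coefficient(pred, truth):
    """Overlap between predicted and reference nuclei: 2|A ∩ B| / (|A| + |B|)."""
    p, t = flatten(pred), flatten(truth)
    intersection = sum(1 for a, b in zip(p, t) if a == 1 and b == 1)
    total = sum(p) + sum(t)
    return 1.0 if total == 0 else 2 * intersection / total


# Toy masks (assumed, for illustration): 1 = nucleus pixel, 0 = background.
truth_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],  # one nucleus pixel missed
    [0, 0, 0, 0],
]
acc = pixel_accuracy(pred_mask, truth_mask)
dice = dice_coefficient(pred_mask, truth_mask)
print(acc, dice)
```

Because background pixels dominate most cytology fields, pixel accuracy tends to read higher than overlap-based measures for the same prediction, so the metric definition matters when comparing segmentation results across studies.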
Beyond commercial platforms, multiple deep-learning models trained on whole-slide liquid-based cytology images have demonstrated strong diagnostic performance, with reported AUCs of approximately 0.90–0.96 and high sensitivity and specificity for neoplastic cell detection across large validation cohorts.24 These systems generate probability heatmaps that localize abnormal cells within digitized slides, supporting both diagnostic classification and interpretability. At the population level, a recent systematic review and meta-analysis encompassing more than 280,000 cervical cytology tests reported pooled sensitivities and specificities exceeding 90% across diverse geographic and laboratory settings, including both conventional Pap smears and ThinPrep cytology preparations.38 This meta-analysis illustrates the potential value of these algorithms at the population level, for individual clinicians and epidemiologists alike.
These systems also carry transformative implications for health equity. Cloud-based AI platforms allow digitized slide review across geographic boundaries, addressing workforce shortages and enabling remote screening in low-resource regions.6 While challenges such as data bias, regulatory validation, and implementation cost remain, the trajectory suggests that AI-assisted cytology is poised to become the next inflection point in cervical cancer prevention.39
Challenges and future directions in AI-assisted cytology
Despite rapid advances in deep learning and digital pathology, several challenges remain before AI-assisted cytology can be fully integrated into routine clinical practice. A central concern is data bias and representativeness, as many AI models are trained on slides derived from specific geographic regions, staining protocols, scanners, or preparation types. Such limitations may reduce generalizability when algorithms are applied across diverse populations and laboratory environments, underscoring the need for larger, more heterogeneous training datasets and robust external validation.6,24,38
Two additional technical forms of bias seen in all emerging algorithms are “class imbalance” and “overfitting.” Class imbalance occurs when an algorithm is trained on a dataset that is skewed toward one class, for example toward “abnormal” classifications. Such skew biases the algorithm toward classifying new data as abnormal, which limits its generalizability for widespread use.40 Overfitting occurs when an algorithm is trained too extensively on a given dataset or on only one dataset. The algorithm performs well when validated on images from that dataset, but its performance may decline when it is applied to an external dataset on which it was not trained.41 These forms of bias and limitations are already being considered in allied fields of medical practice where deep learning algorithms are beginning to be widely used.42
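One widely used mitigation for class imbalance is to weight each class inversely to its frequency during training, so that errors on the rarer class count for more. The sketch below is a minimal illustration of that idea and is not drawn from the cited studies; the function name and the 80/20 split are assumptions chosen for the example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy training set (assumed): skewed 80/20 toward "abnormal" images.
training_labels = ["abnormal"] * 80 + ["normal"] * 20
weights = balanced_class_weights(training_labels)
print(weights)  # the under-represented "normal" class receives the larger weight
```

Reweighting addresses imbalance within the training set only; it does not substitute for external validation, which remains the main safeguard against overfitting.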
Regulatory validation represents an additional barrier to widespread adoption. Translation from experimental performance to clinical deployment requires rigorous multi-site evaluation, demonstration of diagnostic safety and reliability in real-world settings, and ongoing post-deployment performance monitoring, particularly as algorithms evolve over time.6,18 Expert task-force reviews further emphasize that the current evidence base for AI use in routine cytology practice remains limited, highlighting gaps between proof-of-concept studies and real-world clinical implementation.43 Applying lessons learned from allied fields of medical practice (e.g., radiology), where the use of deep learning algorithms is already more widespread, has helped to shape the legal and ethical recommendations for implementation of these algorithms in pathology.44
Cost and infrastructure requirements also pose challenges. High-resolution whole-slide scanners, data storage capacity, network bandwidth, and system maintenance represent substantial upfront investments for laboratories, although these costs are expected to decrease as digital pathology platforms become more widely adopted and standardized.6 In parallel, workflow integration remains a critical consideration. Successful implementation depends on seamless interfacing with laboratory information systems, quality assurance frameworks, and cytology training programs, ensuring that AI functions as a decision-support tool that complements, rather than replaces, expert human interpretation.4,6,45
These tools require continuous oversight and iterative performance evaluation. Just as a pathologist must work constantly to stay up to date on advances and changes in the field, so too must these algorithms. Because these algorithms are neural networks modeled on an incomplete understanding of the neural networks in a human brain, human oversight will be required for the foreseeable future. Humans will need to train and retrain the algorithms continually with new and more diverse datasets from larger populations. Internal quality assurance protocols will need to be developed to ensure the continued accuracy of algorithm-derived diagnoses and to confirm that the algorithm has not become biased. Ultimately, until a consensus is reached among medical, legal, and legislative bodies, the pathologist remains the final authority and holds primary accountability for all diagnoses, whether assisted or rendered by an algorithm.
Looking forward, addressing these challenges will require coordinated efforts across technical development, regulatory science, and implementation research. While current evidence supports the diagnostic potential of AI-assisted cytology, prospective studies, real-world performance data, and standardized practice guidelines will be essential to enable safe, equitable, and sustainable integration into cervical cancer screening programs worldwide.24,38,43,45
The evolution of colposcopy: From optical innovation to computer vision
Historical foundations
Hans Hinselmann, a German gynecologist, developed the first optical colposcope in 1924 and published his findings the following year, coining the term colposcopy.46,47 He aimed to provide magnified visualization of cervical epithelium to identify precancerous lesions invisible to the naked eye. Early adoption was limited by cost, technical complexity, and the need for specialized training. By the 1950s–1960s, colposcopy had become a critical adjunct to cytology, guiding biopsy and confirming histologic diagnosis.47
Standardization and technologic advances
Standardization in the 1970s and 1980s through scoring systems such as the Reid Colposcopic Index and Swede Score improved reproducibility.48 The subsequent transition to digital and video colposcopy allowed for image storage, teleconsultation, and educational standardization.46 With the advent of organized screening, colposcopy became the central bridge between cytologic abnormality and treatment.13,15
AI-enhanced colposcopic evaluation
Deep-learning algorithms now provide automated image interpretation that can supplement—or, in low-resource areas, replace—expert review. Automated visual evaluation (AVE), a convolutional-neural-network classifier trained on tens of thousands of cervical images, distinguishes normal, indeterminate, and precancer/cancer categories.7,49 In Zambia, a combined HPV genotyping + AVE strategy demonstrated an AUC of 0.91, with sensitivity and specificity of 85% and 86% for CIN2+ detection, respectively.7 Diagnostic performance remained high among women living with HIV.7 Smartphone-based systems such as the EVA System (MobileODT) extend this approach to portable, low-cost screening.7,49
Algorithmic performance in AI-assisted colposcopy is strongly influenced by image quality, including illumination, focus, magnification, and visualization of the transformation zone. Variability in camera hardware, lighting conditions, cervical preparation, and operator technique can affect diagnostic accuracy, particularly when models trained on standardized datasets are deployed across heterogeneous clinical environments. These challenges are amplified in smartphone-based systems, underscoring the need for standardized imaging protocols, quality-control mechanisms, and preprocessing strategies to ensure reliable interpretation across settings.7,49
Representation of diverse patient populations is another critical consideration. Cervical appearance varies with age, hormonal status, parity, prior treatment, and comorbid conditions such as HIV, and algorithms trained on limited demographic or geographic cohorts may demonstrate reduced generalizability. Large-scale validation studies emphasize the importance of training and testing AI systems across diverse populations, including women living with HIV, in whom inflammatory and vascular changes may alter colposcopic appearance.6,49
Regulatory pathways for AI-based colposcopy are actively evolving, with increasing emphasis on safety, reproducibility, and continuous performance monitoring. Unlike static diagnostic devices, AI systems require ongoing evaluation to detect performance drift as imaging hardware, clinical practices, and patient populations change. At present, AI-assisted colposcopic tools such as AVE have demonstrated strong diagnostic performance in research and programmatic settings but have not yet received approval from the U.S. Food and Drug Administration for routine clinical use, and they are primarily deployed within pilot programs, implementation studies, or global health initiatives rather than as widely marketed commercial diagnostic devices.6,7,49
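The idea of monitoring for performance drift can be sketched as a rolling check of sensitivity against a validated baseline. The example below is a hypothetical illustration only: the class name, the baseline, the tolerance, and the window size are placeholder assumptions, not values from any regulatory guidance or cited study, and a real program would define them in its own quality-assurance protocol.

```python
from collections import deque
from typing import Optional


class SensitivityMonitor:
    """Tracks AI sensitivity over the most recent histologically confirmed positives."""

    def __init__(self, baseline: float, tolerance: float, window: int):
        self.baseline = baseline    # sensitivity observed at validation (assumed)
        self.tolerance = tolerance  # acceptable drop before raising a flag (assumed)
        self.results = deque(maxlen=window)  # 1 = AI flagged the case, 0 = AI missed it

    def record(self, ai_flagged: bool) -> None:
        """Add the AI result for one confirmed-positive case."""
        self.results.append(1 if ai_flagged else 0)

    def current(self) -> Optional[float]:
        """Rolling sensitivity, or None until the window is full."""
        if len(self.results) < self.results.maxlen:
            return None
        return sum(self.results) / len(self.results)

    def drifted(self) -> bool:
        """True when rolling sensitivity falls below baseline minus tolerance."""
        value = self.current()
        return value is not None and value < self.baseline - self.tolerance


# Toy run with placeholder settings.
monitor = SensitivityMonitor(baseline=0.85, tolerance=0.05, window=10)
for flagged in [True] * 9 + [False]:
    monitor.record(flagged)
before = monitor.drifted()   # rolling sensitivity is 0.9

for flagged in [False, False, False]:
    monitor.record(flagged)  # e.g., after a change in imaging hardware
after = monitor.drifted()    # rolling sensitivity is now 0.6
print(before, after)
```

A check of this kind only signals that performance has changed; identifying the cause (hardware, technique, or population shift) still requires human review.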
Despite these challenges, AI-assisted colposcopy represents a significant advance toward more reproducible and equitable diagnostic access. Continued progress will depend on inclusive training datasets, standardized imaging practices, prospective clinical validation, and clearly defined regulatory frameworks to support safe and effective deployment across diverse health-care settings.
Evolution of excisional treatment: LEEP
Historical context
Early treatment of precancerous lesions relied on cold-knife conization and hysterectomy, both associated with high morbidity.50 By the 1970s, ablative methods such as electrocautery, cryotherapy, and laser ablation provided outpatient alternatives but raised concerns about under-treatment of occult invasion.
LEEP, introduced in the 1980s, offered precise excision with hemostasis and tissue retrieval for histologic assessment, quickly becoming standard care for CIN2–3.13,14
Clinical role and comparative outcomes
Both ablative and excisional modalities are effective, yet excision confers greater diagnostic certainty at the cost of higher obstetric risk. One meta-analysis found that the relative risk of preterm birth after LEEP is approximately 1.7, compared with 2.6 after cold-knife conization; ablative methods carry no increased risk.50 Most lesions extend less than 5 mm in depth, supporting limited-depth excisions of 6–7 mm to minimize complications.50
Current guidelines emphasize individualized, fertility-preserving management and observation in select young women with CIN2.14,15
Digital and AI-guided therapy: Future directions
While LEEP remains a manual procedure, advances in digital pathology are beginning to outline potential future roles for AI-supported treatment planning. One example is a deep-learning framework that performs automated segmentation of HSIL and squamous cell carcinoma on whole-slide Pap smear images and is explicitly proposed as a “diagnosis and treatment planning” tool, demonstrating that AI can localize and quantify high-grade disease on digitized cytology specimens.51
Concurrently, modern AI-assisted cytology platforms, including Genius Digital Diagnostics, BestCyte, Techcyte SureView, and CytoSiA Pro, show improved HSIL detection, increased diagnostic concordance, and greater efficiency.4,6,18 AI-supported colposcopic tools such as AVE also enhance triage, achieving AUC values around 0.9 and strong performance across diverse populations.7,49
Together, these systems create a digitally enriched diagnostic pathway in which LEEP decisions increasingly draw on objective, machine-generated risk estimates. Although AI-directed LEEP or real-time excision guidance is not yet available, current evidence indicates that AI-derived lesion quantification and HPV/AI-based risk scores could eventually inform excision depth, treatment selection, and follow-up.49,51 These applications remain future directions requiring prospective validation.