PURPOSE: Prospective cohort development in low-resource settings may be limited by the population coverage of cancer registries; however, information routinely collected in health systems may offer opportunities to advance cancer research. We aim to illustrate, in a cohort study in Mexico, a cancer ascertainment strategy that integrates multiple sources of information, including healthcare utilization databases. METHODS: The Mexican Teachers' Cohort (MTC) includes 114,545 female teachers aged 25 years and older who completed a baseline questionnaire between 2006 and 2010 and were free of breast cancer. We used healthcare utilization databases (including electronic health records), self-reported breast cancer, mortality registries, and cancer registries to identify women with incident breast cancer. We estimated the positive predictive value of self-reported breast cancer, as well as age-specific and age-standardized breast cancer incidence rates with corresponding 95% confidence intervals (95%CI), calculating person-time from the date of baseline questionnaire response to diagnosis, death, or December 31, 2019. RESULTS: Between baseline and 2019, we identified 1,313 women with incident breast cancer. We established the diagnosis through healthcare utilization databases in 88% of these women, through cancer and mortality registries in 6%, and by directly contacting participants in 6%. The positive predictive value of self-reported diagnosed and treated breast cancer was 94% (95%CI 91, 97). The age-standardized incidence was 77.0 per 100,000 person-years (95%CI 75.9, 84.3). The highest incidence was observed in women aged 65-69 years (185.3 per 100,000 person-years). CONCLUSION: Leveraging healthcare utilization databases to establish cancer diagnoses within prospective cohorts may offer an opportunity to advance global cancer research.
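
The two kinds of estimate named in METHODS, an incidence rate per 100,000 person-years and a positive predictive value, each with a 95% confidence interval, can be sketched as below. This is an illustrative sketch only, not the study's analysis code: the function names are hypothetical, the interval methods (a log-scale Poisson approximation for the rate and a Wilson interval for the proportion) are assumptions because the abstract does not state which were used, and any counts passed in are placeholders rather than study data.

```python
import math


def incidence_rate_per_100k(cases, person_years, z=1.96):
    """Crude incidence rate per 100,000 person-years with an approximate CI.

    Assumes case counts follow a Poisson distribution and builds the
    interval on the log scale (an assumed method, not the study's).
    """
    rate = cases / person_years
    se_log = 1.0 / math.sqrt(cases)  # standard error of log(rate)
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return rate * 100_000, lower * 100_000, upper * 100_000


def ppv_wilson(confirmed, self_reported, z=1.96):
    """Positive predictive value with a Wilson score interval.

    confirmed: self-reports verified as true cases
    self_reported: all self-reports that were checked
    """
    n = self_reported
    p = confirmed / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return p, centre - half, centre + half


if __name__ == "__main__":
    # Placeholder inputs chosen for illustration; not taken from the cohort.
    print(incidence_rate_per_100k(cases=50, person_years=100_000))
    print(ppv_wilson(confirmed=94, self_reported=100))
```

Age-standardized rates would additionally weight each age-specific rate by a reference population, which is omitted here because the abstract does not name the standard used.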