BACKGROUND: Pathogen whole genome sequencing (WGS) has significant potential for improving healthcare-associated infection (HAI) outcomes. However, methods for integrating WGS with epidemiologic data to quantify risks for pathogen spread remain underdeveloped.

METHODS: To identify analytic strategies for conducting WGS-based HAI surveillance in high-burden settings, we modeled patient- and facility-level transmission risks of carbapenem-resistant

RESULTS: Genomic relatedness between pairs of isolates was associated with room sharing in two of the three models and with overlapping stays on a high-acuity unit in all models, echoing previous findings from long-term acute care hospital (LTACH) settings. In our sensitivity analysis, qualitative findings were robust to the exclusion of cases that would not have been identified with a passive surveillance strategy; however, uncertainty in all estimates also increased markedly.

CONCLUSIONS: Taken together, our results demonstrate that pairwise regression models combining relevant genomic and epidemiologic data are useful tools for identifying HAI transmission risks.

KEY MESSAGES: Whole genome sequencing of healthcare-associated infections (HAIs) is becoming more common, and new methods are needed to integrate these data with epidemiologic risk factors to quantify transmission drivers. We demonstrate how pairwise regression models, in which the outcome represents genomic similarity between a pair of isolates, can identify known transmission risk factors of carbapenem-resistant