BACKGROUND: Preterm birth (PTB) is the leading cause of infant mortality. Risk for PTB is influenced by multiple biological pathways, many of which are poorly understood. Some PTBs are medically indicated, following complications from hypertension and/or diabetes, while many others are spontaneous with unknown causes. Investigation of potential risk factors has previously been limited by a lack of data on maternal medical history and by the difficulty of classifying PTBs as indicated or spontaneous. Here, we leverage electronic health record (EHR) data (patient health information including demographics, diagnoses, and medications) and a supplemental curated pregnancy database to overcome these limitations. Novel associations may provide new insight into the pathophysiology of PTB and help identify individuals at risk of PTB.

METHODS: We quantified associations between maternal diagnoses and preterm birth, both with and without controlling for maternal age and socioeconomic factors, within a University of California, San Francisco (UCSF) EHR cohort with 10,643 births (n

RESULTS: Thirty diagnoses were significantly and robustly (false discovery rate (FDR) < 0.05) associated with indicated PTBs compared to term births. We identified both known (hypertension, diabetes, and chronic kidney disease) and less established (blood, cardiac, gynecological, and liver diagnoses) associations. Essential hypertension had the most significant association with indicated PTB (adjusted p

CONCLUSIONS: Our study underscores the limitations of approaches that combine indicated and spontaneous births. When the two groups were combined, significant associations were driven almost entirely by indicated PTBs, even though the spontaneous and indicated groups were of similar size. Investigating the spontaneous population has the potential to reveal new pathways and improve understanding of the heterogeneity of PTB.
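
The FDR < 0.05 criterion used in the results can be illustrated with a small sketch. The abstract does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption, and the diagnosis names and p-values are hypothetical toy values rather than study data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the study's exact FDR procedure is not given in the abstract;
    BH is used here only as a common illustrative choice.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-diagnosis p-values (not taken from the study).
toy_p = {
    "diagnosis_A": 0.001,
    "diagnosis_B": 0.012,
    "diagnosis_C": 0.04,
    "diagnosis_D": 0.30,
}

q = dict(zip(toy_p, benjamini_hochberg(list(toy_p.values()))))

# Keep only diagnoses whose adjusted p-value passes the 0.05 threshold.
significant = [name for name, value in q.items() if value < 0.05]
print(significant)
```

In this toy input, `diagnosis_C` has a raw p-value below 0.05 but an adjusted value above it, which is the kind of association that multiple-testing correction filters out when many diagnoses are tested at once.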