The expanding portfolio of targeted therapies for ulcerative colitis (UC) suggests that a more precise approach to defining disease activity will aid clinical decision-making. This prospective study used genome-wide microarrays to characterize gene expression in biopsies from the most inflamed colon segments of patients with UC and analyzed associations between molecular changes and short-term outcomes while on standard-of-care treatment. We analyzed 141 biopsies: 128 from 112 UC patients and 13 from eight patients with inflammatory bowel disease unclassified (IBDU). Endoscopic disease was associated with expression of innate immunity transcripts, e.g. complement factor B (CFB), inflammasome genes (ZBP1 and PIM2), and calprotectin (S100A8 and S100A9), and with inflammation-, injury-, and innate immunity-associated pathway analysis terms. A cross-validated molecular machine learning classifier trained on the endoscopic Mayo subscore predicted that subscore with an area under the curve of 0.85. A molecular calprotectin transcript score showed strong associations with fecal calprotectin and the endoscopic Mayo subscore. Logistic regression models showed that molecular features (e.g. molecular classifier and molecular calprotectin scores) improved the prediction of disease progression over conventional clinical features alone (e.g. total Mayo score, fecal calprotectin, physician global assessment). The molecular features of UC showed strong correlations with disease activity and permitted development of machine-learning predictive disease classifiers that can be applied to expanded testing in diverse cohorts.
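
For readers less familiar with the area-under-the-curve metric reported for the molecular classifier, the following is a minimal sketch of how such a value can be computed from held-out classifier scores. It is not the study's implementation. The function name `roc_auc`, the binary labels, and the example scores are all hypothetical, and the sketch assumes the endoscopic Mayo subscore has been dichotomized into two classes, which the text above does not specify.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data (NOT from the study): 1 = endoscopically active,
# 0 = inactive; scores stand in for cross-validated classifier outputs.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy example, 14 of the 16 positive-negative pairs are ranked correctly, giving an AUC of 0.875. The same pairwise reading applies to the reported value of 0.85: under the assumptions above, it would mean that roughly 85% of such pairs are ordered correctly by the classifier score.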