We develop a method for assigning high-quality labels to unstructured text. This method is based on fine-tuning an efficient, open-source language model with data extracted from a large, proprietary language model. We apply this method to construct a census of published clinical trials. With these data, we revisit a literature that contends that pharmaceutical sector productivity is declining. Central to this conclusion are measurements of substantial increases in the quantity of clinical trials over time that are unmatched by trends in measures of output. In our data, the quantity, quality, and composition of clinical trials have been stable since 2010. We show that previous measurements are an artifact of biases introduced by shifts in the composition of other forms of research.