Advances in next-generation sequencing technologies have enabled single genomes as well as complex environmental samples (metagenomes) to be comprehensively sequenced on a routine basis. Bioinformatics analysis of the resulting sequencing data reveals a continually expanding catalogue of predicted proteins (14 million as of April 2011), 75 percent of which are associated with functional annotation (COG, Pfam, Enzyme, KEGG, etc.). These predicted proteins cover the full spectrum of known pathways and functional activities, including many novel biocatalysts that are expected to contribute significantly to the development of clean technologies such as biomass degradation, lipid transformation for biodiesel generation, intermediates for polymer production, carbon capture, and bioremediation.