Simulating 500 million years of evolution with a language model.

Authors: Halil Akin, Rohil Badkundri, Liam J Bartie, Salvatore Candido, Jonathan Deaton, Alexander Derry, Jun Gong, Thomas Hayes, Patrick D Hsu, Yousuf A Khan, Carolyn Kim, Zeming Lin, Chetan Mishra, Raul S Molina, Matthew Nemeth, Deniz Oktay, Roshan Rao, Alexander Rives, Tom Sercu, Irhum Shafkat, Nicholas J Sofroniew, Neil Thomas, Vincent Q Tran, Robert Verkuil, Marius Wiggert

Language: eng

Classification number: 322.5 Armed services

Publication information: United States : Science (New York, N.Y.), 2025

Physical description:

Collection: NCBI

ID: 217304

More than 3 billion years of evolution have produced an image of biology encoded into the space of natural proteins. Here, we show that language models trained at scale on evolutionary data can generate functional proteins that are far away from known proteins. We present ESM3, a frontier multimodal generative language model that reasons over the sequence, structure, and function of proteins. ESM3 can follow complex prompts combining its modalities and is highly responsive to alignment to improve its fidelity. We have prompted ESM3 to generate fluorescent proteins. Among the generations that we synthesized, we found a bright fluorescent protein at a far distance (58% sequence identity) from known fluorescent proteins, which we estimate is equivalent to simulating 500 million years of evolution.