Fuzz Testing Molecular Representation Using Deep Variational Anomaly Generation.

Authors: Rafael V C Guido, Michael J Keiser, Victor H R Nogueira, Rishabh Sharma

Language: eng

Classification number: 371.192 Parent-school relations

Publication information: United States: Journal of Chemical Information and Modeling, 2025

Physical description:

Collection: NCBI

ID: 642422

Researchers are developing increasingly robust molecular representations, motivating the need for thorough methods to stress-test and validate them. Here, we use a variational auto-encoder (VAE), an unsupervised deep learning model, to generate anomalous examples of SELF-referencIng Embedded Strings (SELFIES), a popular molecular string format. These anomalies defy the assertion that all SELFIES convert into valid SMILES strings. Interestingly, we find specific regions within the VAE's internal landscape (latent space), whose decoding frequently generates inconvertible SELFIES anomalies. The model's internal landscape self-organization helps with exploring factors affecting molecular representation reliability. We show how VAEs and similar anomaly generation methods can empirically stress-test molecular representation robustness. Additionally, we investigate reasons for the invalidity of some discovered SELFIES strings (version 2.1.1) and suggest changes to improve them, aiming to spark ongoing molecular representation improvement.
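
The stress-testing loop described in the abstract (generate candidate strings, attempt conversion, collect the failures) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the VAE is replaced by a uniform random token sampler, the token alphabet is a small invented subset, and `toy_convert` is a hypothetical stand-in for a real SELFIES-to-SMILES decoder whose failure rule is made up purely for demonstration.

```python
import random
from typing import Callable, Iterable, List, Tuple

# Hypothetical toy alphabet (assumption); real SELFIES alphabets are much larger.
ALPHABET = ["[C]", "[N]", "[O]", "[=C]", "[Ring1]", "[Branch1]"]


def sample_candidate(rng: random.Random, length: int) -> str:
    """Stand-in for decoding a latent-space sample: a random token string."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def toy_convert(candidate: str) -> str:
    """Hypothetical converter (assumption): rejects strings that end in a
    ring or branch token and otherwise returns a placeholder string.
    A real stress test would call the actual decoder here instead."""
    if candidate.endswith("[Ring1]") or candidate.endswith("[Branch1]"):
        raise ValueError("dangling structural token")
    # Placeholder output only; this is not real SMILES generation.
    return candidate.replace("[", "").replace("]", "")


def stress_test(
    convert: Callable[[str], str], candidates: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split candidates into those that convert and those that do not.

    A candidate counts as an anomaly if the converter raises an exception
    or returns an empty result.
    """
    converted: List[str] = []
    anomalies: List[str] = []
    for candidate in candidates:
        try:
            result = convert(candidate)
        except Exception:
            anomalies.append(candidate)
            continue
        if result:
            converted.append(candidate)
        else:
            anomalies.append(candidate)
    return converted, anomalies


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    candidates = [sample_candidate(rng, 5) for _ in range(1000)]
    converted, anomalies = stress_test(toy_convert, candidates)
    print(f"{len(anomalies)} of {len(candidates)} candidates failed conversion")
```

Passing the converter in as a callable keeps the harness independent of any particular representation or library version, so the same loop could be pointed at a different decoder, or fed candidates from a trained generative model instead of the random sampler used here.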