Robustly interrogating machine learning-based scoring functions: what are they learning?


Authors: Kristian Birchall, Fergus Boyles, Charlotte M Deane, Guy Durant, Brian Marsden

Language: eng

Classification number: 369.181 Spanish-American War, 1898

Publication info: England : Bioinformatics (Oxford, England), 2025


Collection: NCBI

ID: 61745


MOTIVATION: Machine learning-based scoring functions (MLBSFs) have been found to perform inconsistently across benchmarks and to be prone to learning dataset bias. For the field to develop MLBSFs that learn a generalizable understanding of physics, a more rigorous understanding of how they perform is required.

RESULTS: In this work, we compared the performance of a diverse set of popular MLBSFs (RFScore, SIGN, OnionNet-2, Pafnucy, and PointVS) with that of our proposed baseline models, which can only learn dataset biases, on a range of benchmarks. We found that these baseline models were competitive in accuracy with the MLBSFs in almost all proposed benchmarks, indicating that the MLBSFs only learn dataset biases. Our tests and provided platform, ToolBoxSF, will enable researchers to robustly interrogate MLBSF performance and determine the effect of dataset biases on their predictions.

AVAILABILITY AND IMPLEMENTATION: https://github.com/guydurant/toolboxsf.
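The baseline comparison described in the abstract can be illustrated with a minimal sketch. It assumes one possible form of bias-only baseline: a model given ligand descriptors only, with all protein-ligand interaction information withheld. The abstract does not specify how the authors' baselines are built, so this is an assumption, and the code does not use the ToolBoxSF API. The ridge regressor, the function names, and the synthetic data are all hypothetical placeholders chosen to keep the example self-contained.

```python
import numpy as np


def pearson_r(y_true, y_pred):
    """Pearson correlation between measured and predicted affinities."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalised intercept."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    reg = alpha * np.eye(Xb.shape[1])
    reg[-1, -1] = 0.0  # leave the intercept term unregularised
    return np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ y)


def predict_ridge(weights, X):
    """Apply fitted ridge weights (last entry is the intercept)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ weights


def compare_to_bias_baseline(ligand_feats, interaction_feats, y, n_train):
    """Fit a full model and a ligand-only baseline on the first n_train rows,
    then return the held-out Pearson r of each."""
    full_feats = np.hstack([ligand_feats, interaction_feats])
    results = {}
    for name, X in (("full", full_feats), ("ligand_only", ligand_feats)):
        weights = fit_ridge(X[:n_train], y[:n_train])
        results[name] = pearson_r(y[n_train:], predict_ridge(weights, X[n_train:]))
    return results


# Synthetic placeholder data (NOT real binding data): the label depends on
# ligand descriptors alone, mimicking a dataset whose signal is pure ligand bias.
rng = np.random.default_rng(0)
n_samples = 400
ligand_feats = rng.normal(size=(n_samples, 5))
interaction_feats = rng.normal(size=(n_samples, 5))
y = ligand_feats @ rng.normal(size=5) + 0.1 * rng.normal(size=n_samples)

scores = compare_to_bias_baseline(ligand_feats, interaction_feats, y, n_train=300)
print(scores)
```

If the ligand-only baseline scores about as well as the full model on a benchmark, as it does by construction on this synthetic data, the benchmark cannot show that the full model has learned anything about protein-ligand interactions. That is the style of reasoning the abstract applies to the published MLBSFs.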
