MUBen: Benchmarking the Uncertainty of Pre-Trained Models for Molecular
  Property Prediction

Du, Yuanqi; Kong, Lingkai; Li, Yinghao; Mu, Wenhao; Yu, Yue; Zhang, Chao; Zhuang, Yuchen

Publication date: 14 June 2023

Abstract

Large Transformer models pre-trained on massive unlabeled molecular data have shown great success in predicting molecular properties. However, these models can be prone to overfitting during fine-tuning, resulting in over-confident predictions on test data that fall outside the training distribution. To address this issue, uncertainty quantification (UQ) methods can be used to improve the calibration of the models' predictions. Although many UQ approaches exist, not all of them lead to improved performance. While some studies have used UQ to improve molecular pre-trained models, the process of selecting suitable backbone and UQ methods for reliable molecular uncertainty estimation remains underexplored. To address this gap, we present MUBen, which evaluates different combinations of backbone and UQ models to quantify their performance for both property prediction and uncertainty estimation. By fine-tuning various backbone molecular representation models, using different molecular descriptors as inputs, with UQ methods from different categories, we critically assess the influence of architectural decisions and training strategies. Our study offers insights for selecting UQ and backbone models, which can facilitate research on uncertainty-critical applications in fields such as materials science and drug discovery.
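
To make "calibration" concrete, the sketch below computes the expected calibration error (ECE) for a binary classifier. The abstract does not name a specific metric, so ECE is an assumption here, chosen as one common way to measure over-confidence. The function name, the equal-width binning, and the default of 10 bins are illustrative choices, not taken from MUBen.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: sample-weighted mean gap between confidence and accuracy.

    probs  -- predicted probabilities of the positive class, each in [0, 1]
    labels -- ground-truth labels, each 0 or 1
    n_bins -- number of equal-width confidence bins (illustrative default)
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        return 0.0

    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        pred = 1 if p >= 0.5 else 0
        # Confidence is the probability assigned to the predicted class.
        conf = p if pred == 1 else 1.0 - p
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(a for _, a in members) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


# A model that says 0.9 but is right only half the time is over-confident.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 0, 0])
well_calibrated = expected_calibration_error([1.0, 0.0], [1, 0])
```

A lower value means the predicted confidence tracks the observed accuracy more closely. In this toy input the over-confident predictions score worse than the perfectly confident and correct ones.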

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2306.10060

Last updated on 22/06/2023