We are now witnessing significant progress of deep learning methods on a
variety of protein tasks (or datasets). However, there is no standard
benchmark for evaluating the performance of different methods, which hinders
the progress of deep learning in this field. In this paper, we propose
such a benchmark called PEER, a comprehensive and multi-task benchmark for
Protein sEquence undERstanding. PEER provides a set of diverse protein
understanding tasks including protein function prediction, protein localization
prediction, protein structure prediction, protein-protein interaction
prediction, and protein-ligand interaction prediction. For each task, we
evaluate different types of sequence-based methods, including traditional
feature engineering approaches, various sequence encoding methods, and
large-scale pre-trained protein language models. In addition, we
investigate the performance of these methods under the multi-task learning
setting. Experimental results show that large-scale pre-trained protein
language models achieve the best performance for most individual tasks, and
jointly training multiple tasks further boosts the performance. The datasets
and source code of this benchmark are all available at
https://github.com/DeepGraphLearning/PEER_Benchmark

Comments: Accepted by the NeurIPS 2022 Datasets and Benchmarks Track. arXiv v2:
source code released; arXiv v1: all benchmark results released.