Backdiff: a diffusion model for generalized transferable protein backmapping
Coarse-grained (CG) models play a crucial role in the study of protein
structures, protein thermodynamic properties, and protein conformational
dynamics. Due to the information loss in the coarse-graining process,
backmapping from CG to all-atom configurations is essential in many protein
design and drug discovery applications when detailed atomic representations are
needed for in-depth studies. Despite recent progress in data-driven backmapping
approaches, devising a backmapping method that can be universally applied
across various CG models and proteins remains an open problem. In this work, we
propose BackDiff, a new generative model designed to achieve generalization and
reliability in the protein backmapping problem. BackDiff leverages the
conditional score-based diffusion model with geometric representations. Since
different CG models can contain different coarse-grained sites which include
selected atoms (CG atoms) and simple CG auxiliary functions of atomistic
coordinates (CG auxiliary variables), we design a self-supervised training
framework to adapt to different CG atoms, and constrain the diffusion sampling
paths with arbitrary CG auxiliary variables as conditions. Our method
facilitates end-to-end training and allows efficient sampling across different
proteins and diverse CG models without the need for retraining. Comprehensive
experiments over multiple popular CG models demonstrate that BackDiff outperforms
existing state-of-the-art approaches and offers generalization and
flexibility that these approaches cannot achieve. A pretrained BackDiff model
can offer a convenient yet reliable plug-and-play solution for protein
researchers, enabling further investigation starting from their own CG models.
Comment: 22 pages, 5 figures