Predicting mutational effects on protein-protein binding via a side-chain diffusion probabilistic model
Many crucial biological processes rely on networks of protein-protein
interactions. Predicting the effect of amino acid mutations on protein-protein
binding is vital in protein engineering and therapeutic discovery. However, the
scarcity of annotated experimental data on binding energy poses a significant
challenge for developing computational approaches, particularly deep
learning-based methods. In this work, we propose SidechainDiff, a
representation learning-based approach that leverages unlabelled experimental
protein structures. SidechainDiff utilizes a Riemannian diffusion model to
learn the generative process of side-chain conformations and can also give the
structural context representations of mutations on the protein-protein
interface. Leveraging the learned representations, we achieve state-of-the-art
performance in predicting the mutational effects on protein-protein binding.
Furthermore, SidechainDiff is the first diffusion-based generative model for
side-chains, distinguishing it from prior efforts that have predominantly
focused on generating protein backbone structures.
DiffPack: A Torsional Diffusion Model for Autoregressive Protein Side-Chain Packing
Proteins play a critical role in carrying out biological functions, and their
3D structures are essential in determining their functions. Accurately
predicting the conformation of protein side-chains given their backbones is
important for applications in protein structure prediction, design and
protein-protein interactions. Traditional methods are computationally intensive
and have limited accuracy, while existing machine learning methods treat the
problem as a regression task and overlook the restrictions imposed by the
constant covalent bond lengths and angles. In this work, we present DiffPack, a
torsional diffusion model that learns the joint distribution of side-chain
torsional angles, the only degrees of freedom in side-chain packing, by
diffusing and denoising on the torsional space. To avoid issues arising from
simultaneous perturbation of all four torsional angles, we propose
autoregressively generating the four torsional angles from χ1 to χ4
and training diffusion models for each torsional angle. We evaluate the method
on several benchmarks for protein side-chain packing and show that our method
achieves improvements of 11.9% and 13.5% in angle accuracy on CASP13 and
CASP14, respectively, with a significantly smaller model size (60x fewer
parameters). Additionally, we show the effectiveness of our method in enhancing
side-chain predictions in the AlphaFold2 model. Code will be made available
upon acceptance.
Comment: Under review
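As a rough illustration of the idea in the abstract above (noise is added on the torsional space and the angles are generated one after another from χ1 to χ4), the following minimal Python sketch may help. It is not DiffPack's implementation: the function names, the noise levels, and the `denoisers` callables are all hypothetical placeholders, and the abstract does not specify the noise schedule or the network used.

```python
import math
import random


def wrap_angle(theta):
    """Map an angle in radians back onto the circle, nominally [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def perturb_torsion(chi, sigma, rng):
    """Forward step: add Gaussian noise to one torsion angle, then wrap it."""
    return wrap_angle(chi + rng.gauss(0.0, sigma))


def sample_autoregressive(denoisers, sigmas, rng):
    """Generate torsion angles one at a time, conditioning on earlier ones.

    `denoisers[k]` is a hypothetical callable
    (noisy_angle, sigma, fixed_angles) -> angle that stands in for a
    trained per-angle model; `sigmas` is an assumed list of noise levels
    ordered from large to small.
    """
    fixed = []
    for denoise in denoisers:
        # Start each angle from uniform noise on the circle.
        angle = rng.uniform(-math.pi, math.pi)
        for sigma in sigmas:
            # The stand-in model sees the angles that are already fixed.
            angle = wrap_angle(denoise(angle, sigma, tuple(fixed)))
        fixed.append(angle)
    return fixed
```

Passing four stand-in callables to `sample_autoregressive` yields one value per angle, and each callable receives only the angles generated before it, which is the autoregressive ordering the abstract describes.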
Mass & secondary structure propensity of amino acids explain their mutability and evolutionary replacements
Why is an amino acid replacement in a protein accepted during evolution? The answer given by bioinformatics relies on the frequency of change of each amino acid by another one and the propensity of each to remain unchanged. We propose that these replacement rules are recoverable from the secondary structural trends of amino acids. A distance measure between high-resolution Ramachandran distributions reveals that structurally similar residues coincide with those found in substitution matrices such as BLOSUM: Asn ↔ Asp, Phe ↔ Tyr, Lys ↔ Arg, Gln ↔ Glu, Ile ↔ Val, Met → Leu; with Ala, Cys, His, Gly, Ser, Pro, and Thr as structurally idiosyncratic residues. We also found a high average correlation ($\overline{R} = 0.85$) between thirty amino acid mutability scales and the mutational inertia ($I_X$), which measures the energetic cost weighted by the number of observations at the most probable amino acid conformation. These results indicate that amino acid substitutions follow two optimally efficient principles: (a) amino acid interchangeability privileges their secondary structural similarity, and (b) amino acid mutability depends directly on its biosynthetic energy cost, and inversely on its frequency. These two principles are the underlying rules governing the observed amino acid substitutions. © 2017 The Author(s)
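The abstract above does not say which distance measure is applied to the Ramachandran distributions. Purely as an illustration of comparing two residues' (φ, ψ) distributions, the sketch below bins the angles into a normalized grid and computes the Hellinger distance between the two grids. The choice of Hellinger distance, the 10-degree bins, and both function names are assumptions, not the authors' method.

```python
import math


def ramachandran_histogram(phi_psi_degrees, bins=36):
    """Bin (phi, psi) pairs given in degrees into a normalized bins x bins grid."""
    width = 360.0 / bins
    counts = [[0.0] * bins for _ in range(bins)]
    for phi, psi in phi_psi_degrees:
        # Shift to [0, 360) and clamp the index to guard against rounding.
        i = min(int(((phi + 180.0) % 360.0) // width), bins - 1)
        j = min(int(((psi + 180.0) % 360.0) // width), bins - 1)
        counts[i][j] += 1.0
    total = sum(sum(row) for row in counts)
    if total == 0.0:
        raise ValueError("at least one (phi, psi) pair is required")
    return [[c / total for c in row] for row in counts]


def hellinger_distance(p, q):
    """Hellinger distance between two normalized grids of the same shape.

    Returns 0 for identical distributions and 1 for non-overlapping ones.
    """
    overlap = sum(
        math.sqrt(a * b)
        for row_p, row_q in zip(p, q)
        for a, b in zip(row_p, row_q)
    )
    return math.sqrt(max(0.0, 1.0 - overlap))
```

Under this assumed measure, residue pairs with small distances would be the structurally similar ones, which is the kind of grouping the abstract compares against BLOSUM.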