6 research outputs found

    Learning to Evolve Structural Ensembles of Unfolded and Disordered Proteins Using Experimental Solution Data

    We have developed a Generative Recurrent Neural Network (GRNN) that learns the probability of the next residue's torsions $X_{i+1} = [\phi_{i+1}, \psi_{i+1}, \omega_{i+1}, \chi_{i+1}]$ from the previous residue in the sequence, $X_i$, to generate new IDP conformations. In addition, we couple the GRNN with a Bayesian model, X-EISD, in a reinforcement learning step that biases the probability distributions of torsions to take advantage of experimental data types such as J-couplings, NOEs and PREs. We show that updating the generative model parameters according to reward feedback, based on the agreement between structures and data, improves upon existing approaches that simply reweight static structural pools for disordered proteins. Instead, the GRNN "DynamICE" model learns to physically change the conformations of the underlying pool to those that better agree with experiment.
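The idea in this abstract (sample each residue's torsions conditioned on the previous residue, then shift the generative parameters toward higher-reward structures) can be illustrated with a deliberately small sketch. It is not the DynamICE implementation. The assumptions are: a single torsion discretized into `N_BINS` bins, a tabular first-order transition table in place of the recurrent network, a made-up `toy_reward` standing in for an X-EISD-style agreement score, and a plain REINFORCE update chosen here as one possible form of the reinforcement learning step. All names are hypothetical.

```python
import numpy as np

N_BINS = 12  # assumption: one torsion angle discretized into 30-degree bins


def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def sample_chain(logits, length, rng):
    """Sample bin indices residue by residue; each depends on the previous one."""
    probs = softmax(logits)
    chain = [int(rng.integers(N_BINS))]  # uniform choice for the first residue
    for _ in range(length - 1):
        chain.append(int(rng.choice(N_BINS, p=probs[chain[-1]])))
    return chain


def toy_reward(chain, target_bin):
    """Hypothetical stand-in for experimental agreement: fraction of residues in target_bin."""
    return float(np.mean(np.array(chain) == target_bin))


def reinforce_step(logits, chains, rewards, lr):
    """One REINFORCE update of the transition logits with a mean-reward baseline."""
    probs = softmax(logits)
    grad = np.zeros_like(logits)
    baseline = np.mean(rewards)
    for chain, r in zip(chains, rewards):
        for prev, nxt in zip(chain[:-1], chain[1:]):
            g = -probs[prev]   # gradient of log p(nxt | prev) w.r.t. logits[prev]
            g[nxt] += 1.0
            grad[prev] += (r - baseline) * g
    return logits + lr * grad / len(chains)


def train(n_steps=100, batch=32, length=20, target_bin=3, lr=2.0, seed=0):
    """Alternate sampling and parameter updates; return final logits and mean rewards."""
    rng = np.random.default_rng(seed)
    logits = np.zeros((N_BINS, N_BINS))  # start from a uniform transition table
    history = []
    for _ in range(n_steps):
        chains = [sample_chain(logits, length, rng) for _ in range(batch)]
        rewards = [toy_reward(c, target_bin) for c in chains]
        history.append(float(np.mean(rewards)))
        logits = reinforce_step(logits, chains, rewards, lr)
    return logits, history
```

Calling `train()` returns the updated transition logits and the per-step mean reward. Unlike reweighting a fixed pool, the sampled chains themselves change over the course of training, because the parameters that generate them are updated.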

    Conformational Ensembles of an Intrinsically Disordered Protein Consistent with NMR, SAXS, and Single-Molecule FRET

    Intrinsically disordered proteins (IDPs) have fluctuating heterogeneous conformations, which makes their structural characterization challenging. Nevertheless, characterization of the conformational ensembles of IDPs is of great interest, since these ensembles are the link between their sequences and functions. An accurate description of IDP conformational ensembles depends crucially on the amount and quality of the experimental data, how it is integrated, and whether it supports a consistent structural picture. We used integrative modeling and validation to apply conformational restraints and assess agreement with the most common structural techniques for IDPs: Nuclear Magnetic Resonance (NMR) spectroscopy, Small-angle X-ray Scattering (SAXS), and single-molecule Förster Resonance Energy Transfer (smFRET). Agreement with such a diverse set of experimental data suggests that details of the generated ensembles can now be examined with a high degree of confidence. Using the disordered N-terminal region of the Sic1 protein as a test case, we examined relationships between average global polymeric descriptions and higher moments of their distributions. To resolve apparent discrepancies between smFRET and SAXS inferences, we integrated SAXS data with NMR data and reserved the smFRET data for independent validation. Consistency with smFRET, which was not guaranteed a priori, indicates that, globally, the perturbative effects of NMR or smFRET labels on the Sic1 ensemble are minimal. Analysis of the ensembles revealed distinguishing features of Sic1, such as overall compactness and large end-to-end distance fluctuations, which are consistent with biophysical models of Sic1's ultrasensitive binding to its partner Cdc4. Our results underscore the importance of integrative modeling and validation in generating and drawing conclusions from IDP conformational ensembles.

    IDPConformerGenerator: A Flexible Software Suite for Sampling the Conformational Space of Disordered Protein States.

    The power of structural information for informing biological mechanisms is clear for stable folded macromolecules, but similar structure-function insight is more difficult to obtain for highly dynamic systems such as intrinsically disordered proteins (IDPs), which must be described as structural ensembles. Here, we present IDPConformerGenerator, a flexible, modular open-source software platform for generating large and diverse ensembles of disordered protein states. It builds conformers that obey geometric, steric, and other physical restraints on the input sequence. IDPConformerGenerator samples backbone phi (φ), psi (ψ), and omega (ω) torsion angles of relevant sequence fragments from loops and secondary structure elements extracted from folded protein structures in the RCSB Protein Data Bank, and builds side chains using robust Monte Carlo algorithms with expanded rotamer libraries. IDPConformerGenerator has many user-defined options enabling variable fractional sampling of secondary structures, supports Bayesian models for assessing the consistency of IDP ensembles with experimental data, and introduces a machine learning approach to transform between internal and Cartesian coordinates with reduced error. IDPConformerGenerator will facilitate the characterization of disordered proteins to ultimately provide structural insights into these states that have key biological functions.
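The fragment-based torsion sampling described in this abstract can be pictured with a minimal sketch. It does not use the IDPConformerGenerator API. `TOY_LIBRARY` is an invented stand-in for torsion fragments extracted from folded structures, the angle values are illustrative only, and the longest-match-first rule is a simplifying assumption made for this example. Steric checks and side-chain building are left out.

```python
import random

# Hypothetical toy fragment library: sequence fragment -> list of observed
# backbone torsion sets, one (phi, psi, omega) tuple in degrees per residue.
TOY_LIBRARY = {
    "AG": [
        [(-60.0, -45.0, 180.0), (-75.0, 150.0, 180.0)],
        [(-120.0, 130.0, 180.0), (80.0, 10.0, 180.0)],
    ],
    "A": [[(-60.0, -45.0, 180.0)], [(-120.0, 130.0, 180.0)]],
    "G": [[(-75.0, 150.0, 180.0)], [(80.0, 10.0, 180.0)]],
    "S": [[(-65.0, 145.0, 180.0)]],
}


def sample_backbone(sequence, library, max_len=2, rng=None):
    """Assign (phi, psi, omega) to every residue by stitching library fragments.

    At each position the longest matching fragment (up to max_len residues)
    is looked up, and one of its stored torsion sets is drawn at random.
    """
    rng = rng or random.Random()
    torsions = []
    i = 0
    while i < len(sequence):
        for size in range(min(max_len, len(sequence) - i), 0, -1):
            fragment = sequence[i:i + size]
            if fragment in library:
                torsions.extend(rng.choice(library[fragment]))
                i += size
                break
        else:
            # No fragment of any size matched at this position.
            raise KeyError(f"no library fragment matches position {i} ({sequence[i]!r})")
    return torsions


if __name__ == "__main__":
    for phi, psi, omega in sample_backbone("AGSA", TOY_LIBRARY, rng=random.Random(0)):
        print(f"phi={phi:7.1f}  psi={psi:7.1f}  omega={omega:7.1f}")
```

Repeated calls with different random seeds give different torsion assignments for the same sequence, which is the sense in which such a sampler produces an ensemble instead of a single structure.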