    Enhancing Evolutionary Couplings with Deep Convolutional Neural Networks

    While genes are defined by sequence, in biological systems a protein's function is largely determined by its three-dimensional structure. Evolutionary information embedded within multiple sequence alignments provides a rich source of data for inferring structural constraints on macromolecules. Still, many proteins of interest lack sufficient numbers of related sequences, leading to noisy, error-prone residue-residue contact predictions. Here we introduce DeepContact, a convolutional neural network (CNN)-based approach that discovers co-evolutionary motifs and leverages these patterns to enable accurate inference of contact probabilities, particularly when few related sequences are available. DeepContact significantly improves performance over previous methods, including in the CASP12 blind contact prediction task where we achieved top performance with another CNN-based approach. Moreover, our tool converts hard-to-interpret coupling scores into probabilities, moving the field toward a consistent metric to assess contact prediction across diverse proteins. Through substantially improving the precision-recall behavior of contact prediction, DeepContact suggests we are near a paradigm shift in template-free modeling for protein structure prediction.
    Many protein structures of interest remain out of reach for both computational prediction and experimental determination. DeepContact learns patterns of co-evolution across thousands of experimentally determined structures, identifying conserved local motifs and leveraging this information to improve protein residue-residue contact predictions. DeepContact extracts additional information from the evolutionary couplings using its knowledge of co-evolution and structural space, while also converting coupling scores into probabilities that are comparable across protein sequences and alignments.
Keywords: contact prediction; convolutional neural networks; deep learning; protein structure prediction; structure prediction; co-evolution; evolutionary couplings
National Institutes of Health (U.S.) (Grant R01GM081871

    Structured States of Disordered Proteins from Genomic Sequences

    Protein flexibility ranges from simple hinge movements to functional disorder. Around half of all human proteins contain apparently disordered regions with little 3D or functional information, and many of these proteins are associated with disease. Building on the evolutionary couplings approach previously successful in predicting 3D states of ordered proteins and RNA, we developed a method to predict the potential for ordered states for all apparently disordered proteins with sufficiently rich evolutionary information. The approach is highly accurate (79%) for residue interactions as tested in more than 60 known disordered regions captured in a bound or specific condition. Assessing the potential for structure of more than 1,000 apparently disordered regions of human proteins reveals a continuum of structural order, with at least 50% showing a clear propensity for three- or two-dimensional states. Co-evolutionary constraints reveal hitherto unseen structures of functional importance in apparently disordered proteins.
Keywords: evolutionary couplings; disorder; conformational flexibility; statistical physics; maximum entropy; EVfold; bioinformatics; computational biology; structure prediction
National Institutes of Health (U.S.) (Grant R01GM081871