108 research outputs found

    ProFunc: a server for predicting protein function from 3D structure

    ProFunc is a web server for predicting the likely function of proteins whose 3D structure is known but whose function is not. Users submit the coordinates of their structure to the server in PDB format. ProFunc uses both existing and novel methods to analyse the protein's sequence and structure, identifying functional motifs or close relationships to functionally characterized proteins. A summary of the analyses provides an at-a-glance view of what each of the different methods has found. More detailed results are available on separate pages. Often, where one method has failed to find anything useful, another may be more forthcoming. The server is likely to be of most use in structural genomics, where a large proportion of the proteins whose structures are solved are hypothetical proteins of unknown function. However, it may also find use in comparative analysis of members of large protein families. It provides a convenient compendium of sequence and structural information that often holds vital functional clues to be followed up experimentally.

    MGOS: A library for molecular geometry and its operating system

    The geometry of atomic arrangement underpins the structural understanding of molecules in many fields. However, no general framework of mathematical/computational theory for the geometry of atomic arrangement exists. Here we present "Molecular Geometry (MG)" as a theoretical framework accompanied by "MG Operating System (MGOS)", which consists of callable functions implementing the MG theory. MG allows researchers to model complicated molecular structure problems in terms of elementary yet standard notions of volume, area, etc., and MGOS frees them from the hard and tedious task of developing/implementing geometric algorithms so that they can focus more on their primary research issues. MG facilitates simpler modeling of molecular structure problems; MGOS functions can be conveniently embedded in application programs for the efficient and accurate solution of geometric queries involving atomic arrangements. The use of MGOS in problems involving spherical entities is akin to the use of math libraries in general purpose programming languages in science and engineering. (C) 2019 The Author(s). Published by Elsevier B.V.
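
As a rough illustration of the kind of geometric query on spherical entities that the abstract mentions, the following sketch computes the volume of the union of two atoms modelled as spheres, using the standard closed-form expression for a sphere-sphere lens. It is not the MGOS API: the function names `sphere_volume`, `lens_volume` and `union_volume` are hypothetical and chosen only for this example.

```python
import math


def sphere_volume(r: float) -> float:
    """Volume of a single sphere of radius r."""
    return 4.0 / 3.0 * math.pi * r ** 3


def lens_volume(r1: float, r2: float, d: float) -> float:
    """Volume of the intersection of two spheres whose centres are d apart."""
    if d >= r1 + r2:
        # Spheres are disjoint or only touch: no shared volume.
        return 0.0
    if d <= abs(r1 - r2):
        # One sphere lies entirely inside the other.
        return sphere_volume(min(r1, r2))
    # Standard closed form for the lens formed by two overlapping spheres.
    return (
        math.pi
        * (r1 + r2 - d) ** 2
        * (d ** 2 + 2 * d * (r1 + r2) - 3 * (r1 - r2) ** 2)
        / (12 * d)
    )


def union_volume(r1: float, r2: float, d: float) -> float:
    """Volume of the union of two spheres (inclusion-exclusion)."""
    return sphere_volume(r1) + sphere_volume(r2) - lens_volume(r1, r2, d)
```

Real molecules involve many mutually overlapping atoms, where pairwise inclusion-exclusion is no longer sufficient; handling that general case is the kind of work the abstract says MGOS takes off the researcher's hands.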

    FunTree: a resource for exploring the functional evolution of structurally defined enzyme superfamilies.

    FunTree is a new resource that brings together sequence, structure, phylogenetic, chemical and mechanistic information for structurally defined enzyme superfamilies. Gathering this range of data into a single resource allows the investigation of how novel enzyme functions have evolved within a structurally defined superfamily, as well as providing a means to analyse trends across many superfamilies. This is done not only within the context of an enzyme's sequence and structure but also in terms of the relationships between the reactions catalysed. Developed in tandem with the CATH database, it currently comprises 276 superfamilies covering ~1800 (70%) of sequence-assigned enzyme reactions. Central to the resource are phylogenetic trees generated from structurally informed multiple sequence alignments, using both domain structural alignments supplemented with domain sequences and whole sequence alignments based on commonality of multi-domain architectures. These trees are decorated with functional annotations such as metabolite similarity, as well as annotations from manually curated resources such as the Catalytic Site Atlas and MACiE for enzyme mechanisms. The resource is freely available through a web interface: www.ebi.ac.uk/thorton-srv/databases/FunTree

    Mapping the Constrained Coding Regions in the human genome to their corresponding proteins

    Constrained Coding Regions (CCRs) in the human genome have been derived from DNA sequencing data of large cohorts of healthy control populations, available in the Genome Aggregation Database (gnomAD) [1]. They identify regions depleted of protein-changing variants and thus identify segments of the genome that have been constrained during human evolution. By mapping these DNA-defined regions from genomic coordinates onto the corresponding protein positions and combining this information with protein annotations, we have explored the distribution of CCRs and compared their co-occurrence with different protein functional features, previously annotated at the amino acid level in public databases. As expected, our results reveal that functional amino acids involved in interactions with DNA/RNA, protein-protein contacts and catalytic sites are the protein features most likely to be highly constrained for variation in the control population. More surprisingly, we found that linear motifs, linear interacting peptides (LIPs), disorder-order transitions upon binding with other protein partners and liquid-liquid phase separating (LLPS) regions are also strongly associated with high constraint for variability. We also compared intra-species constraints in the human CCRs with inter-species conservation and functional residues to explore how such CCRs may contribute to the analysis of protein variants. As has been previously observed, CCRs are only weakly correlated with conservation, suggesting that intra-species constraints complement inter-species conservation and can provide more information to interpret variant effects.
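
The mapping from genomic coordinates to protein positions described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the coding exons of a single transcript are given as 1-based inclusive genomic intervals, and the names `genomic_to_protein_pos` and `region_to_residues` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: each coding exon is a 1-based inclusive genomic interval.
Exon = Tuple[int, int]


def genomic_to_protein_pos(
    pos: int, cds_exons: List[Exon], strand: str
) -> Optional[int]:
    """Return the 1-based residue index encoded at a genomic position.

    Returns None when the position lies outside the coding exons.
    """
    # Walk the exons in transcription order (descending on the minus strand).
    ordered = sorted(cds_exons, reverse=(strand == "-"))
    offset = 0  # coding bases already passed in earlier exons
    for start, end in ordered:
        if start <= pos <= end:
            within = pos - start if strand == "+" else end - pos
            # Three coding bases per codon; residues are numbered from 1.
            return (offset + within) // 3 + 1
        offset += end - start + 1
    return None


def region_to_residues(
    region_start: int, region_end: int, cds_exons: List[Exon], strand: str
) -> List[int]:
    """Map a genomic region (inclusive) to the sorted residues it covers."""
    residues = {
        genomic_to_protein_pos(p, cds_exons, strand)
        for p in range(region_start, region_end + 1)
    }
    residues.discard(None)  # drop intronic and other non-coding positions
    return sorted(residues)
```

Once a constrained region is expressed as a list of residue indices, it can be intersected with per-residue annotations (for example catalytic sites or binding residues) to count co-occurrences.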

    Chopping and Changing: the Evolution of the Flavin-dependent Monooxygenases.

    Flavin-dependent monooxygenases play a variety of key physiological roles and are also very powerful biotechnological tools. These enzymes have been classified into eight different classes (A-H) based on their sequences and biochemical features. By combining structural and sequence analysis with phylogenetic inference, we have explored the evolutionary history of classes A, B, E, F, and G and demonstrate that their multidomain architectures reflect their phylogenetic relationships, suggesting that the main evolutionary steps in their divergence are likely to have arisen from the recruitment of different domains. Additionally, the functional divergence within each class appears to have been the result of other mechanisms, such as a complex set of single-point mutations. Our results reinforce the idea that a main constraint on the evolution of cofactor-dependent enzymes is the functional binding of the cofactor. A further remarkable feature of this family is that the sequence of the key flavin adenine dinucleotide-binding domain is split into at least two parts in all classes studied here. We propose a complex set of evolutionary events that gave rise to the different classes within this family.

    Impact of Structural Observables From Simulations to Predict the Effect of Single-Point Mutations in MHC Class II Peptide Binders

    The prediction of peptide binders to Major Histocompatibility Complex (MHC) class II receptors is of great interest for studying autoimmune diseases and for vaccine development. Most approaches predict the affinities using sequence-based models trained on experimental data and multiple alignments from known peptide substrates. However, detecting activity differences caused by single-point mutations is a challenging task. In this work, we used interactions calculated from simulations to build scoring matrices for quickly estimating binding differences caused by single-point mutations. We modelled a set of 837 peptides bound to an MHC class II allele, and optimized the sampling of the conformations using the Rosetta backrub method by comparing the results to molecular dynamics simulations. From the dynamic trajectories of each complex, we averaged and compared structural observables for each amino acid at each position of the 9-mer peptide core region. With this information, we generated the scoring matrices to predict the sign of the binding differences. We then compared the performance of the best scoring matrix to different computational methodologies that range in computational cost. Overall, the performance in predicting the activity differences caused by single mutated peptides was lower than 60% for all the methods. However, the developed scoring matrix in combination with existing methods reports an increase in performance, up to 86% with a scoring method that uses molecular dynamics.
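
The idea of a position-specific scoring matrix built from averaged structural observables can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the observable is taken to be one number per core position per simulated peptide, a higher value is assumed to indicate stronger binding, and the names `build_scoring_matrix` and `predict_sign` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Key: (position in the peptide core, amino acid); value: mean observable.
ScoringMatrix = Dict[Tuple[int, str], float]


def build_scoring_matrix(
    observations: List[Tuple[str, List[float]]]
) -> ScoringMatrix:
    """Average a per-position observable over all simulated peptides.

    Each observation pairs a peptide core with one observable value per
    position, e.g. a quantity averaged over that complex's trajectory.
    """
    sums: Dict[Tuple[int, str], float] = defaultdict(float)
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for peptide, values in observations:
        for pos, (aa, value) in enumerate(zip(peptide, values)):
            sums[(pos, aa)] += value
            counts[(pos, aa)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def predict_sign(matrix: ScoringMatrix, wild_type: str, mutant: str) -> int:
    """Predict the sign (+1, 0 or -1) of the change for a single mutation.

    Assumption: a higher matrix entry means stronger binding.
    Raises KeyError if either residue was never observed at that position.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("peptides must have the same length")
    diffs = [i for i, (a, b) in enumerate(zip(wild_type, mutant)) if a != b]
    if len(diffs) != 1:
        raise ValueError("expected exactly one point mutation")
    pos = diffs[0]
    delta = matrix[(pos, mutant[pos])] - matrix[(pos, wild_type[pos])]
    return (delta > 0) - (delta < 0)
```

Only the sign of the difference is returned, mirroring the abstract's focus on predicting the direction of the binding change rather than its magnitude.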