
    An integrated approach to the interpretation of Single Amino Acid Polymorphisms within the framework of CATH and Gene3D

    Background: The phenotypic effects of sequence variations in protein-coding regions come about primarily via their effects on the resulting structures, for example by disrupting active sites or affecting structural stability. To better understand the mechanisms behind known mutant phenotypes, and to predict the effects of novel variations, biologists need tools to gauge the impacts of DNA mutations in terms of their structural manifestation. Although many mutations occur within domains whose structure has been solved, many more occur within genes whose protein products have not been structurally characterized. Results: Here we present 3DSim (3D Structural Implication of Mutations), a database and web application facilitating the localization and visualization of single amino acid polymorphisms (SAAPs) mapped to protein structures even where the structure of the protein of interest is unknown. The server displays information on 6514 point mutations, 4865 of them known to be associated with disease. These polymorphisms are drawn from SAAPdb, which aggregates data from various sources including dbSNP and several pathogenic mutation databases. While the SAAPdb interface displays mutations on known structures, 3DSim projects mutations onto known sequence domains in Gene3D. This resource contains sequences annotated with domains predicted to belong to structural families in the CATH database. Mappings between domain sequences in Gene3D and known structures in CATH are obtained using a MUSCLE alignment. 1210 three-dimensional structures corresponding to CATH structural domains are currently included in 3DSim; these domains are distributed across 396 CATH superfamilies, and provide a comprehensive overview of the distribution of mutations in structural space. Conclusion: The server is publicly available at http://3DSim.bioinfo.cnio.es/. In addition, the database containing the mapping between SAAPdb, Gene3D and CATH is available on request, and most of the functionality is available through programmatic web service access.

    The dynamic organization of fungal acetyl-CoA carboxylase

    Acetyl-CoA carboxylases (ACCs) catalyse the committed step in fatty-acid biosynthesis: the ATP-dependent carboxylation of acetyl-CoA to malonyl-CoA. They are important regulatory hubs for metabolic control and relevant drug targets for the treatment of the metabolic syndrome and cancer. Eukaryotic ACCs are single-chain multienzymes characterized by a large, non-catalytic central domain (CD), whose role in ACC regulation remains poorly characterized. Here we report the crystal structure of the yeast ACC CD, revealing a unique four-domain organization. A regulatory loop, which is phosphorylated at the key functional phosphorylation site of fungal ACC, wedges into a crevice between two domains of CD. Combining the yeast CD structure with intermediate and low-resolution data of larger fragments up to intact ACCs provides a comprehensive characterization of the dynamic fungal ACC architecture. In contrast to related carboxylases, large-scale conformational changes are required for substrate turnover, and are mediated by the CD under phosphorylation control.

    The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution

    We report the latest release (version 3.0) of the CATH protein domain database. There has been a 20% increase in the number of structural domains classified in CATH, up to 86 151 domains. Release 3.0 comprises 1110 fold groups and 2147 homologous superfamilies. To cope with the increases in diverse structural homologues being determined by the structural genomics initiatives, more sensitive methods have been developed for identifying boundaries in multi-domain proteins and for recognising homologues. The CATH classification update is now being driven by an integrated pipeline that links these automated procedures with validation steps that have been made easier by the provision of information-rich web pages summarising comparison scores and relevant links to external sites for each domain being classified. An analysis of the population of domains in the CATH hierarchy and several domain characteristics are presented for version 3.0. We also report an update of the CATH Dictionary of homologous structures (CATH-DHS), which now contains multiple structural alignments, consensus information and functional annotations for 1459 well-populated superfamilies in CATH. CATH is directly linked to the Gene3D database, which is a projection of CATH structural data onto ∼2 million sequences in completed genomes and UniProt.

    Analysis on conservation of disulphide bonds and their structural features in homologous protein domain families

    Background: Disulphide bridges are well known to play key roles in the stability, folding and functions of proteins. Introduction or deletion of disulphides by site-directed mutagenesis has produced varying effects on stability and folding depending upon the protein and the location of the disulphide in the 3-D structure. Given the lack of complete understanding, it is worthwhile to learn from an analysis of the extent of conservation of disulphides in homologous proteins. We have also addressed the question of what structural interactions in one homologue replace a disulphide present in another homologue. Results: Using a dataset involving 34,752 pairwise comparisons of homologous protein domains corresponding to 300 protein domain families of known 3-D structures, we provide a comprehensive analysis of the extent of conservation of disulphide bridges and their structural features. We report that only 54% of all the disulphide bonds compared between the homologous pairs are conserved, even though a small fraction of the non-conserved disulphides do include cytoplasmic proteins. Also, only about one fourth of the distinct disulphides are conserved in all the members of protein families. We note that while conservation of disulphides is common in many families, disulphide bond mutations are quite prevalent. Interestingly, we note that there is no clear relationship between sequence identity between two homologous proteins and disulphide bond conservation. Our analysis of structural features at the sites where cysteines forming a disulphide in one homologue are replaced by non-Cys residues shows that the elimination of a disulphide in a homologue need not always result in stabilizing interactions between equivalent residues. Conclusion: We observe that in homologous proteins, disulphide bonds are conserved only to a modest extent. Very interestingly, we note that the extent of conservation of disulphides in homologous proteins is unrelated to the overall sequence identity between homologues. The non-conserved disulphides are often associated with variable structural features that were recruited to be associated with differentiation or specialisation of protein function.

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal

    SecretePEPPr: Computational Prediction and Characterization of Effector-like Proteins Secreted from Vitis vinifera

    Plant pathology research has long placed a focus on pathogen-derived effector proteins: small, secreted proteins translocated into host cells where they subvert host basal immunity and promote infection. Recent studies suggest that some plant species secrete similar effector-like proteins during mutualistic plant-fungal interactions and affect fungal growth. In this project, a computational tool named SecretePEPPr (Secreted Plant Effector-like Protein Predictor) was written, evaluated, and tested for the purpose of predicting candidate plant effector-like proteins from a set of whole genome annotations. Among other factors, this prediction tool considered classical and non-classical secretion, protein size, and the prevalence and presence of clathrin-mediated endocytic motifs. Analysis of testing data revealed a subcellular localization prediction specificity of 90% on a set of over 500 intracellular plant proteins and sensitivity of 55% on a set of experimentally validated secreted plant proteins. Across four analyzed grape proteomes, several germin-like proteins were identified as potentially haustorially localized through clathrin-mediated endocytic means. Protein length distributions revealed that effector-like candidates containing clathrin-mediated endocytic motifs were mostly in the 150-300 amino acid length range. Follow-up in vivo validation was conducted in Erysiphe necator-infected Chardonnay grape leaves. Through this process, the first Erysiphe necator haustorial extraction method was devised using a Percoll density gradient followed by fluorescent labeling and fluorescence-activated cell sorting (FACS), resulting in 14 and 18 million purified fungal haustorial cells from 20 grams of heavily infected leaves. The computational streamlining of plant effector-like protein prediction established in this project provides a foundation for the top-down discovery and characterization of this recently discovered class of plant proteins, which have important implications in plant-microbe interactions and may act as targets for breeding and gene editing in plants.

    SUPERFAMILY—sophisticated comparative genomics, data mining, visualization and phylogeny

    SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the superfamily level are used to provide structural annotation. We recently produced a new model library based on SCOP 1.73. Family level assignments are also available. From the web site users can submit sequences for SCOP domain classification; search for keywords such as superfamilies, families, organism names, models and sequence identifiers; find over- and underrepresented families or superfamilies within a genome relative to other genomes or groups of genomes; compare domain architectures across selections of genomes; and finally build multiple sequence alignments between Protein Data Bank (PDB), genomic and custom sequences. Recent extensions to the database include InterPro abstracts and Gene Ontology terms for superfamilies, taxonomic visualization of the distribution of families across the tree of life, searches for functionally similar domain architectures and phylogenetic trees. The database, models and associated scripts are available for download from the ftp site.

    Characterization of Major Surface Protease Homologues of Trypanosoma congolense

    Trypanosomes encode a family of proteins known as Major Surface Metalloproteases (MSPs). We have identified six putative MSPs encoded within the partially sequenced T. congolense genome. Phylogenetic analysis indicates that T. congolense MSPs belong to five subfamilies that are conserved among African trypanosome species. Molecular modeling, based on the known structure of Leishmania major GP63, reveals subfamily-specific structural variations around the putative active site despite conservation of overall structure, suggesting that each MSP subfamily has evolved to recognize distinct substrates. We have cloned and purified a protein encoding the amino-terminal domain of the T. congolense homologue TcoMSP-D (most closely related to Leishmania GP63). We detect TcoMSP-D in the serum of T. congolense-infected mice. Mice immunized with the amino-terminal domain of TcoMSP-D generate a persisting IgG1 antibody response. Surprisingly, a low-dose challenge of immunized mice with T. congolense significantly increases susceptibility to infection, indicating that immunity to TcoMSP-D is a factor affecting virulence.