
    Predicting protein function with hierarchical phylogenetic profiles: The Gene3D phylo-tuner method applied to eukaryotic Genomes

    "Phylogenetic profiling" is based on the hypothesis that, during evolution, functionally or physically interacting genes are likely to be inherited or eliminated in a codependent manner. Creating presence-absence profiles of orthologous genes is now a common and powerful way of identifying functionally associated genes. In this approach, correctly determining orthology, as a means of identifying functional equivalence between two genes, is a critical and nontrivial step, and largely explains why previous work in this area has mainly focused on using presence-absence profiles in prokaryotic species. Here, we demonstrate that eukaryotic genomes have a high proportion of multigene families whose phylogenetic profile distributions are poor in presence-absence information content. This feature makes them prone to orthology mis-assignment and unsuited to standard profile-based prediction methods. Using CATH structural domain assignments from the Gene3D database for 13 complete eukaryotic genomes, we have developed a novel modification of the phylogenetic profiling method that uses the genome copy number of each domain superfamily to predict functional relationships. In our approach, superfamilies are subclustered at ten levels of sequence identity, from 30% to 100%, and phylogenetic profiles are built at each level. All the profiles are compared using normalised Euclidean distances to identify those with correlated changes in their domain copy number. We demonstrate that two protein families will "auto-tune" with strong co-evolutionary signals when their profiles are compared at the similarity levels that capture their functional relationship. Our method finds functional relationships that are not detectable by conventional presence-absence profile comparisons, and it does not require any fixed a priori criteria to define orthologous genes.

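The profile comparison described in the abstract above can be illustrated with a minimal sketch. It is not the Gene3D phylo-tuner implementation: the abstract does not say how the Euclidean distance is normalised, so the z-score standardisation, the division by profile length, the function names and the toy copy-number data are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def zscore(profile):
    """Standardise a copy-number profile to zero mean and unit variance."""
    mu = mean(profile)
    sigma = pstdev(profile)
    if sigma == 0:
        # A flat profile carries no co-variation signal; map it to zeros.
        return [0.0] * len(profile)
    return [(x - mu) / sigma for x in profile]


def normalised_euclidean(profile_a, profile_b):
    """Root-mean-square difference between two standardised profiles.

    Assumption: 'normalised' means z-scoring each profile and dividing
    the squared distance by the number of genomes.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    za, zb = zscore(profile_a), zscore(profile_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(za, zb)) / len(za))


def best_level_pair(levels_a, levels_b):
    """Compare every identity level of one family with every level of another.

    Each argument maps a sequence-identity threshold (e.g. 30, 60) to the
    copy-number profile built at that level. Returns the tuple
    (distance, level_a, level_b) with the smallest distance.
    """
    return min(
        (normalised_euclidean(pa, pb), la, lb)
        for la, pa in levels_a.items()
        for lb, pb in levels_b.items()
    )


# Hypothetical copy numbers across four genomes at two identity levels.
family_a = {30: [2, 4, 6, 8], 60: [1, 1, 2, 1]}
family_b = {30: [1, 2, 3, 4], 60: [5, 1, 1, 3]}
distance, level_a, level_b = best_level_pair(family_a, family_b)
```

In this toy data the two 30% profiles rise and fall together, so they give the smallest distance; scanning all level pairs is one simple way to mimic the idea that related families match best at particular similarity levels.
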
    Assessing functional novelty of PSI structures via structure-function analysis of large and diverse superfamilies

    One aim of the structural genomics initiatives has been to improve our understanding of protein function by providing representative structures for many structurally uncharacterised protein families. As suggested by the recent assessment of the Protein Structure Initiative (the Structural Genomics Initiative funded by the NIH), doubts have arisen as to whether Structural Genomics as initially planned was really beneficial to our understanding of biological issues, and in particular of protein function.
A few protein domain superfamilies have been shown to account for unexpectedly large numbers of proteins encoded in fully sequenced genomes. These large superfamilies are generally very diverse, spanning a wide range of functions, both in terms of molecular activities and biological processes. Some of these superfamilies, such as the Rossmann-fold P-loop nucleotide hydrolases or the TIM-barrel glycosidases, have been the subject of extensive structural studies, which in turn have shed light on how evolution of sequence and structure properties produces functional diversity amongst homologues. Recently, the Structure-Function Linkage Database (SFLD) has been set up with the aim of helping the study of structure-function correlations in such superfamilies. Since the evolutionary success of these large superfamilies suggests biological importance, several Structural Genomics Centers have focused on providing full structural coverage for representatives of all sequence families in these superfamilies.
In this work we evaluate structure/function diversity in a set of these large superfamilies and attempt to assess the quality and quantity of biological information gained from Structural Genomics.

    Yeast cytochrome c oxidase: a model system to study mitochondrial forms of the haem-copper oxidase superfamily.

    The known subunits of yeast mitochondrial cytochrome c oxidase are reviewed. The structures of all eleven of its subunits are explored by building homology models based on the published structures of the homologous bovine subunits, and similarities and differences are highlighted, particularly in the core functional subunit I. Yeast genetic techniques that enable the introduction of mutations into the three core mitochondrially-encoded subunits are reviewed.

    Crystal structures of the human Dysferlin inner DysF domain

    Background: Mutations in dysferlin, the first protein linked with the cell membrane repair mechanism, cause a group of muscular dystrophies called dysferlinopathies. Dysferlin is a type two-anchored membrane protein, with a single C-terminal trans-membrane helix and most of the protein lying in the cytoplasm. Dysferlin contains several C2 domains and two DysF domains, which are nested one inside the other. Many pathogenic point mutations fall in the DysF domain region. Results: We describe the crystal structure of the human dysferlin inner DysF domain at a resolution of 1.9 Angstroms. Most of the pathogenic mutations are part of aromatic/arginine stacks that hold the domain in a folded conformation. The high resolution of the structure shows that these interactions are a mixture of parallel ring/guanidinium stacking, perpendicular H-bond stacking and aliphatic chain packing. Conclusions: The high-resolution structure of the dysferlin DysF domain gives a template on which to interpret in detail the pathogenic mutations that lead to disease.

    Co-Expression Network Models Suggest that Stress Increases Tolerance to Mutations

    Network models are a well-established tool for studying the robustness of complex systems, including modelling the effect of loss-of-function mutations in protein interaction networks. Past work has concentrated on the average damage caused by random node removal, with little attention to the shape of the damage distribution. In this work, we use fission yeast co-expression networks before and after exposure to stress to model the effect of stress on mutational robustness. We find that exposure to stress decreases the average damage from node removal, suggesting that stress induces greater tolerance to loss-of-function mutations. The shape of the damage distribution also changes upon stress, with a greater incidence of extreme damage after exposure to stress. We demonstrate that the change in shape of the damage distribution can have considerable functional consequences, highlighting the need to consider the damage distribution in addition to average behaviour.
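
The node-removal analysis in the abstract above can be sketched as follows. The abstract does not define "damage", so this sketch assumes one common choice: the number of additional nodes that drop out of the largest connected component when a single node is removed. The function names and the toy network are hypothetical.

```python
from collections import deque


def largest_component_size(adjacency, removed=None):
    """Size of the largest connected component, optionally ignoring one node."""
    seen = set() if removed is None else {removed}
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        # Breadth-first search over the component containing `start`.
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def damage_distribution(adjacency):
    """Damage per node: extra nodes lost from the largest component.

    Assumption: removing a node always costs the node itself, so only
    losses beyond that one node count as damage (never below zero).
    """
    baseline = largest_component_size(adjacency)
    return {
        node: max(0, baseline - 1 - largest_component_size(adjacency, removed=node))
        for node in adjacency
    }


# Hypothetical co-expression network: a hub "h" linking a pair and a leaf.
network = {
    "h": ["a", "b", "c"],
    "a": ["h", "b"],
    "b": ["h", "a"],
    "c": ["h"],
}
damage = damage_distribution(network)
average_damage = sum(damage.values()) / len(damage)
```

Keeping the whole per-node dictionary rather than only the mean is the point of the exercise: two networks can share an average while differing in how often extreme damage occurs.
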

    Pattern matching and pattern discovery algorithms for protein topologies

    We describe algorithms for pattern matching and pattern learning in TOPS diagrams (formal descriptions of protein topologies). These problems can be reduced to checking for subgraph isomorphism and finding maximal common subgraphs in a restricted class of ordered graphs. We have developed a subgraph isomorphism algorithm for ordered graphs, which performs well on the given set of data. The maximal common subgraph problem is then solved by repeated subgraph extension and checking for isomorphisms. Despite its apparent inefficiency, this approach gives an algorithm with time complexity proportional to the number of graphs in the input set and is still practical on the given set of data. As a result, we obtain fast methods that can be used for building a database of protein topological motifs and for comparing a given protein of known secondary structure against a motif database.
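
To make the notion of pattern matching in an ordered graph concrete, here is a minimal backtracking sketch. It is not the algorithm from the paper: the encoding (vertices as labelled secondary structure elements in sequence order, edges as `(i, j, kind)` triples with `i < j`), the labels and the function name are assumptions chosen for illustration. The only constraint it takes from the abstract is that a match must preserve vertex order.

```python
def find_ordered_match(pattern_labels, pattern_edges, target_labels, target_edges):
    """Find an order-preserving embedding of a pattern in a target graph.

    Vertices are listed in sequence order. Edges are (i, j, kind) triples
    with i < j. Returns the increasing list of target indices matched to
    the pattern vertices, or None if no embedding exists.
    """
    target_edge_set = set(target_edges)

    def extend(mapping):
        k = len(mapping)  # index of the next pattern vertex to place
        if k == len(pattern_labels):
            return list(mapping)
        # Order preservation: only look to the right of the last match.
        start = mapping[-1] + 1 if mapping else 0
        for t in range(start, len(target_labels)):
            if target_labels[t] != pattern_labels[k]:
                continue
            candidate = mapping + [t]
            # Every pattern edge ending at vertex k must exist in the target.
            edges_ok = all(
                (candidate[i], candidate[j], kind) in target_edge_set
                for (i, j, kind) in pattern_edges
                if j == k
            )
            if edges_ok:
                result = extend(candidate)
                if result is not None:
                    return result
        return None

    return extend([])


# Hypothetical target: strand, helix, strand, strand ("E" = strand, "H" = helix),
# with an antiparallel ("A") and a parallel ("P") strand pairing.
target_labels = ["E", "H", "E", "E"]
target_edges = [(0, 2, "A"), (2, 3, "P")]

antiparallel_pair = find_ordered_match(["E", "E"], [(0, 1, "A")], target_labels, target_edges)
parallel_pair = find_ordered_match(["E", "E"], [(0, 1, "P")], target_labels, target_edges)
```

Because candidates are only ever taken to the right of the last matched vertex, the vertex ordering prunes the search considerably compared with matching in unordered graphs.
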

    An integrated approach to the interpretation of Single Amino Acid Polymorphisms within the framework of CATH and Gene3D

    Background: The phenotypic effects of sequence variations in protein-coding regions come about primarily via their effects on the resulting structures, for example by disrupting active sites or affecting structural stability. To better understand the mechanisms behind known mutant phenotypes, and to predict the effects of novel variations, biologists need tools to gauge the impacts of DNA mutations in terms of their structural manifestation. Although many mutations occur within domains whose structure has been solved, many more occur within genes whose protein products have not been structurally characterized. Results: Here we present 3DSim (3D Structural Implication of Mutations), a database and web application facilitating the localization and visualization of single amino acid polymorphisms (SAAPs) mapped to protein structures even where the structure of the protein of interest is unknown. The server displays information on 6514 point mutations, 4865 of them known to be associated with disease. These polymorphisms are drawn from SAAPdb, which aggregates data from various sources including dbSNP and several pathogenic mutation databases. While the SAAPdb interface displays mutations on known structures, 3DSim projects mutations onto known sequence domains in Gene3D. This resource contains sequences annotated with domains predicted to belong to structural families in the CATH database. Mappings between domain sequences in Gene3D and known structures in CATH are obtained using a MUSCLE alignment. 1210 three-dimensional structures corresponding to CATH structural domains are currently included in 3DSim; these domains are distributed across 396 CATH superfamilies, and provide a comprehensive overview of the distribution of mutations in structural space. Conclusion: The server is publicly available at http://3DSim.bioinfo.cnio.es/.
In addition, the database containing the mapping between SAAPdb, Gene3D and CATH is available on request, and most of the functionality is available through programmatic web service access.

    Chemical weed control in sweet peppers (Control químico de las malezas en ajíes dulces)

    Two herbicide experiments with sweet cherry peppers were conducted in a Fraternidad clay at Lajas Substation from 1989 to 1990. In the first experiment, clomazone at 1.68 and 3.36 kg ai/ha applied pre-plant and incorporated, as well as fluazifop at 0.42 and 0.84 kg ai/ha applied postemergence, gave excellent control of most grasses. Oxyfluorfen at 0.20 kg ai/ha as a pre-plant followed by bentazon at 1.12 kg ai/ha and fluazifop at 0.42 kg ai/ha also provided excellent weed control. The highest sweet cherry pepper yield was obtained with the hand-weeded control. Next was oxyfluorfen at 0.20 kg ai/ha followed by the bentazon + fluazifop mixture. The treatment with fluazifop at 0.42 kg ai/ha as a postemergence + one supplementary hand weeding ranked third in yield. The above treatments did not differ significantly in yield. In the second experiment, clomazone at 2.24 kg ai/ha as a pre-plant, followed by three post-directed applications of paraquat at 0.56 kg ai/ha, gave the best weed control. Oxyfluorfen at 0.56 kg ai/ha as a pre-plant followed by the bentazon at 1.12 kg ai/ha and fluazifop at 0.42 kg ai/ha mixture also gave excellent weed control. The highest cherry pepper yield was obtained with clomazone at 2.24 kg ai/ha plus three applications of paraquat at 0.56 kg ai/ha. This yield was followed by that with oxyfluorfen at 0.56 kg ai/ha followed by the bentazon and fluazifop mixture. Napropamide at 2.24 kg ai/ha as a pre-plant, followed by three post-directed applications of paraquat at 0.56 kg ai/ha, ranked third. A good yield was also obtained with napropamide at 2.24 kg ai/ha as a pre-plant followed by the bentazon at 2.24 kg ai/ha and fluazifop at 0.42 kg ai/ha mixture. Yields did not differ significantly among these treatments.
Translated from the Spanish abstract: Two herbicide experiments on sweet peppers were conducted in a Fraternidad soil at the Lajas Agricultural Experiment Substation from 1989 to 1990. In the first experiment, clomazone at 1.68 and 3.36 kg ai/ha applied pre-plant and incorporated, as well as fluazifop-P at 0.42 and 0.84 kg ai/ha applied postemergence, suppressed grasses effectively for up to 6 weeks after transplanting. Oxyfluorfen at 0.20 kg ai/ha as a pre-plant followed by the bentazon mixture at 1.12 kg ai/ha provided excellent control of most weeds. The highest pepper yield was obtained with hand weeding. Next was the treatment with oxyfluorfen at 0.20 kg ai/ha as a pre-plant + the mixture of bentazon at 1.12 kg ai/ha and fluazifop-P at 0.42 kg ai/ha as a postemergence. Fluazifop-P at 0.42 kg ai/ha + one supplementary hand weeding ranked third in yield. There were no statistically significant differences (0.05 probability level) among these three treatments. In the second experiment, clomazone at 2.24 kg ai/ha as a pre-plant followed by paraquat at 0.56 kg ai/ha (three directed applications) gave excellent control of most weeds. Oxyfluorfen at 0.56 kg ai/ha as a pre-plant followed by the mixture of bentazon at 2.24 kg ai/ha and fluazifop-P at 0.42 kg ai/ha as a postemergence gave efficient control. The highest yield was obtained with clomazone at 2.24 kg ai/ha as a pre-plant followed by the directed application of paraquat at 0.56 kg ai/ha. Next was oxyfluorfen at 0.56 kg ai/ha. Napropamide at 2.24 kg ai/ha followed by paraquat at 0.56 kg ai/ha ranked third. Napropamide at 2.24 kg ai/ha followed by the mixture of bentazon at 2.24 kg ai/ha and fluazifop-P at 0.42 kg ai/ha ranked fourth. These four treatments did not differ significantly (P = 0.05) in yield.

    The history of the CATH structural classification of protein domains

    This article presents a historical review of the protein structure classification database CATH. Together with the SCOP database, CATH remains comprehensive and reasonably up to date with the more than 100,000 protein structures now in the PDB. We review the expansion of the CATH and SCOP resources to capture predicted domain structures in genome sequence data and to provide information on the likely functions of proteins mediated by their constituent domains. The establishment of comprehensive function annotation resources has also meant that domain families can be functionally annotated, allowing insights into functional divergence and evolution within protein families.

    CATHEDRAL: A Fast and Effective Algorithm to Predict Folds and Domain Boundaries from Multidomain Protein Structures

    We present CATHEDRAL, an iterative protocol for determining the location of previously observed protein folds in novel multidomain protein structures. CATHEDRAL builds on the features of a fast secondary-structure–based method (using graph theory) to locate known folds within a multidomain context and a residue-based, double-dynamic programming algorithm, which is used to align members of the target fold groups against the query protein structure to identify the closest relative and assign domain boundaries. To increase the fidelity of the assignments, a support vector machine is used to provide an optimal scoring scheme. Once a domain is verified, it is excised, and the search protocol is repeated in an iterative fashion until all recognisable domains have been identified. We have performed an initial benchmark of CATHEDRAL against other publicly available structure comparison methods using a consensus dataset of domains derived from the CATH and SCOP domain classifications. CATHEDRAL shows superior performance in fold recognition and alignment accuracy when compared with many equivalent methods. If a novel multidomain structure contains a known fold, CATHEDRAL will locate it in 90% of cases, with <1% false positives. For nearly 80% of assigned domains in a manually validated test set, the boundaries were correctly delineated within a tolerance of ten residues. For the remaining cases, previously classified domains were very remotely related to the query chain so that embellishments to the core of the fold caused significant differences in domain sizes and manual refinement of the boundaries was necessary. To put this performance in context, a well-established sequence method based on hidden Markov models was only able to detect 65% of domains, with 33% of the subsequent boundaries assigned within ten residues. 
Since, on average, 50% of newly determined protein structures contain more than one domain unit, and typically 90% or more of these domains are already classified in CATH, CATHEDRAL will considerably facilitate the automation of protein structure classification.
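
The boundary criterion quoted in the abstract above (boundaries correct within a tolerance of ten residues) can be expressed as a small check. This is only a sketch of one plausible reading, not the CATHEDRAL benchmark code: it assumes each domain is a single contiguous `(start, end)` segment, that predicted and reference domains are paired in sequence order, and the function name and residue numbers are hypothetical.

```python
def boundaries_within_tolerance(predicted, reference, tolerance=10):
    """Check whether predicted domain boundaries match a reference.

    Each argument is a list of (start, end) residue positions, one pair
    per domain. Domains are paired after sorting by position, and every
    start and end must lie within `tolerance` residues of its reference.
    Assumption: domains are contiguous segments; discontinuous domains
    would need a richer representation.
    """
    if len(predicted) != len(reference):
        return False
    return all(
        abs(p_start - r_start) <= tolerance and abs(p_end - r_end) <= tolerance
        for (p_start, p_end), (r_start, r_end) in zip(sorted(predicted), sorted(reference))
    )


# Hypothetical two-domain chain: reference boundaries versus a prediction.
reference = [(1, 120), (121, 260)]
prediction = [(1, 127), (128, 260)]
is_correct = boundaries_within_tolerance(prediction, reference)
```

A stricter or looser tolerance can be passed as the third argument to see how sensitive a benchmark figure is to the chosen cut-off.
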