66 research outputs found

    A rigorous method for multigenic families' functional annotation: the peptidyl arginine deiminase (PADs) proteins family example

    BACKGROUND: Large-scale, reliable functional annotation of proteins is a major challenge in modern biology. Phylogenetic analyses have been shown to be important for such tasks. However, up to now, phylogenetic annotation has not taken expression data (e.g. ESTs, microarrays, SAGE) into account. Integrating such data, for example ESTs, into phylogenetic annotation could therefore be a major advance in post-genomic analyses. We developed an approach that combines expression data with phylogenetic analysis. To illustrate our method, we used an example protein family, the peptidyl arginine deiminases (PADs), which are probably implicated in rheumatoid arthritis. RESULTS: The analysis was performed as follows. We built a phylogeny of PAD proteins from the NCBI NR protein database. We completed the phylogenetic reconstruction of PADs using an enlarged sequence database containing translations of EST contigs. We then extracted all corresponding expression data contained in the EST database. This analysis allowed us (1) to extend the range of species containing homologs and to improve the reconstruction of the genes' evolutionary history; (2) to deduce an accurate gene expression pattern for each member of this protein family; and (3) to show a correlation between the evolutionary rate of paralogous sequences and their tissue expression pattern. CONCLUSION: Coupling phylogenetic reconstruction with expression data is a promising approach that could be applied to all multigenic families to investigate the relationship between molecular and transcriptional evolution and to improve functional annotation.

    PhyloPattern: regular expressions to identify complex patterns in phylogenetic trees

    Background: To effectively apply evolutionary concepts in genome-scale studies, large numbers of phylogenetic trees have to be automatically analysed, at a level approaching human expertise. Complex architectures must be recognized within the trees, so that associated information can be extracted. Results: Here, we present a new software library, PhyloPattern, for automating tree manipulations and analysis. PhyloPattern includes three main modules, which address essential tasks in high-throughput phylogenetic tree analysis: node annotation, pattern matching, and tree comparison. PhyloPattern thus allows the programmer to focus on: i) the use of predefined or user-defined annotation functions to perform immediate or deferred evaluation of node properties, ii) the search for user-defined patterns in large phylogenetic trees, iii) the pairwise comparison of trees by dynamically generating patterns from one tree and applying them to the other. Conclusion: PhyloPattern greatly simplifies and accelerates the work of the computer scientist in the evolutionary biology field. The library has been used to automatically identify phylogenetic evidence for domain shuffling or gene loss events in the evolutionary histories of protein sequences. However, any workflow that relies on phylogenetic tree analysis could be automated with PhyloPattern.

    GLADX: An Automated Approach to Analyze the Lineage-Specific Loss and Pseudogenization of Genes

    A well-established ancestral gene can usually be found, in one or multiple copies, in different descendant species. Sometimes during the course of evolution, all the representatives of a well-established ancestral gene disappear in specific lineages; such gene losses may occur in the genome by deletion of a DNA fragment or by pseudogenization. The loss of an entire gene family in a given lineage may reflect an important phenomenon, and could be due either to adaptation or to a relaxation of selection that leads to neutral evolution. Lineage-specific gene loss analyses are therefore important for understanding the evolutionary history of genes and genomes. To perform this kind of study on the increasing number of complete genome sequences available, we developed a new software module called GLADX in the DAGOBAH framework, based on a comparative genomic approach. The software automatically detects, for all the species of a phylum, the presence or absence of a representative of a well-established ancestral gene and, through systematic re-annotation steps, confirms losses, detects and analyzes pseudogenes, and finds novel genes. The approach is based on highly reliable gene phylogenies, on protein predictions, and on the analysis of genomic mutations. All the evidence associated with the evolutionary approach provides accurate information for building an overall view of the evolution of a given gene in a selected phylum. The reliability of GLADX has been successfully tested on a benchmark analysis of 14 reported cases. It is the first tool able to study lineage-specific losses and pseudogenizations fully automatically. GLADX is available at http://ioda.univ-provence.fr/IodaSite/gladx/

    The genome of the white-rot fungus Pycnoporus cinnabarinus : a basidiomycete model with a versatile arsenal for lignocellulosic biomass breakdown

    Background: Saprophytic filamentous fungi are ubiquitous micro-organisms that play an essential role in photosynthetic carbon recycling. The wood-decayer Pycnoporus cinnabarinus is a model fungus for the study of plant cell wall decomposition and is used for a number of applications in green and white biotechnology. Results: The 33.6 megabase genome of P. cinnabarinus was sequenced and assembled, and the 10,442 predicted genes were functionally annotated using a phylogenomic procedure. In-depth analyses were carried out for the numerous enzyme families involved in lignocellulosic biomass breakdown, for protein secretion and glycosylation pathways, and for mating type. The P. cinnabarinus genome sequence revealed a consistent repertoire of genes shared with wood-decaying basidiomycetes. P. cinnabarinus is thus fully equipped with the classical families involved in cellulose and hemicellulose degradation, whereas its pectinolytic repertoire appears relatively limited. In addition, P. cinnabarinus possesses a complete versatile enzymatic arsenal for lignin breakdown. We identified several genes encoding members of the three ligninolytic peroxidase types, namely lignin peroxidase, manganese peroxidase and versatile peroxidase. Comparative genome analyses were performed in fungi displaying different nutritional strategies (white-rot and brown-rot modes of decay). P. cinnabarinus presents a typical distribution of all the specific families found in the white-rot lifestyle. Growth profiling of P. cinnabarinus was performed on 35 carbon sources including simple and complex substrates to study substrate utilization and preferences. P. cinnabarinus grew faster on crude plant substrates than on pure, mono- or polysaccharide substrates. Finally, proteomic analyses were conducted from liquid and solid-state fermentation to analyze the composition of the secretomes corresponding to growth on different substrates. The distribution of lignocellulolytic enzymes in the secretomes was strongly dependent on growth conditions, especially for lytic polysaccharide mono-oxygenases. Conclusions: With its available genome sequence, P. cinnabarinus is now an outstanding model system for the study of the enzyme machinery involved in the degradation or transformation of lignocellulosic biomass. Microbial Biotechnolog

    CpG islands and HTF islands in the HLA class I region: investigation of the methylation status of class I genes leads to precise physical mapping of the HLA-B and -C genes.

    We have investigated the accessibility of the CpG-rich sequences (CpG islands) present in the 5' region of most, if not all, HLA class I genes to methylation-sensitive rare-cutter enzymes. We show that for the HLA-A, -B and -C genes and a few other (but not all) class I sequences these CpG islands are unmethylated and therefore constitute HTF islands (CpG-rich, unmethylated regions of DNA, usually associated with expressed genes). We then precisely map the HTF islands of the HLA-B and HLA-C genes and determine that they are separated by 130 kb (in agreement with genetic data) and that these two genes are in the same transcriptional orientation on the chromosome.

    Leukocyte Ig-like receptor complex (LRC) in mice and men

    Here, we compare the architecture of membrane receptors with extracellular Ig-like domains located within the leukocyte Ig-like receptor complex (LRC) of humans and mice. The receptors can be classified broadly into four groups, based on the homology of their Ig-like domains and gene architecture. Receptors in the first group are characterized by the presence of the Ig constant type 2-1 (IgC2-1) and variant Ig (vIg) domains, and include the leukocyte Ig-like receptors (LILRs) and murine paired Ig-activating receptors (PIRs). The second group of receptors possesses an IgC2-2 domain and comprises the killer-cell Ig-like receptors (KIRs) and platelet collagen receptor glycoprotein VI (GPVI). The third group consists of receptors with IgC2-1 and either IgC2-3 or IgC2-4 domains, and includes the receptor for IgA Fc (FCAR), NKp46 and murine Ly94. The fourth group, with a single extracellular IgC2-1 domain, consists of the leukocyte-associated Ig-like receptors (LAIRs). The genomic organization of these receptors and their domains, and the evolutionary associations between them, are examined.