92 research outputs found

    OrthoInspector: comprehensive orthology analysis and visual exploration

    Background: The accurate determination of orthology and inparalogy relationships is essential for comparative sequence analysis, functional gene annotation and evolutionary studies. Various methods have been developed based on simple BLAST all-versus-all pairwise comparisons and/or time-consuming phylogenetic tree analyses. Results: We have developed OrthoInspector, a new software system incorporating an original algorithm for the rapid detection of orthology and inparalogy relations between different species. In comparisons with existing methods, OrthoInspector improves detection sensitivity, with a minimal loss of specificity. In addition, several visualization tools have been developed to facilitate in-depth studies based on these predictions. The software has been used to study the orthology/inparalogy relationships for a large set of 940,855 protein sequences from 59 different eukaryotic species. Conclusion: OrthoInspector is a new software system for orthology/paralogy analysis. It is made available as an independent software suite that can be downloaded and installed for local use. Command line querying facilitates the integration of the software in high throughput processing pipelines and a graphical interface provides easy, intuitive access to results for the non-expert.

    Tex19 and Sectm1 concordant molecular phylogenies support co-evolution of both eutherian-specific genes

    Background: Transposable elements (TE) have attracted much attention since they shape the genome and contribute to species evolution. Organisms have evolved mechanisms to control TE activity. Testis expressed 19 (Tex19) represses TE expression in mouse testis and placenta. In the human and mouse genomes, Tex19 and Secreted and transmembrane 1 (Sectm1) are neighbors but are not homologs. Sectm1 is involved in immunity and its molecular phylogeny is unknown. Methods: Using multiple alignments of complete protein sequences (MACS), we inferred Tex19 and Sectm1 molecular phylogenies. Protein conserved regions were identified and folds were predicted. Finally, expression patterns were studied across tissues and species using RNA-seq public data and RT-PCR. Results: We present 2 high quality alignments of 58 Tex19 and 58 Sectm1 protein sequences from 48 organisms. First, both genes are eutherian-specific, i.e., exclusively present in mammals except monotremes (platypus) and marsupials. Second, Tex19 and Sectm1 have both duplicated in Sciurognathi and Bovidae while they have remained as single copy genes in all other placental mammals. Phylogenetic concordance between both genes was significant (p-value < 0.05) and supported co-evolution and a functional relationship. At the protein level, Tex19 exhibits 3 conserved regions and 4 invariant cysteines. In particular, a CXXC motif is present in the N-terminal conserved region. Sectm1 exhibits 2 invariant cysteines and an Ig-like domain. Strikingly, the Tex19 C-terminal conserved region was lost in Haplorrhini primates while a Sectm1 C-terminal extra domain was acquired. Finally, we have determined that Tex19 and Sectm1 expression levels anti-correlate across the testis of several primates (ρ = −0.72), which supports anti-regulation. Conclusions: Tex19 and Sectm1 co-evolution and anti-regulated expression support a strong functional relationship between both genes. Since Tex19 exerts control over TE and Sectm1 plays a role in immunity, Tex19 might suppress an immune response directed against cells that show TE activity in eutherian reproductive tissues.

    Genome-wide evidence for an essential role of the human Staf/ZNF143 transcription factor in bidirectional transcription

    In the human genome, ∼10% of the genes are arranged head to head so that their transcription start sites reside within <1 kbp on opposite strands. In this configuration, a bidirectional promoter generally drives expression of the two genes. How bidirectional expression is performed from these particular promoters constitutes a puzzling question. Here, by a combination of in silico and biochemical approaches, we demonstrate that hStaf/ZNF143 is involved in controlling expression from a subset of divergent gene pairs. The binding sites for hStaf/ZNF143 (SBS) are overrepresented in bidirectional versus unidirectional promoters. Chromatin immunoprecipitation assays with a significant set of bidirectional promoters containing putative SBS revealed that 93% of them are associated with hStaf/ZNF143. Expression of dual reporter genes directed by bidirectional promoters is dependent on SBS integrity and requires hStaf/ZNF143. Furthermore, in some cases, functional SBS are located in bidirectional promoters of gene pairs comprising a noncoding RNA gene and a protein-coding gene. Remarkably, hStaf/ZNF143 per se exhibits an inherently bidirectional transcription activity, and together our data provide the demonstration that hStaf/ZNF143 is indeed a transcription factor controlling the expression of divergent protein–protein and protein–non-coding RNA gene pairs.
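The head-to-head arrangement described in this abstract (transcription start sites on opposite strands, less than 1 kbp apart) can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Gene` record, the `head_to_head_pairs` function and the coordinates are hypothetical, and the sketch assumes a strictly divergent layout in which the minus-strand TSS lies at or before the plus-strand TSS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    strand: str  # "+" or "-"
    tss: int     # transcription start site coordinate (hypothetical values)


def head_to_head_pairs(genes, max_distance=1000):
    """Return sorted (minus_gene, plus_gene) name pairs that are head to head.

    Assumption: a pair qualifies when both genes are on the same chromosome,
    on opposite strands, the minus-strand TSS is at or before the plus-strand
    TSS (divergent transcription), and the TSSs are < max_distance bp apart.
    """
    by_chrom = {}
    for gene in genes:
        by_chrom.setdefault(gene.chrom, []).append(gene)

    pairs = []
    for chrom_genes in by_chrom.values():
        minus = [g for g in chrom_genes if g.strand == "-"]
        plus = [g for g in chrom_genes if g.strand == "+"]
        for m in minus:
            for p in plus:
                if 0 <= p.tss - m.tss < max_distance:
                    pairs.append((m.name, p.name))
    return sorted(pairs)


example_genes = [
    Gene("A", "chr1", "-", 1000),
    Gene("B", "chr1", "+", 1500),  # 500 bp downstream of A's TSS: head to head
    Gene("C", "chr1", "+", 5000),
    Gene("F", "chr1", "-", 5200),  # TSS lies after C's TSS: not divergent
    Gene("D", "chr2", "-", 100),
    Gene("E", "chr2", "-", 400),   # same strand as D: never a pair
]
print(head_to_head_pairs(example_genes))
```

The nested loop is quadratic per chromosome, which is acceptable for a small illustration; a genome-scale scan would more likely sort TSSs by coordinate and compare neighbours only.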

    Detecting the molecular scars of evolution in the Mycobacterium tuberculosis complex by analyzing interrupted coding sequences

    Background: Computer-assisted analyses have shown that all bacterial genomes contain a small percentage of open reading frames with a frameshift or in-frame stop codon. We report here a comparative analysis of these interrupted coding sequences (ICDSs) in six isolates of M. tuberculosis, two of M. bovis and one of M. africanum and question their phenotypic impact and evolutionary significance. Results: ICDSs were classified as "common to all strains" or "strain-specific". Common ICDSs are believed to result from mutations acquired before the divergence of the species, whereas strain-specific ICDSs were acquired after this divergence. Comparative analyses of these ICDSs therefore define the molecular signature of a particular strain, phylogenetic lineage or species, which may be useful for inferring phenotypic traits such as virulence and molecular relationships. For instance, in silico analysis shows that the W-Beijing lineage of M. tuberculosis, an emergent family involved in several outbreaks, is readily distinguishable from other phyla by its smaller number of common ICDSs, including at least one known to be associated with virulence. Our observation was confirmed through the sequencing analysis of ICDSs in a panel of 21 clinical M. tuberculosis strains. This analysis further illustrates the divergence of the W-Beijing lineage from other phyla in terms of the number of full-length ORFs not containing a frameshift. We further show that ICDS formation is not associated with the presence of a mutated promoter, and suggest that promoter extinction is not the main cause of pseudogene formation. Conclusion: The correlation between ICDSs, function and phenotypes could have important evolutionary implications. This study provides population geneticists with a list of targets, which could undergo selective pressure and thus alter relationships between the various lineages of M. tuberculosis strains and their host. This approach could be applied to any closely related bacterial strains or species for which several genome sequences are available.

    ICDS database: interrupted CoDing sequences in prokaryotic genomes

    Unrecognized frameshifts, in-frame stop codons and sequencing errors lead to Interrupted CoDing Sequences (ICDS) that can seriously affect all subsequent steps of functional characterization, from in silico analysis to high-throughput proteomic projects. Here, we describe the Interrupted CoDing Sequence database containing ICDS detected by a similarity-based approach in 80 complete prokaryotic genomes. ICDS can be retrieved by species browsing or similarity searches via a web interface. The definition of each interrupted gene is provided as well as the ICDS genomic localization with the surrounding sequence. Furthermore, to facilitate the experimental characterization of ICDS, we propose optimized primers for re-sequencing purposes. The database will be regularly updated with additional data from ongoing sequenced genomes. Our strategy has been validated by three independent tests: (i) ICDS prediction on a benchmark of artificially created frameshifts, (ii) comparison of predicted ICDS and results obtained from the comparison of the two genomic sequences of Bacillus licheniformis strain ATCC 14580 and (iii) re-sequencing of 25 predicted ICDS of the recently sequenced genome of Mycobacterium smegmatis. This allows us to estimate the specificity and sensitivity (95 and 82%, respectively) of our program and the efficiency of primer determination.
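To make the notion of an in-frame stop codon concrete, here is a minimal sketch that scans a coding sequence in reading frame 0 and reports premature stops. It is not the similarity-based method the abstract describes: the function name `in_frame_stops` and the example sequences are hypothetical, and the sketch assumes the standard stop codons (TAA, TAG, TGA), which do not hold for every prokaryotic genetic code.

```python
# Assumption: standard stop codons; some genetic codes reassign TGA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(cds):
    """Return 0-based nucleotide offsets of premature stop codons in frame 0.

    The final complete codon is skipped because a stop there is the normal
    end of the coding sequence. Trailing bases that do not form a full codon
    are ignored.
    """
    seq = cds.upper()
    n_codons = len(seq) // 3
    return [
        i * 3
        for i in range(n_codons - 1)
        if seq[i * 3:i * 3 + 3] in STOP_CODONS
    ]


# Codons: ATG AAA TAG GGG TAA, so the TAG at offset 6 interrupts the frame.
print(in_frame_stops("ATGAAATAGGGGTAA"))
# Codons: ATG AAA TAA, so only the terminal stop is present.
print(in_frame_stops("ATGAAATAA"))
```

A scan like this only flags stops in the annotated frame; detecting frameshifts requires comparison against homologous sequences, which is where a similarity-based approach comes in.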

    Insights into metazoan evolution from Alvinella pompejana cDNAs.

    BACKGROUND: Alvinella pompejana is a representative of Annelids, a key phylum for evo-devo studies that is still poorly studied at the sequence level. A. pompejana inhabits deep-sea hydrothermal vents and is currently known as one of the most thermotolerant Eukaryotes in marine environments, withstanding the largest known chemical and thermal ranges (from 5 to 105°C). This tube-dwelling worm forms dense colonies on the surface of hydrothermal chimneys and can withstand long periods of hypo/anoxia and long phases of exposure to hydrogen sulphides. A. pompejana specifically inhabits chimney walls of hydrothermal vents on the East Pacific Rise. To survive, Alvinella has developed numerous adaptations at the physiological and molecular levels, such as an increase in the thermostability of proteins and protein complexes. It represents an outstanding model organism for studying adaptation to harsh physicochemical conditions and for isolating stable macromolecules resistant to high temperatures. RESULTS: We have constructed four full length enriched cDNA libraries to investigate the biology and evolution of this intriguing animal. Analysis of more than 75,000 high quality reads led to the identification of 15,858 transcripts and 9,221 putative protein sequences. Our annotation reveals a good coverage of most animal pathways and networks with a prevalence of transcripts involved in oxidative stress resistance, detoxification, anti-bacterial defence, and heat shock protection. Alvinella proteins seem to show a slow evolutionary rate and a higher similarity with proteins from Vertebrates compared to proteins from Arthropods or Nematodes. Their composition shows enrichment in positively charged amino acids that might contribute to their thermostability. The gene content of Alvinella reveals that an important pool of genes previously considered to be specific to Deuterostomes were in fact already present in the last common ancestor of the Bilaterian animals, but have been secondarily lost in model invertebrates. This pool is enriched in glycoproteins that play a key role in intercellular communication, hormonal regulation and immunity. CONCLUSIONS: Our study starts to unravel the gene content and sequence evolution of a deep-sea annelid, revealing key features in eukaryote adaptation to extreme environmental conditions and highlighting the proximity of Annelids and Vertebrates.

    Toward community standards in the quest for orthologs

    The identification of orthologs (gene pairs descended from a common ancestor through speciation, rather than duplication) has emerged as an essential component of many bioinformatics applications, ranging from the annotation of new genomes to experimental target prioritization. Yet, the development and application of orthology inference methods is hampered by the lack of consensus on source proteomes, file formats and benchmarks. The second ‘Quest for Orthologs’ meeting brought together stakeholders from various communities to address these challenges. We report on achievements and outcomes of this meeting, focusing on topics of particular relevance to the research community at large. The Quest for Orthologs consortium is an open community that welcomes contributions from all researchers interested in orthology research and applications.

    Maladie d'Alzheimer et stress oxydant (implications et perspectives thérapeutiques) [Alzheimer's disease and oxidative stress (implications and therapeutic perspectives)]

    No full text
    RENNES1-BU Santé (352382103) / Sudoc, France