
    EuGene: An Automated Integrative Gene Finder for Eukaryotes and Prokaryotes

    EuGene is an integrative gene finder applicable to both prokaryotic and eukaryotic genomes. EuGene annotated its first genome in 1999. Starting from genomic DNA sequences representing a complete genome, EuGene is able to predict the major transcript units in the genome from a variety of sources of information: statistical information, similarities with known transcripts and proteins, but also any GFF3 structured information supporting the presence or absence of specific types of elements. EuGene has been used to find genes in the plants Arabidopsis thaliana, Medicago truncatula, and Theobroma cacao; tomato, sunflower, and Rosa genomes; and in the nematode Meloidogyne incognita genome, among many others. The large fraction of plants in this list probably influenced EuGene development, especially its capacity to withstand a genome with a large number of repeated regions and transposable elements. Depending on the sources of information used for prediction, EuGene can be considered as purely ab initio, purely similarity based, or hybrid. With the general availability of NGS-transcribed sequence data in genome projects, EuGene adopts a default hybrid behavior that strongly relies on similarity information. Initially targeted at eukaryotic genomes, EuGene has also been extended to offer integrative gene prediction for bacteria, allowing for richer and more robust predictions than either purely statistical or homology-based prokaryotic gene finders. This text has been written as a practical guide that will give you the capacity to train and execute EuGene on your favorite eukaryotic genome. As the prokaryotic case is simpler and has already been described, only the main differences from the eukaryotic version are reported.

    Vertebrate Genome Size and the Impact of Transposable Elements in Genome Evolution

    In eukaryotes, the haploid DNA content (C-value) varies widely across lineages without an apparent correlation with the complexity of organisms. This incongruity has been called the C-value paradox and has been solved by demonstrating that not all DNA is constituted by genes but, on the contrary, most of it is made up of repetitive DNA. In vertebrates, the increasing number of sequenced genomes has shown that differences in genome size between lineages are ascribable to a variation in transposon content. These mobile elements, previously perceived as “junk DNA” or “selfish DNA,” are now recognized as the major players in shaping genomes. During vertebrate evolution, transposable elements have been repeatedly co-opted and exapted to generate regulatory sequences, coding exons, or entirely new genes that lead to evolutionary advantages for the host. Moreover, transposable elements are also responsible for substantial rearrangements such as insertions, deletions, inversions, and duplications potentially associated with, or following, speciation events.