
    The mitochondrial genome of Sinentomon erythranum (Arthropoda: Hexapoda: Protura): an example of highly divergent evolution

    Background: The phylogenetic position of the Protura, traditionally considered the most basal hexapod group, is disputed because it has many unique morphological characters compared with other hexapods. Although mitochondrial genome information has been used extensively in phylogenetic studies, such information is not available for the Protura. This has impeded phylogenetic studies of this taxon, as well as studies of the evolution of the arthropod mitochondrial genome. Results: In this study, the mitochondrial genome of Sinentomon erythranum was sequenced, the first proturan species to be reported. The genome contains a number of special features that differ from those of other hexapods and arthropods. It is a very small arthropod mitochondrial genome: its 14,491 nucleotides encode 37 typical mitochondrial genes. Compared with other metazoan mtDNA, it has the most biased nucleotide composition, with T = 52.4%, an extreme and reversed AT-skew of -0.351 and a GC-skew of 0.350. Two tandemly repeated regions occur in the A+T-rich region, and both could form stable stem-loop structures. Eighteen of the 22 tRNAs are greatly reduced in size, with truncated secondary structures. The gene order is novel among available arthropod mitochondrial genomes. Rearrangements have involved not only small tRNA genes but also protein-coding genes (PCGs) and ribosomal RNA genes. A large block of genes has undergone inversion and another nearby block has been reshuffled, which can be explained by the tandem duplication and random loss model. The most remarkable finding is that trnL2(UUR) is not located between cox1 and cox2, as observed in most hexapod and crustacean groups, but between rrnL and nad1, as in the ancestral arthropod ground pattern. The "cox1-cox2" pattern was further confirmed in three more representative proturan species. The phylogenetic analyses based on the amino acid sequences of 13 mitochondrial PCGs indicate that S. erythranum did not group with other hexapod groups. Conclusions: The mitochondrial genome of S. erythranum differs in many features from other hexapod and arthropod mitochondrial genomes. It underwent highly divergent evolution. The "cox1-cox2" pattern probably represents the ancestral state for all proturan mitogenomes, and suggests a long evolutionary history for the Protura.
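
The skew values quoted in this abstract follow the usual convention AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C), computed from base counts on one strand, so a negative AT-skew means more T than A. The following is a minimal sketch under that convention; the function name and the toy sequence are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def nucleotide_skews(seq: str) -> dict[str, float]:
    """Return AT-skew and GC-skew for one strand of a DNA sequence."""
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    if a + t == 0 or g + c == 0:
        raise ValueError("sequence needs at least one A/T and one G/C base")
    return {
        "at_skew": (a - t) / (a + t),  # negative when T outnumbers A
        "gc_skew": (g - c) / (g + c),  # positive when G outnumbers C
    }


# Toy sequence (hypothetical): 2 A, 3 T, 2 G, 1 C.
example = nucleotide_skews("AATTTGGC")
print(example)
```

Applied to a full mitogenome strand, the same function would yield the kind of values reported above, although the result depends on which strand is read.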

    Plastid trnF pseudogenes are present in Jaltomata, the sister genus of Solanum (Solanaceae) : molecular evolution of tandemly repeated structural mutations

    Extensive gene duplication arranged in a tandem array is rare in the plastome of embryophytes. Interestingly, we found pseudogene copies of the trnF gene in the genus Jaltomata, the sister genus of Solanum, where such gene duplication has been previously reported. In each available Jaltomata sequence we found two pseudogene copies in close 5’-proximity to the original functional gene. The size of each pseudogene copy ranged between 17 and 48 bp, and the anticodon domain was identified as the most conserved element. A common ATT(G)n motif is particularly interesting, and its modifications were found to border the 3’ end of the duplicated regions. Other motifs were partial residues, or entire parts, of the T- and D-domains, and both domains proved to be variable in length among the pseudogenes identified. The residues of the 3’ and 5’ acceptor stem were not found among the copies. We further compared the newly discovered copies from Jaltomata with those previously described from Solanum and inferred phylogenetic relationships of the aligned copies. The evolution of the Solanum copies, in contrast to those of Jaltomata, is hard to explain by parsimonious changes alone, since reticulate evolutionary patterns were detected among the copies. The dynamic evolutionary patterns of Solanum might be explained by inter- or intrachromosomal recombination.

    Analysis of the largest tandemly repeated DNA families in the human genome

    Background: Tandemly repeated DNA represents a large portion of the human genome, and accounts for a significant amount of copy number variation. Here we present a genome-wide analysis of the largest tandem repeats found in the human genome sequence. Results: Using Tandem Repeats Finder (TRF), tandem repeat arrays greater than 10 kb in total size were identified and classified into simple sequences (e.g. GAATG), classical satellites (e.g. alpha satellite DNA), and locus-specific VNTR arrays. Analysis of these large sequenced regions revealed that several "simple sequence" arrays actually showed complex domain and/or higher-order repeat organization. Using additional methods, we further identified 96 additional arrays with tandem repeat units greater than 2 kb (the detection limit of TRF), 53 of which contained genes or repeated exons. An array of tandem 12 kb repeats spanning a gap on chromosome 8 was found to be 600 kb to 1.7 Mbp in size, representing one of the largest non-centromeric arrays characterized. Several novel megasatellite tandem DNA families were observed that are characterized by repeating patterns of interspersed transposable elements that have expanded, presumably by unequal crossing over. One of these families is found on 11 different chromosomes in >25 arrays, and represents one of the largest and most widespread megasatellite DNA families. Conclusion: This study represents the most comprehensive genome-wide analysis of large tandem repeats in the human genome, and will serve as an important resource towards understanding the organization and copy number variation of these complex DNA families.
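
To make the idea of a tandem array concrete, the sketch below scans a sequence for stretches that repeat exactly with a fixed period. It is a toy illustration only and is not how Tandem Repeats Finder works: TRF tolerates mismatches and indels, while this function accepts exact copies only. The function name, parameters, and the toy sequence built around the GAATG unit are assumptions made for the example.

```python
def exact_tandem_arrays(seq: str, period: int, min_copies: int = 2):
    """Return (start, end, unit) for exact tandem arrays with the given period.

    `end` is exclusive and may include a trailing partial copy of the unit.
    """
    arrays = []
    n = len(seq)
    i = 0
    while i + period < n:
        j = i
        # A run of positions with seq[j] == seq[j + period] means the
        # sequence repeats itself with this period over that stretch.
        while j + period < n and seq[j] == seq[j + period]:
            j += 1
        total = (j - i) + period  # length of the periodic stretch
        if j > i and total >= period * min_copies:
            arrays.append((i, i + total, seq[i:i + period]))
        i = j + 1  # restart just past the mismatch (or the end of the scan)
    return arrays


# Toy sequence (hypothetical): three exact copies of GAATG with flanking bases.
hits = exact_tandem_arrays("TTGAATGGAATGGAATGCC", period=5)
print(hits)
```

A size filter such as the 10 kb threshold mentioned in the abstract would then amount to keeping only hits where `end - start` exceeds the cutoff.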

    Efficient algorithms for analyzing segmental duplications with deletions and inversions in genomes

    Background: Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics composed of multiple duplicated fragments. This complex genomic organization complicates analysis of the evolutionary history of these sequences. One model proposed to explain this mosaic pattern is a model of repeated aggregation and subsequent duplication of genomic sequences. Results: We describe a polynomial-time exact algorithm to compute duplication distance, a genomic distance defined as the most parsimonious way to build a target string by repeatedly copying substrings of a fixed source string. This distance models the process of repeated aggregation and duplication. We also describe extensions of this distance to include certain types of substring deletions and inversions. Finally, we provide a description of a sequence of duplication events as a context-free grammar (CFG). Conclusion: These new genomic distances will permit more biologically realistic analyses of segmental duplications in genomes.
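
The idea of building a target string from copies of source substrings can be illustrated with a deliberately restricted variant in which each copy may only be appended to the end of the growing string. This is not the polynomial-time algorithm the abstract describes, which may allow more general operations, and it ignores deletions and inversions. The function name is hypothetical. Under the append-only restriction, a greedy longest-match scan gives the minimum number of copies, because every prefix of a substring of the source is itself a substring of the source.

```python
from typing import Optional


def append_only_copy_count(source: str, target: str) -> Optional[int]:
    """Minimum number of substrings of `source` whose concatenation is `target`.

    Returns None if some character of `target` never occurs in `source`.
    Assumption: copies are only appended, never inserted mid-string.
    """
    copies = 0
    pos = 0
    while pos < len(target):
        end = pos
        # Extend the current piece while it is still a substring of source.
        while end < len(target) and target[pos:end + 1] in source:
            end += 1
        if end == pos:
            return None  # target[pos] does not occur in source at all
        copies += 1
        pos = end
    return copies


print(append_only_copy_count("abcd", "abcdab"))
print(append_only_copy_count("abcd", "acbd"))
```

The substring checks make this quadratic or worse in the worst case; a suffix automaton of the source would speed it up, but the simple form keeps the logic visible.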

    Approches algorithmiques pour l’inférence d’histoires de duplication en tandem avec inversions et délétions pour des familles multigéniques

    Tandemly arrayed genes (TAGs) represent an important fraction of most genomes. A fundamental mechanism at the origin of TAG clusters is unequal crossing-over during meiosis, leading to the duplication of chromosomal segments containing one or many adjacent genes. Such duplications are called tandem duplications, as the duplicated segment is placed next to the original one on the chromosome. Different algorithms have been proposed to infer the tandem duplication history of a TAG cluster. However, their applicability is limited in practice since they do not take into account other frequent evolutionary events such as inversion, inverted duplication and deletion. In this thesis, we propose different algorithmic approaches that integrate these evolutionary events into the original tandem duplication model of evolution. Our contributions are summarized as follows: • We integrate inversion events into a tandem duplication model restricted to single-gene duplications, and we propose an exact algorithm that computes the minimum number of inversions explaining the evolution of a TAG cluster. • We generalize this model to the study of orthologous TAG clusters in different species. • We propose an algorithm that infers the evolutionary history of a TAG cluster through tandem duplication, inverted duplication, inversion and deletion of chromosomal segments containing one or many adjacent genes.

    Evolution of ribosomal DNA-derived satellite repeat in tomato genome

    Background: Tandemly repeated DNA, also called satellite DNA, is a common feature of eukaryotic genomes. Satellite repeats can expand and contract dramatically, which may cause genome size variation among genetically related species. However, their origin and expansion mechanism remain unclear. Results: FISH analysis revealed that a satellite repeat showing homology with the intergenic spacer (IGS) of rDNA is present in the tomato genome. By comparing sequences representing distinct stages in the divergence of the rDNA repeat with those of canonical rDNA arrays, the molecular mechanism of the evolution of the satellite repeat is described. Comprehensive sequence analysis and phylogenetic analysis demonstrated that a long terminal repeat retrotransposon was inserted into each copy of the 18S rDNA and polymerized by recombination rather than by transposition via an RNA intermediate. The repeat was expanded through doubling the number of IGS into the 25S rRNA gene, and also by greatly increasing the copy number of the type I subrepeat in the IGS of 25-18S rDNA by segmental duplication. Homogenization to a single type of subrepeat in the satellite repeat was achieved by amplifying the copy number of the type I subrepeat while eliminating neighboring sequences, including the type II subrepeat and the rRNA coding sequence, from the array. FISH analysis revealed that the satellite repeats are commonly present in closely related Solanum species, but vary in their distribution and abundance among species. Conclusion: These results indicate that the dynamic satellite repeats originated from the intergenic spacer of the rDNA unit in the tomato genome. This result could serve as an example towards understanding the initiation and expansion of satellite repeats in complex eukaryotic genomes.

    Einkorn genomics sheds light on history of the oldest domesticated wheat

    Einkorn (Triticum monococcum) was the first domesticated wheat species, and was central to the birth of agriculture and the Neolithic Revolution in the Fertile Crescent around 10,000 years ago^{1,2}. Here we generate and analyse 5.2-Gb genome assemblies for wild and domesticated einkorn, including completely assembled centromeres. Einkorn centromeres are highly dynamic, showing evidence of ancient and recent centromere shifts caused by structural rearrangements. Whole-genome sequencing analysis of a diversity panel uncovered the population structure and evolutionary history of einkorn, revealing complex patterns of hybridizations and introgressions after the dispersal of domesticated einkorn from the Fertile Crescent. We also show that around 1% of the modern bread wheat (Triticum aestivum) A subgenome originates from einkorn. These resources and findings highlight the history of einkorn evolution and provide a basis to accelerate the genomics-assisted improvement of einkorn and bread wheat.

    Tandemly Arrayed Genes in Vertebrate Genomes

    Tandemly arrayed genes (TAGs) are duplicated genes that are linked as neighbors on a chromosome, many of which have important physiological and biochemical functions. Here we performed a survey of these genes in 11 available vertebrate genomes. TAGs account for an average of about 14% of all genes in these vertebrate genomes, and about 25% of all duplications. The majority of TAGs (72–94%) have parallel transcription orientation (i.e., they are encoded on the same strand), in contrast to the genome as a whole, in which about 50% of genes are in parallel transcription orientation. The majority of tandem arrays have only two members. In all species, the proportion of genes that belong to TAGs tends to be higher in large gene families than in small ones; together with our recent finding that tandem duplication played a more important role than retroposition in large families, this suggests that, among all types of duplication mechanisms, tandem duplication is the predominant one, especially in large families. Finally, several species have a higher proportion of large, species-specific tandem arrays than expected by chance.
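
The survey statistics above (share of genes in TAGs, share of parallel-oriented neighbors) can be sketched on a toy chromosome. The example assumes that each gene already carries a gene-family label and that only strictly adjacent same-family genes form an array; the paper's actual criteria (for instance, whether intervening genes are tolerated) are not stated in the abstract. The `Gene` class, function names, and data are all hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Gene:
    name: str
    family: str  # hypothetical gene-family label
    strand: str  # "+" or "-"


def tandem_arrays(chromosome: list[Gene]) -> list[list[Gene]]:
    """Group consecutive same-family genes; keep runs of two or more."""
    runs = (list(run) for _, run in groupby(chromosome, key=lambda g: g.family))
    return [run for run in runs if len(run) >= 2]


def parallel_fraction(arrays: list[list[Gene]]) -> float:
    """Fraction of neighboring gene pairs inside arrays on the same strand."""
    pairs = [(a, b) for arr in arrays for a, b in zip(arr, arr[1:])]
    if not pairs:
        return 0.0
    return sum(a.strand == b.strand for a, b in pairs) / len(pairs)


# Toy chromosome in gene order (hypothetical data).
chrom = [
    Gene("g1", "famA", "+"), Gene("g2", "famA", "+"), Gene("g3", "famB", "-"),
    Gene("g4", "famC", "+"), Gene("g5", "famC", "-"), Gene("g6", "famC", "-"),
    Gene("g7", "famA", "+"),
]
arrays = tandem_arrays(chrom)
tag_share = sum(len(arr) for arr in arrays) / len(chrom)
print(len(arrays), tag_share, parallel_fraction(arrays))
```

Note that `g7` belongs to the same family as `g1` and `g2` but is not counted, because it is not adjacent to them; a real survey would also need a rule for assigning genes to families in the first place.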