188 research outputs found

    Considering Transposable Element Diversification in De Novo Annotation Approaches

    Transposable elements (TEs) are mobile, repetitive DNA sequences that are almost ubiquitous in prokaryotic and eukaryotic genomes. They have a large impact on genome structure, function and evolution. With the recent development of high-throughput sequencing methods, many genome sequences have become available, making possible comparative studies of TE dynamics at an unprecedented scale. Several methods have been proposed for the de novo identification of TEs in sequenced genomes. Most begin with the detection of genomic repeats, but the subsequent steps for defining TE families differ. High-quality TE annotations are available for the Drosophila melanogaster and Arabidopsis thaliana genome sequences, providing a solid basis for the benchmarking of such methods. We compared the performance of specific algorithms for the clustering of interspersed repeats and found that only a particular combination of algorithms detected TE families with good recovery of the reference sequences. We then applied a new procedure for reconciling the different clustering results and classifying TE sequences. The whole approach was implemented in a pipeline using the REPET package. Finally, we show that our combined approach highlights the dynamics of well defined TE families by making it possible to identify structural variations among their copies. This approach makes it possible to annotate TE families and to study their diversification in a single analysis, improving our understanding of TE dynamics at the whole-genome scale and for diverse species

    Reconstruction of ancestral chromosome architecture and gene repertoire reveals principles of genome evolution in a model yeast genus

    Reconstructing genome history is complex but necessary to reveal quantitative principles governing genome evolution. Such reconstruction requires recapitulating into a single evolutionary framework the evolution of genome architecture and gene repertoire. Here, we reconstructed the genome history of the genus Lachancea that appeared to cover a continuous evolutionary range from closely related to more diverged yeast species. Our approach integrated the generation of a high-quality genome data set; the development of AnChro, a new algorithm for reconstructing ancestral genome architecture; and a comprehensive analysis of gene repertoire evolution. We found that the ancestral genome of the genus Lachancea contained eight chromosomes and about 5173 protein-coding genes. Moreover, we characterized 24 horizontal gene transfers and 159 putative gene creation events that punctuated species diversification. We retraced all chromosomal rearrangements, including gene losses, gene duplications, chromosomal inversions and translocations at single gene resolution. Gene duplications outnumbered losses and balanced rearrangements with 1503, 929, and 423 events, respectively. Gene content variations between extant species are mainly driven by differential gene losses, while gene duplications remained globally constant in all lineages. Remarkably, we discovered that balanced chromosomal rearrangements could be responsible for up to 14% of all gene losses by disrupting genes at their breakpoints. Finally, we found that nonsynonymous substitutions reached fixation at a coordinated pace with chromosomal inversions, translocations, and duplications, but not deletions. Overall, we provide a granular view of genome evolution within an entire eukaryotic genus, linking gene content, chromosome rearrangements, and protein divergence into a single evolutionary framework

    Mode and tempo of gene and genome evolution in plants


    Evolution of mammalian genome architecture through retrotransposition

    Retrotransposons, mobile DNA elements that replicate via a copy and paste mechanism, are a major component of mammalian genome architecture. They account for at least one-third of the human genome and are major drivers of lineage-specific gain and loss of DNA. While there are many examples of how specific retrotransposons have impacted evolution, their interaction with large-scale genome architecture remains poorly characterised. Throughout my thesis I investigated two fundamental questions regarding genome evolution and retrotransposons. Firstly, how does genome architecture shape retrotransposon accumulation? Secondly, how does retrotransposon accumulation in turn impact on genome architecture? The current model of retrotransposon accumulation largely relies on local sequence composition. However, this model fails to account for genome-wide chromatin structure, an important factor that regulates DNA accessibility to insertion machinery. By analysing retrotransposon accumulation at open chromatin sites I showed that genome structure strongly associates with retrotransposon accumulation patterns. In addition, by mapping retrotransposon accumulation patterns of non-human mammals back to human, I was able to observe large-scale positional conservation of lineage-specific retrotransposons. These findings suggest that through conservation of synteny, gene regulation and nuclear organisation, retrotransposon accumulation in mammalian genomes follows similar evolutionary trajectories. Beneath the conserved structural framework of mammalian genomes there exists a high degree of lineage-specific turnover of DNA. Outside of whole genome duplication, retrotransposons are the largest contributing factor to genome growth. In contrast to this, accumulation of retrotransposons can also increase the probability of unequal crossing over causing DNA loss through large deletion events. 
Using multiple pairwise alignments I calculated regional levels of lineage-specific DNA gain and loss in the human and mouse genomes. I found that while lineage-specific DNA loss overlapped with open chromatin regions in both genomes, different sources for lineage-specific DNA gain drove divergence in genome architecture. These findings reveal the turbulent nature of lineage-specific evolution of large-scale genome architecture, ultimately questioning the evolutionary stability of structural chromosomal domains. In addition to analysing large-scale genome architecture I performed two separate analyses on retrotransposons in the bovine genome. Due to the presence of BovB retrotransposons, the bovine retrotransposon landscape is clearly distinct from other placental mammals. For the first analysis, I identified bovine-specific retrotransposon associated gene coexpression networks. Following the genomic distribution of bovine retrotransposons, my results show that gene expression strongly associates with genome architecture. For the second analysis, I characterised retrotransposons surrounding tandem duplicate copies of the bovine NK-lysin gene. My results were consistent with retrotransposon accumulation causing genomic rearrangements via non-allelic homologous recombination. Altogether, my thesis reveals hidden interactions between retrotransposon accumulation and mammalian genome structure and function. By re-purposing publicly available datasets I have characterised various aspects of the complex co-evolutionary relationships between retrotransposons and the genomes in which they reside. Thesis (Ph.D.) -- University of Adelaide, School of Biological Sciences, 201

    Striking structural dynamism and nucleotide sequence variation of the transposon Galileo in the genome of Drosophila mojavensis

    Background: Galileo is a transposable element responsible for the generation of three chromosomal inversions in natural populations of Drosophila buzzatii. Although the most characteristic feature of Galileo is the long internally-repetitive terminal inverted repeats (TIRs), which resemble the Drosophila Foldback element, its transposase-coding sequence has led to its classification as a member of the P-element superfamily (Class II, subclass 1, TIR order). Furthermore, Galileo has a wide distribution in the genus Drosophila, since it has been found in 6 of the 12 Drosophila sequenced genomes. Among these species, D. mojavensis, the one closest to D. buzzatii, presented the highest diversity in sequence and structure of Galileo elements. Results: In the present work, we carried out a thorough search and annotation of all the Galileo copies present in the D. mojavensis sequenced genome. In our set of 170 Galileo copies we have detected 5 Galileo subfamilies (C, D, E, F, and X) with different structures ranging from nearly complete, to only 2 TIR or solo TIR copies. Finally, we have explored the structural and length variation of the Galileo copies that point out the relatively frequent rearrangements within and between Galileo elements. Different mechanisms responsible for these rearrangements are discussed. Conclusions: Although Galileo is a transposable element with an ancient history in the D. mojavensis genome, our data indicate a recent transpositional activity. Furthermore, the dynamism in sequence and structure, mainly affecting the TIRs, suggests an active exchange of sequences among the copies. This exchange could lead to new subfamilies of the transposon, which could be crucial for the long-term survival of the element in the genome

    How life changes itself: The Read–Write (RW) genome


    The sterlet sturgeon genome sequence and the mechanisms of segmental rediploidization.

    Sturgeons seem to be frozen in time. The archaic characteristics of this ancient fish lineage place it in a key phylogenetic position at the base of the ~30,000 modern teleost fish species. Moreover, sturgeons are notoriously polyploid, providing unique opportunities to investigate the evolution of polyploid genomes. We assembled a high-quality chromosome-level reference genome for the sterlet, Acipenser ruthenus. Our analysis revealed a very low protein evolution rate that is at least as slow as in other deep branches of the vertebrate tree, such as that of the coelacanth. We uncovered a whole-genome duplication that occurred in the Jurassic, early in the evolution of the entire sturgeon lineage. Following this polyploidization, the rediploidization of the genome included the loss of whole chromosomes in a segmental deduplication process. While known adaptive processes helped conserve a high degree of structural and functional tetraploidy over more than 180 million years, the reduction of redundancy of the polyploid genome seems to have been remarkably random

    The Evolution and Adaptive Effects of Transposable Elements in Birds and Elapids

    Transposable elements (TEs) are genetic sequences able to copy or move themselves across their host genome. As TEs move within their host they can act as a source of genetic novelty, and hence are often described as "drivers of evolution". This novelty includes contributing or altering regulatory and coding regions, and promoting non-allelic homologous recombination and, in turn, major structural rearrangements. In some cases, TEs can further contribute to genomic change by jumping between organisms in a process known as horizontal transposon transfer (HTT). HTT is the passing of TEs between organisms by means other than parent to offspring, and has been well described across vertebrates, with multiple events noted in both birds and squamates. Birds are the most diverse class of reptiles, encompassing over 10,000 species; however, studies of TE evolution in birds have focused on single lineages. Early findings from the chicken genome led to the assumption that avian TEs are largely stable and inactive. More recent studies have similarly focused on single lineages of birds, revealing some variation in TE activity across birds. In contrast to birds, few studies have explored the evolution of TEs in squamates (lizards and snakes) at a class or family level, instead examining their evolution either across the order or comparing two long diverged species. As such, it is unknown whether patterns seen across all squamates occur at shorter time scales. At lower levels many squamate families are highly diverse, rapidly adapting to new environments and ecological niches. One such family is Hydrophiinae, a family of elapid snakes containing ~100 terrestrial snakes, ~60 marine sea snakes and 6 amphibious sea kraits. In this thesis I investigate the evolution of TEs in two diverse groups of reptiles: birds and Australo-Melanesian elapid snakes (Hydrophiinae).
I provide the first comprehensive study of TE activity across all orders of birds, focusing on the dominant superfamily, Chicken Repeat 1 (CR1) retrotransposons. By performing comparative genomic analyses I have identified significant variation in the rate of TE expansion both between and within avian orders. Clades including parrots, kiwis and waterfowl show high diversity and large, recent expansions of CR1 retrotransposons, while in various ratites and songbirds CR1s have been nearly inactive for tens of millions of years. The rest of the chapters focus on the evolution of TEs in hydrophiines, finding repeated HTT events into marine hydrophiines from other marine organisms. TEs in hydrophiines that were acquired via HTT appear to have played a role in their adaptation to the marine environment, with insertions found throughout regulatory regions. In the sea kraits, one horizontally transferred TE has rapidly expanded to make up 8-12% of the sea krait genome in a timespan of just 15-25 million years, the fastest known expansion of TEs in amniotes following an HTT event. Together this thesis presents bioinformatic analyses of two diverse clades of reptiles, Aves and Hydrophiinae, finding that to truly understand TEs, their evolution and the potential adaptive effects they can cause, we must examine life on both a broad and fine scale. Thesis (Ph.D.) -- University of Adelaide, School of Biological Sciences, 202

    Gene expansion shapes genome architecture in the human pathogen Lichtheimia corymbifera: an evolutionary genomics analysis in the ancient terrestrial mucorales (Mucoromycotina)

    Lichtheimia species are the second most important cause of mucormycosis in Europe. To provide broader insights into the molecular basis of the pathogenicity-associated traits of the basal Mucorales, we report the full genome sequence of L. corymbifera and compare it to the genome of Rhizopus oryzae, the most common cause of mucormycosis worldwide. The genome assembly encompasses 33.6 MB and 12,379 protein-coding genes. This study reveals four major differences between the L. corymbifera genome and that of R. oryzae: (i) the presence of a highly elevated number of gene duplications which, unlike in R. oryzae, are not due to whole genome duplication (WGD), (ii) despite the relatively high incidence of introns, alternative splicing (AS) is not frequently observed for the generation of paralogs and in response to stress, (iii) the content of repetitive elements is strikingly low (<5%), (iv) L. corymbifera is typically haploid. Novel virulence factors were identified which may be involved in the regulation of the adaptation to iron limitation, e.g. LCor01340.1 encoding a putative siderophore transporter and LCor00410.1 involved in siderophore metabolism. Genes encoding the transcription factors LCor08192.1 and LCor01236.1, which are similar to GATA type regulators and to calcineurin regulated CRZ1, respectively, point to an involvement of the calcineurin pathway in the adaptation to iron limitation. Genes encoding MADS-box transcription factors are elevated up to 11 copies compared to the 1–4 copies usually found in other fungi. Further findings are: (i) a lower content of tRNAs, but unique codons in L. corymbifera, (ii) over 25% of the proteins are apparently specific for L. corymbifera, (iii) L. corymbifera contains only 2/3 of the proteases (known to be essential virulence factors) in comparison to R. oryzae. The number of secreted proteases, however, is roughly twice as high as in R. oryzae