15 research outputs found

    AnchorWave: Sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism, and whole-genome duplication

    Millions of species are currently being sequenced, and their genomes are being compared. Many of them have more complex genomes than model systems and raise novel challenges for genome alignment. Widely used local alignment strategies often produce limited or incongruous results when applied to genomes with dispersed repeats, long indels, and highly diverse sequences. Moreover, alignment using many-to-many or reciprocal best hit approaches conflicts with well-studied patterns between species with different rounds of whole-genome duplication. Here, we introduce Anchored Wavefront alignment (AnchorWave), which performs whole-genome duplication–informed collinear anchor identification between genomes and performs base pair–resolved global alignment for collinear blocks using a two-piece affine gap cost strategy. This strategy enables AnchorWave to precisely identify multikilobase indels generated by transposable element (TE) presence/absence variants (PAVs). When aligning two maize genomes, AnchorWave successfully recalled 87% of previously reported TE PAVs. By contrast, other genome alignment tools showed low power for TE PAV recall. AnchorWave precisely aligns up to three times more of the genome as position matches or indels than the closest competitive approach when comparing diverse genomes. Moreover, AnchorWave recalls transcription factor–binding sites at a rate of 1.05- to 74.85-fold higher than other tools with significantly lower false-positive alignments. AnchorWave complements available genome alignment tools by showing obvious improvement when applied to genomes with dispersed repeats, active TEs, high sequence diversity, and whole-genome duplication variation.
    This project is supported by the United States Department of Agriculture Agricultural Research Service, NSF No. 1822330, NSF No. 1854828, the European Union's Horizon 2020 Framework Programme under the DeepHealth project [825111], the European Union Regional Development Fund within the framework of The European Regional Development Fund Operational Program of Catalonia 2014 to 2020 with a grant of 50% of total cost eligible under the DRAC project [001-P-001723], and National Natural Science Foundation of China No. 31900486. M.C.S. was supported by NSF Postdoctoral Research Fellowship in Biology No. 1907343. M.M. was partially supported by the Spanish Ministry of Economy, Industry, and Competitiveness under Ramón y Cajal (RYC) fellowship number RYC-2016-21104.
    Peer Reviewed. Postprint (published version)

    The genomic ecosystem of transposable elements in maize

    Transposable elements (TEs) constitute the majority of flowering plant DNA, reflecting their tremendous success in subverting, avoiding, and surviving the defenses of their host genomes to ensure their selfish replication. More than 85% of the sequence of the maize genome can be ascribed to past transposition, providing a major contribution to the structure of the genome. Evidence from individual loci has informed our understanding of how transposition has shaped the genome, and a number of individual TE insertions have been causally linked to dramatic phenotypic changes. Genome-wide analyses in maize and other taxa have frequently represented TEs as a relatively homogeneous class of fragmentary relics of past transposition, obscuring their evolutionary history and interaction with their host genome. Using an updated annotation of structurally intact TEs in the maize reference genome, we investigate the family-level dynamics of TEs in maize. Integrating a variety of data, from descriptors of individual TEs like coding capacity, expression, and methylation, as well as similar features of the sequence they inserted into, we model the relationship between attributes of the genomic environment and the survival of TE copies and families. In contrast to the wholesale relegation of all TEs to a single category of junk DNA, these differences reveal a diversity of survival strategies of TE families. Together these generate a rich ecology of the genome, with each TE family representing the evolution of a distinct ecological niche. We conclude that while the impact of transposition is highly family- and context-dependent, a family-level understanding of the ecology of TEs in the genome can refine our ability to predict the role of TEs in generating genetic and phenotypic diversity.

    Monitoring the interplay between transposable element families and DNA methylation in maize

    DNA methylation and epigenetic silencing play important roles in the regulation of transposable elements (TEs) in many eukaryotic genomes. A majority of the maize genome is derived from TEs that can be classified into different orders and families based on their mechanism of transposition and sequence similarity, respectively. TEs themselves are highly methylated and it can be tempting to view them as a single uniform group. However, the analysis of DNA methylation profiles in flanking regions provides evidence for distinct groups of chromatin properties at different TE families. These differences among TE families are reproducible in different tissues and different inbred lines. TE families with varying levels of DNA methylation in flanking regions also show distinct patterns of chromatin accessibility and modifications within the TEs. The differences in the patterns of DNA methylation flanking TE families arise from a combination of non-random insertion preferences of TE families, changes in DNA methylation triggered by the insertion of the TE, and subsequent selection pressure. A set of nearly 70,000 TE polymorphisms among four assembled maize genomes was used to monitor the level of DNA methylation at haplotypes with and without the TE insertions. In many cases, TE families with high levels of DNA methylation in flanking sequence are enriched for insertions into highly methylated regions. The majority of the >2,500 TE insertions into unmethylated regions result in changes in DNA methylation in haplotypes with the TE, suggesting the widespread potential for TE insertions to condition altered methylation in conserved regions of the genome. This study highlights the interplay between TEs and the methylome of a major crop species.