10 research outputs found
Recommended from our members
Identification of the expressome by machine learning on omics data.
Accurate annotation of plant genomes remains complex due to the presence of many pseudogenes arising from whole-genome duplication-generated redundancy or the capture and movement of gene fragments by transposable elements. Machine learning on genome-wide epigenetic marks, informed by transcriptomic and proteomic training data, could be used to improve annotations through classification of all putative protein-coding genes as either constitutively silent or able to be expressed. Expressed genes were subclassified as able to express both mRNAs and proteins or only RNAs, and CG gene body methylation was associated only with the former subclass. More than 60,000 protein-coding genes have been annotated in the reference genome of maize inbred B73. About two-thirds of these genes are transcribed and are designated the filtered gene set (FGS). Classification of genes by our trained random forest algorithm was accurate and relied only on histone modifications or DNA methylation patterns within the gene body; promoter methylation was unimportant. Other inbred lines are known to transcribe significantly different sets of genes, indicating that the FGS is specific to B73. We accurately classified the sets of transcribed genes in additional inbred lines, arising from inbred-specific DNA methylation patterns. This approach highlights the potential of using chromatin information to improve annotations of functional genes.
Hybrid Decay: A Transgenerational Epigenetic Decline in Vigor and Viability Triggered in Backcross Populations of Teosinte with Maize.
In the course of generating populations of maize with teosinte chromosomal introgressions, an unusual sickly plant phenotype was noted in individuals from crosses with two teosinte accessions collected near Valle de Bravo, Mexico. The plants of these Bravo teosinte accessions appear phenotypically normal themselves and the F1 plants appear similar to typical maize × teosinte F1s. However, upon backcrossing to maize, the BC1 and subsequent generations display a number of detrimental characteristics including shorter stature, reduced seed set, and abnormal floral structures. This phenomenon is observed in all BC individuals and there is no chromosomal segment linked to the sickly plant phenotype in advanced backcross generations. Once the sickly phenotype appears in a lineage, normal plants are never again recovered by continued backcrossing to the normal maize parent. Whole-genome shotgun sequencing reveals a small number of genomic sequences, some with homology to transposable elements, that have increased in copy number in the backcross populations. Transcriptome analysis of seedlings, which do not have striking phenotypic abnormalities, identified segments of 18 maize genes that exhibit increased expression in sickly plants. A de novo assembly of transcripts present in plants exhibiting the sickly phenotype identified a set of 59 upregulated novel transcripts. These transcripts include some examples with sequence similarity to transposable elements and other sequences present in the recurrent maize parent (W22) genome as well as novel sequences not present in the W22 genome. Genome-wide profiles of gene expression, DNA methylation, and small RNAs are similar between sickly plants and normal controls, although a few upregulated transcripts and transposable elements are associated with altered small RNA or methylation profiles. This study documents hybrid incompatibility and genome instability triggered by the backcrossing of Bravo teosinte with maize. 
We name this phenomenon "hybrid decay" and present ideas on the mechanism that may underlie it.
Opportunities to use DNA methylation to distil functional elements in large crop genomes
Stable unmethylated DNA demarcates expressed genes and their cis-regulatory space in plant genomes
The genomic sequences of crops continue to be produced at a frenetic pace. It remains challenging to develop complete annotations of functional genes and regulatory elements in these genomes. Chromatin accessibility assays enable discovery of functional elements; however, to uncover the full portfolio of cis-elements would require profiling of many combinations of cell types, tissues, developmental stages, and environments. Here, we explore the potential to use DNA methylation profiles to develop more complete annotations. Using leaf tissue in maize, we define ∼100,000 unmethylated regions (UMRs) that account for 5.8% of the genome; 33,375 UMRs are found greater than 2 kb from genes. UMRs are highly stable in multiple vegetative tissues, and they capture the vast majority of accessible chromatin regions from leaf tissue. However, many UMRs are not accessible in leaf, and these represent regions with potential to become accessible in specific cell types or developmental stages. These UMRs often occur near genes that are expressed in other tissues and are enriched for binding sites of transcription factors. The leaf-inaccessible UMRs exhibit unique chromatin modification patterns and are enriched for chromatin interactions with nearby genes. The total UMR space in four additional monocots ranges from 80 to 120 megabases, which is remarkably similar considering the range in genome size of 271 megabases to 4.8 gigabases. In summary, based on the profile from a single tissue, DNA methylation signatures provide powerful filters to distill large genomes down to the small fraction of putative functional genes and regulatory elements.
Maize 509 line TE PAV calls
Transposable elements (TEs) have the potential to create regulatory variation both through disruption of existing DNA regulatory elements and through creation of novel DNA regulatory elements. In a species with a large genome, such as maize, the many TEs interspersed with genes create opportunities for significant allelic variation due to TE presence/absence polymorphisms among individuals. We used information on putative regulatory elements in combination with knowledge about TE polymorphisms in maize to identify TE insertions that interrupt existing accessible chromatin regions (ACRs) in B73 as well as examples of polymorphic TEs that contain ACRs among four inbred lines of maize including B73, Mo17, W22, and PH207. The TE insertions in three other assembled maize genomes (Mo17, W22, or PH207) that interrupt ACRs that are present in the B73 genome can trigger changes to the chromatin, suggesting the potential for both genetic and epigenetic influences of these insertions. Nearly 20% of the ACRs located over 2 kb from the nearest gene are located within an annotated TE. These are regions of unmethylated DNA that show evidence for functional importance similar to ACRs that are not present within TEs. Using a large panel of maize genotypes, we tested whether there is an association between the presence of TE insertions that interrupt, or carry, an ACR and the expression of nearby genes. While most TE polymorphisms are not associated with expression of nearby genes, the TEs that carry ACRs exhibit an enrichment for being associated with higher expression of nearby genes, suggesting that these TEs may contribute novel regulatory elements. These analyses highlight the potential for a subset of TEs to rewire transcriptional responses in eukaryotic genomes.
Meta Gene Regulatory Networks in Maize Highlight Functionally Relevant Regulatory Interactions
Monitoring the interplay between transposable element families and DNA methylation in maize
DNA methylation and epigenetic silencing play important roles in the regulation of transposable elements (TEs) in many eukaryotic genomes. A majority of the maize genome is derived from TEs that can be classified into different orders and families based on their mechanism of transposition and sequence similarity, respectively. TEs themselves are highly methylated and it can be tempting to view them as a single uniform group. However, the analysis of DNA methylation profiles in flanking regions provides evidence for distinct groups of chromatin properties at different TE families. These differences among TE families are reproducible in different tissues and different inbred lines. TE families with varying levels of DNA methylation in flanking regions also show distinct patterns of chromatin accessibility and modifications within the TEs. The differences in the patterns of DNA methylation flanking TE families arise from a combination of non-random insertion preferences of TE families, changes in DNA methylation triggered by the insertion of the TE, and subsequent selection pressure. A set of nearly 70,000 TE polymorphisms among four assembled maize genomes was used to monitor the level of DNA methylation at haplotypes with and without the TE insertions. In many cases, TE families with high levels of DNA methylation in flanking sequence are enriched for insertions into highly methylated regions. The majority of the >2,500 TE insertions into unmethylated regions result in changes in DNA methylation in haplotypes with the TE, suggesting the widespread potential for TE insertions to condition altered methylation in conserved regions of the genome. This study highlights the interplay between TEs and the methylome of a major crop species.