1,613 research outputs found

    Updating RNA-Seq analyses after re-annotation

    The estimation of isoform abundances from RNA-Seq data requires a time-intensive step of mapping reads to either an assembled or previously annotated transcriptome, followed by an optimization procedure for deconvolution of multi-mapping reads. These procedures are essential for downstream analysis such as differential expression. In cases where it is desirable to adjust the underlying annotation, for example, on the discovery of novel isoforms or errors in existing annotations, current pipelines must be rerun from scratch. This makes it difficult to update abundance estimates after re-annotation, or to explore the effect of changes in the transcriptome on analyses. We present a novel efficient algorithm for updating abundance estimates from RNA-Seq experiments on re-annotation that does not require re-analysis of the entire dataset. Our approach is based on a fast partitioning algorithm for identifying transcripts whose abundances may depend on the added or deleted isoforms, and on a fast follow-up approach to re-estimating abundances for all transcripts. We demonstrate the effectiveness of our methods by showing how to synchronize RNA-Seq abundance estimates with the daily RefSeq incremental updates. Thus, we provide a practical approach to maintaining relevant databases of RNA-Seq derived abundance estimates even as annotations are being constantly revised
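
The partitioning step described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes, for illustration only, that two transcripts can influence each other's abundance estimates when at least one read maps to both, so the transcripts needing re-estimation after a change are those connected to the changed isoform through shared reads. The function name `affected_transcripts` and the read-to-transcript dictionary layout are hypothetical.

```python
from collections import defaultdict


def affected_transcripts(read_to_transcripts, changed):
    """Return every transcript linked to a changed isoform via shared reads.

    read_to_transcripts: dict mapping a read ID to the set of transcript IDs
        the read is compatible with (hypothetical input layout).
    changed: iterable of transcript IDs that were added or deleted.
    """
    # Two transcripts are neighbours when some read maps to both of them.
    neighbours = defaultdict(set)
    for transcripts in read_to_transcripts.values():
        for t in transcripts:
            neighbours[t].update(transcripts)

    # Depth-first traversal from the changed isoforms collects the
    # connected component(s) whose abundances may need re-estimation.
    seen = set()
    stack = list(changed)
    while stack:
        t = stack.pop()
        if t in seen:
            continue
        seen.add(t)
        stack.extend(neighbours[t] - seen)
    return seen


# Toy data: transcripts A, B, C share reads; D and E form a separate group.
reads = {
    "r1": {"A", "B"},
    "r2": {"B", "C"},
    "r3": {"D"},
    "r4": {"D", "E"},
}
print(sorted(affected_transcripts(reads, {"C"})))
```

In this toy input, changing isoform C touches only the A, B, C group, so the D, E group could keep its existing estimates. That is the kind of saving the abstract attributes to avoiding a full re-analysis.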

    Polymorphism identification and improved genome annotation of Brassica rapa through Deep RNA sequencing.

    The mapping and functional analysis of quantitative traits in Brassica rapa can be greatly improved with the availability of physically positioned, gene-based genetic markers and accurate genome annotation. In this study, deep transcriptome RNA sequencing (RNA-Seq) of Brassica rapa was undertaken with two objectives: SNP detection and improved transcriptome annotation. We performed SNP detection on two varieties that are parents of a mapping population to aid in development of a marker system for this population and subsequent development of a high-resolution genetic map. An improved Brassica rapa transcriptome was constructed to detect novel transcripts and to improve the current genome annotation. This is useful for accurate estimation of mRNA abundance and detection of expression QTL (eQTLs) in mapping populations. Deep RNA-Seq of two Brassica rapa genotypes, R500 (var. trilocularis, Yellow Sarson) and IMB211 (a rapid cycling variety), using eight different tissues (root, internode, leaf, petiole, apical meristem, floral meristem, silique, and seedling) grown across three different environments (growth chamber, greenhouse and field) and under two different treatments (simulated sun and simulated shade) generated 2.3 billion high-quality Illumina reads. A total of 330,995 SNPs were identified in transcribed regions between the two genotypes with an average frequency of one SNP in every 200 bases. The deep RNA-Seq reassembled Brassica rapa transcriptome identified 44,239 protein-coding genes. Compared with current gene models of B. rapa, we detected 3,537 novel transcripts, structural modifications in 23,754 gene models, and changes in 3,655 annotated proteins. Gaps in the current genome assembly of B. rapa are highlighted by our identification of 780 unmapped transcripts. All the SNPs, annotations, and predicted transcripts can be viewed at http://phytonetworks.ucdavis.edu/

    Transcriptome dynamics in the asexual cycle of the chordate Botryllus schlosseri

    Background: We performed an analysis of the transcriptome during the blastogenesis of the chordate Botryllus schlosseri, focusing in particular on genes involved in cell death by apoptosis. The tunicate B. schlosseri is an ascidian forming colonies characterized by the coexistence of three blastogenetic generations: filter-feeding adults, buds on adults, and budlets on buds. Cyclically, adult tissues undergo apoptosis and are progressively resorbed and replaced by their buds, which originate by asexual reproduction. This is a feature of colonial tunicates, the only known chordates that can reproduce asexually. Results: Thanks to a newly developed web-based platform (http://botryllus.cribi.unipd.it), we compared the transcriptomes of the mid-cycle, the pre-take-over, and the take-over phases of the colonial blastogenetic cycle. The platform is equipped with programs for comparative analysis and allows the user to select the statistical stringency. We enriched the genome annotation with 11,337 new genes; 581 transcripts were resolved as complete open reading frames, translated in silico into amino acid sequences and then aligned onto the non-redundant sequence database. Significantly differentially expressed genes were classified within the gene ontology categories. Among them, we recognized genes involved in apoptosis activation, de-activation, and regulation. Conclusions: With the current work, we contribute to the improvement of the first released B. schlosseri genome assembly and offer an overview of the transcriptome changes during the blastogenetic cycle, showing up- and down-regulated genes. These results are important for the comprehension of the events underlying colony growth and regression, cell proliferation, colony homeostasis, and competition among different generations.

    Doctor of Philosophy

    Dissertation. The MAKER genome annotation and curation software tool was developed in response to increased demand for genome annotation services, secondary to decreased genome sequencing costs. MAKER currently has over 1000 registered users throughout the world. This wide adoption of MAKER has uncovered the need for additional functionalities. Here I addressed moving MAKER into the domain of plant annotation, expanding MAKER to include new methods of gene and noncoding RNA annotation, and improving usability of MAKER through documentation and community outreach. To move MAKER into the plant annotation domain, I benchmarked MAKER on the well-annotated Arabidopsis thaliana genome. MAKER performs well on the Arabidopsis genome in de novo genome annotation and was able to improve the current TAIR10 gene models by incorporating mRNA-seq data not available during the original annotation efforts. In addition to this benchmarking, I annotated the genome of the sacred lotus Nelumbo nucifera. I enabled noncoding RNA annotation in MAKER by adding the ability for MAKER to run and process the outputs of tRNAscan-SE and snoscan. These functionalities were tested on the Arabidopsis genome, and I used MAKER to annotate tRNAs and snoRNAs in Zea mays. The resulting version of MAKER was named MAKER-P. I added the functionality of a combiner by adding EVidenceModeler to the MAKER code base. As the number of MAKER users has grown, so have the help requests sent to the MAKER developers list. Motivated by the belief that improving the MAKER documentation would obviate the need for many of these requests, I created a media wiki that was linked to the MAKER download page, and the MAKER developers list was made searchable. Additionally, I have written a unit on genome annotation using MAKER for Current Protocols in Bioinformatics. In response to these efforts I have seen a corresponding decrease in help requests, even though the number of registered MAKER users continues to increase. Taken together, these products and activities have moved MAKER into the domain of plant annotation, expanded MAKER to include new methods of gene and noncoding RNA annotation, and improved the usability of MAKER through documentation and community outreach.

    f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq.

    Single-cell RNA-sequencing (scRNA-seq) allows the study of heterogeneity in gene expression in large cell populations. Such heterogeneity can arise from technical or biological factors, which makes it difficult to decompose the sources of variation. Here we describe f-scLVM (factorial single-cell latent variable model), a method based on factor analysis that uses pathway annotations to guide the inference of interpretable factors underpinning the heterogeneity. Our model jointly estimates the relevance of individual factors, refines gene set annotations, and infers factors without annotation. In applications to multiple scRNA-seq datasets, we find that f-scLVM robustly decomposes scRNA-seq datasets into interpretable components, thereby facilitating the identification of novel subpopulations.

    Conserved imprinting associated with unique epigenetic signatures in the Arabidopsis genus

    In plants, imprinted gene expression occurs in endosperm seed tissue and is sometimes associated with differential DNA methylation between maternal and paternal alleles1. Imprinting is theorized to have been selected for because of conflict between parental genomes in offspring2, but most studies of imprinting have been conducted in Arabidopsis thaliana, an inbred primarily self-fertilizing species that should have limited parental conflict. We examined embryo and endosperm allele-specific expression and DNA methylation genome-wide in the wild outcrossing species Arabidopsis lyrata. Here we show that the majority of A. lyrata imprinted genes also exhibit parentally biased expression in A. thaliana, suggesting that there is evolutionary conservation in gene imprinting. Surprisingly, we discovered substantial interspecies differences in methylation features associated with paternally expressed imprinted genes (PEGs). Unlike in A. thaliana, the maternal allele of many A. lyrata PEGs was hypermethylated in the CHG context. Increased maternal allele CHG methylation was associated with increased expression bias in favour of the paternal allele. We propose that CHG methylation maintains or reinforces repression of maternal alleles of PEGs. These data suggest that the genes subject to imprinting are largely conserved, but there is flexibility in the epigenetic mechanisms employed between closely related species to maintain monoallelic expression. This supports the idea that imprinting of specific genes is a functional phenomenon, and not simply a byproduct of seed epigenomic reprogramming. Genomic imprinting is a form of epigenetic gene regulation in flowering plants and mammals in which alleles of genes are expressed in a parent-of-origin dependent manner. Allele-specific gene expression profiling has identified hundreds of imprinted genes in A. thaliana, maize and rice endosperm, the functions of which are largely unknown. 
Allelic differences in DNA methylation and chromatin modification between maternal and paternal alleles are important for establishing and maintaining imprinted expression. The emerging picture from multiple species is that the paternal allele of PEGs is associated with DNA methylation, and the silent maternal allele is hypomethylated and bears the Polycomb Repressive Complex 2 (PRC2) mark H3K27me3. Several evolutionary theories have been proposed to describe processes that would select for fixation of this unusual pattern of gene expression13. The kinship or parental conflict theory posits that imprinting is selected for because of asymmetric relatedness among kin. In species where the maternal parent directly provisions growing progeny and has offspring by multiple males, maternally and paternally inherited genomes are predicted to have conflicting interests with regard to the extent of maternal investment. Paternally inherited alleles are expected to favour maternal investment at the expense of half-siblings. Low conservation of imprinting between A. thaliana and monocots, limited conservation between rice and maize, evidence for intraspecific variation in imprinting and lack of strong phenotypes for some imprinted gene mutants has cast doubt on whether imprinting of particular genes is functionally important. Additionally, although some imprinted genes are associated with differential methylation, it has been suggested that imprinted expression is simply a byproduct of endosperm DNA methylation changes—changes that could have a primary function outside imprinting regulation. We were motivated by these considerations and by predictions of the parental conflict theory to compare imprinting and seed DNA methylation between two closely related species that differ in breeding strategy. A. lyrata and A. thaliana diverged approximately 13 Myr ago. Although A. thaliana outcrosses to some extent in the wild, as an obligate outcrosser A. 
lyrata should be subject to a higher degree of parental conflict than A. thaliana and should therefore be under greater pressure to maintain imprinting. To identify A. lyrata imprinted genes, we performed mRNA-seq on parental strains and F1 hybrid embryo and endosperm tissue derived from crosses between the sequenced A. lyrata strain MN47 (MN) and a strain from Karhumäki (Kar) (Supplementary Fig. 1, Supplementary Fig. 2 and Supplementary Tables 1 and 2). After reannotating A. lyrata genes based on our extensive RNA-seq data (see Supplementary Methods), sequence polymorphisms between MN and Kar were used to quantify the contributions of each parental genome to gene expression. All possible pairwise comparisons (n = 12) of parent-of-origin bias among three MN × Kar and four Kar × MN reciprocal cross-replicates were performed to identify imprinted genes using the same criteria we previously applied to A. thaliana. Only genes that were defined as imprinted in at least 40% of comparisons were included in the final set (Fig. 1 and Supplementary Tables 3 and 4, see Supplementary Methods for details of imprinting criteria). This analysis yielded 49 PEGs and 35 maternally expressed imprinted genes (MEGs) in endosperm (Fig. 1a). Allele assignment calls for 13 genes, including both imprinted and non-imprinted genes, were validated by pyrosequencing (Supplementary Fig. 3). As expected3,5, there was little evidence for imprinting in embryos (Fig. 1a). We compared A. lyrata and A. thaliana endosperm imprinted genes (Fig. 1, Supplementary Fig. 4 and Supplementary Table 4). Of the A. lyrata PEGs for which there were sufficient data available in A. thaliana, 72% (26/36) were also paternally biased in A. thaliana, with 50% (18/36) meeting all stringent criteria for being designated as a PEG in both species (Fig. 1b). Conserved PEGs encoded DNA binding proteins and genes related to chromatin modification, among others (Supplementary Table 4). Of the A. 
lyrata MEGs for which there were sufficient data in A. thaliana, 70% (12/17) were also significantly maternally biased in A. thaliana, with 35% (6/17) meeting all criteria for being called a MEG in both datasets (Fig. 1b). The conserved MEGs included the Polycomb group gene FIS2, the F-box gene SDC, another F-box gene and three genes encoding DNA binding proteins. Although previous research has identified somewhat more imprinted genes in A. thaliana than what we describe in A. lyrata, these studies involved multiple accessions and assessed imprinting for a greater total number of genes. The majority of genes that were imprinted in A. thaliana but not in A. lyrata lacked sufficient data to make an imprinting designation in A. lyrata (Supplementary Fig. 4). Thus, it is presently unclear whether the number of imprinted genes differs significantly between the species. All of the genes that are commonly imprinted among A. thaliana and cereals were also imprinted in A. lyrata. Many mammalian imprinted genes are clearly involved in growth regulation, including genes for nutrient uptake and feeding behaviour. By contrast, we found that proteins encoded by conserved plant imprinted genes were predicted to regulate or affect the expression of many other genes (chromatin proteins and transcription factors) or protein abundance (F-boxes). We also found that some pathways, rather than orthologous genes, were imprinted in both species, as has been previously noted for imprinting of different subunits of the PRC2 complex among Arabidopsis and cereals. In A. thaliana, the large subunit of RNA Polymerase IV, NRPD1, which functions in RNA-directed DNA methylation (RdDM), is a PEG5,6. Although we did not find evidence for imprinting of the NRPD1 gene in A. 
lyrata, homologues of two other genes involved in RdDM were PEGs (Supplementary Table 4): NRPD4/NRPE4/RDM2 (AL946699), which encodes a common subunit of Pol IV and Pol V, and RRP6L1 (AL337734), which encodes an exosomal protein that impacts RdDM. Thus, in both species the function of RdDM in the endosperm is under paternal influence, but this is achieved through different genes. The kinship theory is essentially an argument about optimal total gene expression levels in offspring. We therefore evaluated the expression levels and patterns of imprinted genes. MEGs appear to be primarily endosperm-specific genes; they have much lower than average expression in embryos and flower buds, and much higher than average expression in the endosperm (Fig. 1c). Conversely, PEGs were more highly expressed in all tissues than genes on average, and showed more modest expression increases in endosperm, suggesting that the expression of MEGs and PEGs is regulated differently. We also compared the percentage of maternal transcripts for homologous imprinted A. lyrata and A. thaliana genes (Fig. 1d). Conserved MEGs and PEGs exhibited similar degrees of parental bias in the two species (Fig. 1d). However, comparison of the A. thaliana and A. lyrata gene expression level for individual imprinted genes indicated that the overall expression level of PEGs was higher in A. lyrata than in A. thaliana (Fig. 1e). These findings are consistent with selection for higher expression of PEGs in species with greater parental conflict, such as obligate outcrossers. In A. thaliana, active DNA demethylation by the 5-methylcytosine DNA glycosylase DME in the central cell (the female gamete that is the progenitor of the endosperm) before fertilization is essential for establishing gene imprinting at many loci1. Imprinting of many A. thaliana genes, particularly PEGs, is correlated with maternal allele demethylation of proximal sequences corresponding to fragments of transposable elements (TEs). A. 
lyrata PEGs were somewhat enriched for the presence of TEs in 5′ regions compared with all genes, with 30 out of 49 PEGs (61%) associated with at least one TE within 2 kb upstream, compared with 51% of all genes (Supplementary Table 4). To test if the relationship between methylation and imprinting was conserved in A. lyrata, we profiled genome-wide methylation in MN × MN flower bud, embryo and endosperm tissue by whole-genome bisulfite sequencing. Shared and novel endosperm methylation features were observed compared with A. thaliana (Figs 2 and 3, Supplementary Fig. 5 and Supplementary Table 5). In plants, DNA methylation is found in CG, CHG and CHH sequence contexts. CG methylation was strongly decreased in TEs and in the 5′ and 3′ regions of genes in endosperm relative to other tissues (Fig. 2a and Supplementary Fig. 5). By profiling allele-specific DNA methylation in the F1 embryo and endosperm from Kar females crossed with MN males, we determined that maternally inherited DNA was primarily responsible for endosperm CG hypomethylation (Fig. 2b). These data suggest that, as in A. thaliana, A. lyrata maternally inherited genomes are actively demethylated before fertilization. By contrast, we were surprised to discover that A. lyrata endosperm had a non-CG DNA methylation profile distinct from A. thaliana. This was unexpected because DNA methylation patterns in A. lyrata vegetative tissues display similar features to A. thaliana, although overall methylation levels are higher (Supplementary Fig. 5). We found that average CHG methylation in gene bodies was increased in endosperm compared with embryo (Fig. 2a), a phenotype not observed in wild-type A. thaliana endosperm profiled at similar developmental stages (Supplementary Fig. 5). 
To determine whether differences in aggregate methylation profiles represented small changes in many regions or larger changes in specific regions of the genome, we compared embryo and endosperm methylation profiles to identify differentially methylated regions (DMRs). As in A. thaliana, the most abundant class of DMRs were less CG methylated in the endosperm than the embryo, with 38% of these falling within 2 kb upstream of genes and 34% within 2 kb downstream of genes (Supplementary Table 6). Regions that gained CHG methylation in MN × MN endosperm displayed markedly different characteristics; 84% fell within gene bodies, corresponding to 1,606 genes (Fig. 2c and Supplementary Table 6). CHG endosperm hypermethylated DMRs were also longer than all other DMR types (mean length = 564 bp with 400 bp s.d.) (Supplementary Table 6). CHG gene body hypermethylation was also observed in Kar × MN endosperm, although on fewer genes (n = 194). Allele-specific analysis of methylation indicated that endosperm CHG hypermethylation was specific to maternally inherited alleles (Fig. 2d). Methylation within gene bodies is usually restricted to the CG context, which is maintained after DNA replication by the maintenance methyltransferase MET1. CHG methylation, normally not found in genes, is maintained by the DNA methyltransferase CMT3, which directly binds to the repressive histone modification H3K9me2 (ref. 23). When accompanied by H3K9me2, CHG gene body methylation is associated with transcriptional repression24. We found that gain of gene body CHG methylation in A. lyrata endosperm was associated with reduced gene expression (Supplementary Fig. 6). Of the CHG hypermethylated genes with enough coverage to evaluate differential expression (n = 1,225), 338 were significantly less expressed in endosperm than in embryo, compared with 159 significantly more highly expressed in endosperm. 
This represents a significant enrichment of CHG hypermethylated genes among genes less expressed in endosperm than embryo (P = 1.766 × 10^-21, hypergeometric test) and a significant depletion among genes upregulated in endosperm (P = 2.04 × 10^-10, see Supplementary Methods). The mechanism responsible for CHG gene body hypermethylation in A. lyrata endosperm remains unclear. We found significant overlap between A. thaliana genes that gain CHG methylation or H3K9me2 in ibm1 mutants and CHG hypermethylation of orthologous genes in A. lyrata endosperm (Supplementary Fig. 7). IBM1 encodes a histone lysine demethylase that prevents accumulation of H3K9me2, and thus accumulation of CHG methylation, in genes24. IBM1 transcript abundance was lower in the endosperm than embryo (Supplementary Fig. 7). In A. thaliana, methylation in the long intron of IBM1 is required for proper transcript splicing and production of an enzymatically active protein25. We found that A. lyrata IBM1 exhibited decreased CG and non-CG methylation and increased accumulation of RNA-seq reads in the long intron in endosperm relative to embryo (Supplementary Fig. 7). However, A. thaliana endosperm also had reduced methylation in the long intron and lower IBM1 transcript abundance than the embryo (Supplementary Fig. 7). Thus, differences in IBM1 expression alone are not sufficient to explain CHG hypermethylation in A. lyrata endosperm compared with A. thaliana, although reduced IBM1 activity is likely to be part of the mechanism. Several of the observed endosperm methylation features were correlated with gene imprinting. More than half of the A. lyrata MEGs and approximately one-third of PEGs were associated with endosperm CG hypomethylated DMRs in the 2 kb region upstream of the transcriptional start site, whereas only 11% of non-imprinted genes were similarly associated with these DMRs (Fig. 3, Supplementary Table 4, Supplementary Fig. 8 and Supplementary Fig. 9). 
CG hypomethylation occurred specifically on the maternally inherited allele (Supplementary Fig. 8). Thus, reduction of CG methylation by active demethylation is also likely to be an important component of the A. lyrata imprinting mechanism. We found a striking and non-mutually exclusive association between PEGs and endosperm CHG hypermethylation. Almost 60% of PEG gene bodies (n = 27) were CHG hypermethylated, and about one-third were also associated with a 5′ or 3′ CG hypomethylated DMR (Supplementary Table 4). The average methylation profile of PEGs containing a CHG endosperm hypermethylated DMR indicated a very strong increase in CHG methylation across the entire gene body, which was specific to the maternally inherited allele (Figs 3 and 4). Results were validated for two PEGs, homologues of AT5G10950 and AT5G26210, by locus-specific bisulfite-PCR (Fig. 4 and Supplementary Fig. 10). In both A. thaliana and A. lyrata these genes were associated with CG or CHH endosperm hypomethylated DMRs in 5′ regions, but were additionally associated with gene body CHG hypermethylated DMRs in A. lyrata. Interestingly, gain of CHG methylation on the maternal allele was often accompanied by loss of CG gene body methylation, whereas paternally inherited alleles retained CG gene body methylation and had a similar methylation profile to embryo alleles (Fig. 4 and Supplementary Table 4). For the 22 PEGs lacking a gene body CHG hypermethylated DMR, half had a CG hypomethylated DMR in the flanking regions 2 kb 5′ or 3′, more like typical A. thaliana PEGs (Supplementary Table 4). Interestingly, these genes largely lacked CG gene body methylation in all tissues (Fig. 3a). Thus, there appear to be at least two classes of PEGs in terms of methylation features (Fig. 3a), which may correspond to different modes of epigenetic regulation. PEGs conserved with A. thaliana are found in both classes, although the majority (12/18) are CHG hypermethylated (Supplementary Table 4).
To determine if there was a quantitative relationship between gain of CHG methylation and allelic expression bias, we plotted the difference in CHG methylation between maternal alleles in the embryo and endosperm relative to the ratio of maternal to paternal allele transcripts (Fig. 3c). The degree to which CHG methylation was gained on the maternal allele in endosperm relative to embryo was positively correlated with the extent of paternal allele expression bias in endosperm. In addition, PEGs were clearly distinct from other genes that gained CHG gene body methylation; they tended to exhibit greater gain of CHG methylation (Fig. 3c) and were also hypermethylated along more of their length than all CHG hypermethylated genes (56 versus 29%) (Supplementary Tables 4 and 6). Thus, a greater extent and amount of maternal allele CHG hypermethylation is correlated with more paternally biased transcription. These data suggest that CHG methylation, perhaps accompanied by gain of H3K9me2, represses the maternal alleles of PEGs. It is unknown whether gene body CHG methylation is established on maternal alleles before or after fertilization. Demethylation of the IBM1 regulatory intron (Supplementary Fig. 7) could be initiated before fertilization in the central cell, leading to its downregulation and an increase in CHG methylation specifically on maternal alleles, which would then be maintained after fertilization. Alternatively, if maternal allele CHG methylation occurs after fertilization, then CMT3 must be able to distinguish maternally and paternally inherited alleles. Retention of CG gene body methylation on the paternal alleles of PEGs (Fig. 4) could possibly protect them from gain of CHG methylation. Interestingly, gain of gene body CHG methylation was also recently shown to occur in both A. thaliana endosperm and embryos when wild-type plants were pollinated by diploid hypomethylated pollen. Diploid pollen creates triploid seeds with tetraploid endosperm that usually abort, but seed abortion is suppressed when the pollen is hypomethylated owing to mutations in met. Many of the genes that gain CHG methylation and have reduced expression in triploid rescued seeds are PEGs. However, this phenotype appears to be distinct from what we observed; the CHG methylation gain is much more modest than what we have described in wild-type A. lyrata endosperm, and only one conserved PEG was affected. Our data further suggest that gene body CHG hypermethylation is not a state restricted to mutant tissues, but can occur in a developmentally regulated manner that could be important for maintaining gene expression programmes.
This is the first study to compare imprinting between two closely related plant species that differ in breeding strategy. A. lyrata and A. thaliana homologous imprinted genes are epigenetically modified in a distinct manner despite the close relatedness of the species (Fig. 4). Allele-specific maintenance of gene repression by the PRC2 complex is an important component of the imprinting mechanism in A. thaliana and other species. The PRC2 complex silences the hypomethylated maternal allele of PEGs, and the methylated paternal allele is expressed. Several studies have suggested that H3K9me2 and H3K27me3 are repressive marks that can substitute for one another in mutant contexts. We suggest that this substitution can also occur in wild-type tissues, and favour the hypothesis that in A. lyrata endosperm the maternal allele of at least a subset of PEGs is repressed by CHG methylation/H3K9me2. Overall, our results point to high conservation of imprinting accompanied by a distinct epigenetic signature, at least for PEGs. If the mechanism of imprinting is different but the genes that are imprinted are the same, this argues that imprinting is not simply a byproduct of endosperm methylation dynamics, but that imprinted expression of specific genes is under selection. Thus, the means by which monoallelic expression can be achieved are plastic, but the genes subject to this regulation are conserved.
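
The replicate-comparison rule stated in the text above (three MN × Kar and four Kar × MN reciprocal cross-replicates, all pairwise comparisons, genes kept when called imprinted in at least 40% of comparisons) can be worked through in a short sketch. This assumes the 40% cut-off is a simple fraction of the 12 comparisons. The per-comparison imprinting criteria are in the paper's Supplementary Methods and are not reproduced here, and the replicate labels and function names are hypothetical.

```python
from itertools import product

# Hypothetical labels for the reciprocal cross-replicates named in the text.
mn_x_kar = ["MNxKar_1", "MNxKar_2", "MNxKar_3"]
kar_x_mn = ["KarxMN_1", "KarxMN_2", "KarxMN_3", "KarxMN_4"]

# Every MN x Kar replicate is paired with every Kar x MN replicate.
comparisons = list(product(mn_x_kar, kar_x_mn))

MIN_PERCENT = 40  # threshold stated in the text


def min_supporting(n_comparisons, percent=MIN_PERCENT):
    """Smallest number of comparisons that reaches the percentage threshold."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (n_comparisons * percent + 99) // 100


def passes_threshold(calls, percent=MIN_PERCENT):
    """calls: one boolean per comparison, True if the gene was called imprinted."""
    return sum(calls) * 100 >= percent * len(calls)


print(len(comparisons), min_supporting(len(comparisons)))
```

Under this reading, 3 × 4 gives the 12 comparisons reported (n = 12), and a gene needs support in at least 5 of them, since 4 of 12 is only about 33%.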

    Complex evolutionary dynamics of massively expanded chemosensory receptor families in an extreme generalist chelicerate herbivore

    While mechanisms to detoxify plant produced, anti-herbivore compounds have been associated with plant host use by herbivores, less is known about the role of chemosensory perception in their life histories. This is especially true for generalists, including chelicerate herbivores that evolved herbivory independently from the more studied insect lineages. To shed light on chemosensory perception in a generalist herbivore, we characterized the chemosensory receptors (CRs) of the chelicerate two-spotted spider mite, Tetranychus urticae, an extreme generalist. Strikingly, T. urticae has more CRs than reported in any other arthropod to date. Including pseudogenes, 689 gustatory receptors were identified, as were 136 degenerin/Epithelial Na+ Channels (ENaCs) that have also been implicated as CRs in insects. The genomic distribution of T. urticae gustatory receptors indicates recurring bursts of lineage-specific proliferations, with the extent of receptor clusters reminiscent of those observed in the CR-rich genomes of vertebrates or C. elegans. Although pseudogenization of many gustatory receptors within clusters suggests relaxed selection, a subset of receptors is expressed. Consistent with functions as CRs, the genomic distribution and expression of ENaCs in lineage-specific T. urticae expansions mirrors that observed for gustatory receptors. The expansion of ENaCs in T. urticae to > 3-fold that reported in other animals was unexpected, raising the possibility that ENaCs in T. urticae have been co-opted to fulfill a major role performed by unrelated CRs in other animals. More broadly, our findings suggest an elaborate role for chemosensory perception in generalist herbivores that are of key ecological and agricultural importance

    An improved assembly and annotation of the melon (Cucumis melo L.) reference genome

    We report an improved assembly (v3.6.1) of the melon (Cucumis melo L.) genome and a new genome annotation (v4.0). The optical mapping approach allowed us to correct the order and orientation of 21 previous scaffolds and to correctly define the gap-size extension along the 12 pseudomolecules. A new comprehensive annotation was also built to update the previous annotation v3.5.1, released more than six years ago. Using an integrative annotation pipeline, based on exhaustive RNA-Seq collections and ad-hoc transposable element annotation, we identified 29,980 protein-coding loci. Compared to the previous version, the v4.0 annotation improved gene models in terms of completeness of gene structure, definition of UTR regions, intron-exon junctions and reduction of fragmented genes. More than 8,000 new genes were identified, one third of them being well supported by RNA-Seq data. To make all the new resources easily exploitable and completely available to the scientific community, a redesigned Melonomics genomic platform was released at http://melonomics.net. The resources produced in this work considerably increase the reliability of the melon genome assembly and the resolution of the gene models, paving the way for further studies in melon and related species.

    Integration of host-pathogen functional genomics data into the chromosome-level genome assembly of turbot (Scophthalmus maximus)

    Disease resilience is of utmost relevance for turbot aquaculture. Several infectious diseases, spanning viruses, bacteria and various parasites, have been identified by industry. Since they increase mortality rates, reduce feed conversion ratios and slow down growth rate, genetic breeding programs for increasing disease resilience are recognized as a useful alternative for controlling pathologies. For this, knowledge of the genetic basis underlying resilience using genomic tools is essential to develop the most effective breeding strategies. In the present study, we compiled the existing genomic information generated in the last decade to construct an integrated atlas of candidate genes and genomic regions involved in pathogen resistance against the main turbot industrial pathogens (Aeromonas salmonicida, Philasterides dicentrarchi, Enteromyxum scophthalmi and the VHS virus) within the recently released chromosome-level turbot genome assembly. The information comprises reannotated differentially expressed genes (DEG) in different tissues along temporal series, QTL markers associated with important productive traits (disease resistance and growth) and signatures of domestic or wild selection, represented by runs of homozygosity (ROHi) islands and outlier markers for divergent selection. Most genetic features were successfully relocated in the turbot assembly including 81.1% of the total DEGs, plus all QTL markers, ROHi and outlier markers. The updated annotation of DEGs for resistance to each pathology demonstrated significant changes. While the new annotation of 53–83% of the DEGs was coherent with the original, roughly 10–24% showed imprecise annotations in both assembly versions, ∼5% lost their original annotation and 2–24% were newly annotated. Functional enrichment revealed mostly functions related to immune response, such as chemotaxis, apoptosis regulation, leukocyte differentiation, cell adhesion, iron homeostasis and vascular permeability. Some DEGs, such as celsr1a (cadherin EGF LAG seven-pass G-type receptor 1), fgg (fibrinogen gamma chain) and c1qtnf9 (C1q and TNF related 9) were found near pathogen-associated QTL markers. Also, some shared DEGs for resistance to all pathogens were positioned near QTL markers or ROHi, such as hamp (hepcidin-1), plg (plasminogen) and a fibrinogen alpha chain-like gene. Overall, our results provide an integrative insight into the genetic architecture of turbot response to a range of pathogens that could prove useful for future genomic studies to benefit aquaculture breeding programs.