16 research outputs found

    Consensus Ensemble Approaches Improve De Novo Transcriptome Assemblies

    Accurate and comprehensive transcriptome assemblies lay the foundation for a range of analyses, such as differential gene expression analysis, metabolic pathway reconstruction, novel gene discovery, or metabolic flux analysis. With the arrival of next-generation sequencing technologies, it has become possible to acquire whole-transcriptome data rapidly even from non-model organisms. However, the problem of accurately assembling the transcriptome for any given sample remains extremely challenging, especially in species with a high prevalence of recent gene or genome duplications, those with alternative splicing of transcripts, or those whose genomes are not well studied. This thesis provides a detailed overview of the strategies used for transcriptome assembly, including a review of the different statistics available for measuring the quality of transcriptome assemblies, with emphasis on the types of errors each statistic does and does not detect, and of simulation protocols to computationally generate RNAseq data that present biologically realistic problems such as gene expression bias and alternative splicing. Using such simulated RNAseq data, a comparison of the accuracy, strengths, and weaknesses of seven representative assemblers, including de novo and genome-guided methods, shows that all of the assemblers individually struggle to accurately reconstruct the expressed transcriptome, especially for alternative splice forms. Using a consensus of several de novo assemblers can overcome many of the weaknesses of individual assemblers, generating an ensemble assembly with higher accuracy than any individual assembler. Advisor: Jitender S. Deogu

    Next-Generation Transcriptome Assembly: Strategies and Performance Analysis

    Accurate and comprehensive transcriptome assemblies lay the foundation for a range of analyses, such as differential gene expression analysis, metabolic pathway reconstruction, novel gene discovery, or metabolic flux analysis. With the arrival of next-generation sequencing technologies, it has become possible to acquire the whole transcriptome data rapidly even from non-model organisms. However, the problem of accurately assembling the transcriptome for any given sample remains extremely challenging, especially in species with a high prevalence of recent gene or genome duplications, those with alternative splicing of transcripts, or those whose genomes are not well studied. In this chapter, we provided a detailed overview of the strategies used for transcriptome assembly. We reviewed the different statistics available for measuring the quality of transcriptome assemblies with the emphasis on the types of errors each statistic does and does not detect. We also reviewed simulation protocols to computationally generate RNAseq data that present biologically realistic problems such as gene expression bias and alternative splicing. Using such simulated RNAseq data, we presented a comparison of the accuracy, strengths, and weaknesses of nine representative transcriptome assemblers including de novo, genome-guided, and ensemble methods

    Small RNAs >26 nt in length associate with AGO1 and are upregulated by nutrient deprivation in the alga Chlamydomonas

    Small RNAs (sRNAs) associate with ARGONAUTE (AGO) proteins forming effector complexes with key roles in gene regulation and defense responses against molecular parasites. In multicellular eukaryotes, extensive duplication and diversification of RNA interference (RNAi) components have resulted in intricate pathways for epigenetic control of gene expression. The unicellular alga Chlamydomonas reinhardtii also has a complex RNAi machinery, including 3 AGOs and 3 DICER-like proteins. However, little is known about the biogenesis and function of most endogenous sRNAs. We demonstrate here that Chlamydomonas contains uncommonly long (>26 nt) sRNAs that associate preferentially with AGO1. Somewhat reminiscent of animal PIWI-interacting RNAs, these >26 nt sRNAs are derived from moderately repetitive genomic clusters and their biogenesis is DICER-independent. Interestingly, the sequences generating these >26 nt sRNAs have been conserved and amplified in several Chlamydomonas species. Moreover, expression of these longer sRNAs increases substantially under nitrogen or sulfur deprivation, concurrently with the downregulation of predicted target transcripts. We hypothesize that the transposon-like sequences from which >26 nt sRNAs are produced might have been ancestrally targeted for silencing by the RNAi machinery but, during evolution, certain sRNAs might have fortuitously acquired endogenous target genes and become integrated into gene regulatory networks

    A consensus-based ensemble approach to improve transcriptome assembly

    Background: Systems-level analyses, such as differential gene expression analysis, co-expression analysis, and metabolic pathway reconstruction, depend on the accuracy of the transcriptome. Multiple tools exist to perform transcriptome assembly from RNAseq data. However, assembling high-quality transcriptomes is still not a trivial problem. This is especially the case for non-model organisms where adequate reference genomes are often not available. Different methods produce different transcriptome models and there is no easy way to determine which are more accurate. Furthermore, alternative-splicing events exacerbate such difficult assembly problems. While benchmarking transcriptome assemblies is critical, this is also not trivial due to the general lack of true reference transcriptomes. Results: In this study, we first provide a pipeline to generate a set of simulated benchmark transcriptomes and corresponding RNAseq data. Using the simulated benchmarking datasets, we compared the performance of various transcriptome assembly approaches including both de novo and genome-guided methods. The results showed that the assembly performance deteriorates significantly when alternative transcripts (isoforms) exist or, for genome-guided methods, when the reference is not available from the same genome. To improve the transcriptome assembly performance, leveraging the overlapping predictions between different assemblies, we present a new consensus-based ensemble transcriptome assembly approach, ConSemble. Conclusions: Without using a reference genome, ConSemble using four de novo assemblers achieved an accuracy up to twice as high as any de novo assembler we compared. When a reference genome is available, ConSemble using four genome-guided assemblies removed many incorrectly assembled contigs with minimal impact on correctly assembled contigs, achieving higher precision and accuracy than individual genome-guided methods. Furthermore, ConSemble using de novo assemblers matched or exceeded the best performing genome-guided assemblers even when the transcriptomes included isoforms. We thus demonstrated that the ConSemble consensus strategy both for de novo and genome-guided assemblers can improve transcriptome assembly. The RNAseq simulation pipeline, the benchmark transcriptome datasets, and the script to perform the ConSemble assembly are all freely available from: http://bioinfolab.unl.edu/emlab/consemble/

    Complete Genome Sequence of Geobacter sp. Strain FeAm09, a Moderately Acidophilic Soil Bacterium

    A moderately acidophilic Geobacter sp. strain, FeAm09, was isolated from forest soil. The complete genome sequence is 4,099,068 bp with an average GC content of 61.1%. No plasmids were detected. The genome contains a total of 3,843 genes and 3,608 protein-coding genes, including genes supporting iron and nitrogen biogeochemical cycling

    Divergent evolution of extreme production of variant plant monounsaturated fatty acids

    Metabolic extremes provide opportunities to understand enzymatic and metabolic plasticity and biotechnological tools for novel biomaterial production. We discovered that seed oils of many Thunbergia species contain up to 92% of the unusual monounsaturated petroselinic acid (18:1Δ6), one of the highest reported levels for a single fatty acid in plants. Supporting the biosynthetic origin of petroselinic acid, we identified a Δ6-stearoyl-acyl carrier protein (18:0-ACP) desaturase from Thunbergia laurifolia, closely related to a previously identified Δ6-palmitoyl-ACP desaturase that produces sapienic acid (16:1Δ6)-rich oils in Thunbergia alata seeds. Guided by a T. laurifolia desaturase crystal structure obtained in this study, enzyme mutagenesis identified key amino acids for functional divergence of Δ6 desaturases from the archetypal Δ9-18:0-ACP desaturase and mutations that result in nonnative enzyme regiospecificity. Furthermore, we demonstrate the utility of the T. laurifolia desaturase for the production of unusual monounsaturated fatty acids in engineered plant and bacterial hosts. Through stepwise metabolic engineering, we provide evidence that divergent evolution of extreme petroselinic acid and sapienic acid production arises from biosynthetic and metabolic functional specialization and enhanced expression of specific enzymes to accommodate metabolism of atypical substrates

    Investigating the Role of MicroRNAs in the Response to Nitrogen Deprivation in the Green Alga Chlamydomonas reinhardtii

    Microalgae are gaining attention as a potential feedstock for the production of biodiesel, mainly derived from triacylglycerols (TAG). In many algae, TAG synthesis increases dramatically upon certain stresses but this is often accompanied by growth retardation. Rational improvements to strain productivity are limited by the scant knowledge on algal lipid metabolism and gene regulatory mechanisms. In this context, systems-level approaches aimed at understanding and modeling metabolic and regulatory networks may enable hypothesis-driven genetic engineering strategies. The green microalga Chlamydomonas reinhardtii accumulates significant amounts of TAGs under nutrient starvation and provides a genetically tractable model for manipulating biosynthetic pathways. In order to gain insight into Chlamydomonas TAG metabolism and regulation, we have examined both the transcriptome of strain CC-125 grown photoautotrophically in nutrient-replete or nitrogen-depleted media and the corresponding changes in the microRNA population. While the production of microRNAs (miRNAs) by Chlamydomonas reinhardtii has been established for several years, little is known about how they target transcripts for regulation or what roles they play in cellular processes, in particular whether they play a role in regulating the accumulation of TAG in nitrogen-depleted media. To characterize functional miRNAs in Chlamydomonas, we identified small RNAs associated with Flag-tagged AGO3 by affinity purification and deep sequencing in cells grown heterotrophically and cells grown photoautotrophically in nitrogen-replete and nitrogen-deplete media, and used these small RNAs to determine changes in the miRNA populations across these three conditions. We determined the role that these miRNAs play in regulating the response to nitrogen-deplete media by searching the genes that are differentially expressed in that condition for potential targets of these miRNAs. Adviser: Heriberto Cerutt

    Complementarity to an miRNA Seed Region Is Sufficient to Induce Moderate Repression of a Target Transcript in the Unicellular Green Alga Chlamydomonas reinhardtii

    MicroRNAs (miRNAs) are 20–24 nt noncoding RNAs that play important regulatory roles in a broad range of eukaryotes by pairing with mRNAs to direct post-transcriptional repression. The mechanistic details of miRNA-mediated post-transcriptional regulation have been well documented in multicellular model organisms. However, this process remains poorly studied in algae such as Chlamydomonas reinhardtii, and specific features of miRNA biogenesis, target mRNA recognition and subsequent silencing are not well understood. In this study, we report on the characterization of a Chlamydomonas miRNA, cre-miR1174.2, which is processed from a near-perfect hairpin RNA. Using Gaussia luciferase (gluc) reporter genes, we have demonstrated that cre-miR1174.2 is functional in Chlamydomonas and capable of triggering site-specific cleavage at the center of a perfectly complementary target sequence. A mismatch tolerance test assay, based on pools of transgenic strains, revealed that target hybridization to nucleotides of the seed region, at the 5′ end of an miRNA, was sufficient to induce moderate repression of expression. In contrast, pairing to the 3′ region of the miRNA was not critical for silencing. Our results suggest that the base-pairing requirements for small RNA-mediated repression in C. reinhardtii are more similar to those of metazoans compared with the extensive complementarity that is typical of land plants. Individual Chlamydomonas miRNAs may potentially modulate the expression of numerous endogenous targets as a result of these relaxed base-pairing requirements