8 research outputs found

    Sequencing and analysis of the Dalbulus maidis transcriptome

    Auchenorrhynchans (leafhoppers and planthoppers) are exclusively phytophagous insects that can cause significant economic damage to crops. One of the diseases they transmit is corn stunt disease (achaparramiento del maíz), potentially one of the most serious diseases of maize, capable of causing partial or total yield losses in affected areas. In Argentina, Dalbulus maidis (Hemiptera: Auchenorrhyncha) is the only known field vector of Spiroplasma kunkelii, the causal pathogen of corn stunt. Given its importance as an agricultural pest, the transcriptome of every stage of this insect's life cycle was sequenced (eggs, five nymphal instars, and two adult samples). A pool of insects was used to capture as many expressed genes as possible. Because genomic information for Dalbulus maidis is not available, the assembly was performed de novo. Assemblies produced with three programs were compared: VELVET OASES, ABySS, and Trinity. They were evaluated using metrics (N50, contig length) and coverage measures (CEG, BUSCO). Based on these analyses, developmental genes were searched for in the VELVET OASES and Trinity assemblies. The total percentage of genes found was higher for the Trinity assembly. In light of these results, the remaining samples were assembled with Trinity, yielding very good metric and coverage values. In addition, the transcriptomes were compared with published proteomes as a measure of homology between species. In this work, different de novo assembly methods were compared and the one best suited to our data and experiments was selected.
    Affiliations: Palacio, Victorio Gabriel. Universidad Nacional del Noroeste de la Provincia de Buenos Aires. Lavore, Andrés. Universidad Nacional del Noroeste de la Provincia de Buenos Aires. Catalano, María Inés. Universidad Nacional del Noroeste de la Provincia de Buenos Aires. Rivera Pomar, Rolando. Universidad Nacional del Noroeste de la Provincia de Buenos Aires.
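    The abstract above evaluates assemblies with N50, among other metrics. As a rough illustration only, the sketch below computes N50 from a list of contig lengths using the usual definition (the contig length at which the cumulative sum of lengths, sorted longest first, reaches half of the total assembled bases). The function name `n50` and the toy lengths are assumptions made for this example and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical, not data from the abstract).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

    N50 rewards long contigs but says nothing about correctness, which is presumably why the abstract pairs it with coverage measures such as CEG and BUSCO.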

    Consensus Ensemble Approaches Improve De Novo Transcriptome Assemblies

    Accurate and comprehensive transcriptome assemblies lay the foundation for a range of analyses, such as differential gene expression analysis, metabolic pathway reconstruction, novel gene discovery, or metabolic flux analysis. With the arrival of next-generation sequencing technologies, it has become possible to acquire whole-transcriptome data rapidly, even from non-model organisms. However, the problem of accurately assembling the transcriptome for any given sample remains extremely challenging, especially in species with a high prevalence of recent gene or genome duplications, those with alternative splicing of transcripts, or those whose genomes are not well studied. This thesis provides a detailed overview of the strategies used for transcriptome assembly. It reviews the statistics available for measuring the quality of transcriptome assemblies, with emphasis on the types of errors each statistic does and does not detect, and describes simulation protocols that computationally generate RNAseq data presenting biologically realistic problems such as gene expression bias and alternative splicing. Using such simulated RNAseq data, a comparison of the accuracy, strengths, and weaknesses of seven representative assemblers, including de novo and genome-guided methods, shows that all of the assemblers individually struggle to accurately reconstruct the expressed transcriptome, especially for alternative splice forms. Using a consensus of several de novo assemblers can overcome many of the weaknesses of individual assemblers, generating an ensemble assembly with higher accuracy than any individual assembler. Advisor: Jitender S. Deogu

    The Oyster River Protocol: a multi-assembler and kmer approach for de novo transcriptome assembly

    Characterizing transcriptomes in non-model organisms has resulted in a massive increase in our understanding of biological phenomena. This boon, largely made possible via high-throughput sequencing, means that studies of functional, evolutionary, and population genomics are now being done by hundreds or even thousands of labs around the world. For many, these studies begin with a de novo transcriptome assembly, which is a technically complicated process involving several discrete steps. The Oyster River Protocol (ORP), described here, implements a standardized and benchmarked set of bioinformatic processes, resulting in an assembly of higher quality than those produced by other standard assembly methods. Specifically, ORP-produced assemblies have higher Detonate and TransRate scores and mapping rates, largely because the protocol leverages a multi-assembler and multi-kmer assembly process, thereby bypassing the shortcomings of any one approach. These improvements are important, as previously unassembled transcripts are included in ORP assemblies, resulting in a significant enhancement of the power of downstream analysis. Further, as part of this study, I show that above 30 million reads, assembly quality is unrelated to the number of reads generated. Code Availability: The version-controlled open-source code is available at https://github.com/macmanes-lab/Oyster_River_Protocol. Instructions for software installation and use, and other details are available at http://oyster-river-protocol.rtfd.org/

    Transcriptome Analysis for Non-Model Organism: Current Status and Best-Practices

    Since transcriptome analysis provides genome-wide sequence and gene expression information, transcript reconstruction from RNA-Seq reads has become popular in recent years. For non-model organisms, where reference genome-based mapping is not an option, sequence reads are processed via de novo transcriptome assembly approaches to produce large numbers of contigs corresponding to the coding or non-coding, but expressed, parts of the genome. Despite the immense potential of RNA-Seq-based methods, particularly in recovering full-length transcripts and spliced isoforms from short reads, accurate results can only be obtained by following the procedures step by step. In this chapter, we aim to provide an overview of the state-of-the-art methods including (i) quality check and pre-processing of raw reads, (ii) the pros and cons of de novo transcriptome assemblers, (iii) generating non-redundant transcript data, (iv) current quality assessment tools for de novo transcriptome assemblies, (v) approaches for transcript abundance and differential expression estimation, and finally (vi) further mining of transcriptomic data for particular biological questions. Our intention is to provide an overview and practical guidance for choosing the appropriate approaches to best meet the needs of researchers in this area, and also to outline strategies to improve ongoing projects.

    A consensus‑based ensemble approach to improve transcriptome assembly

    Background: Systems-level analyses, such as differential gene expression analysis, co-expression analysis, and metabolic pathway reconstruction, depend on the accuracy of the transcriptome. Multiple tools exist to perform transcriptome assembly from RNAseq data. However, assembling high-quality transcriptomes is still not a trivial problem. This is especially the case for non-model organisms, where adequate reference genomes are often not available. Different methods produce different transcriptome models and there is no easy way to determine which are more accurate. Furthermore, alternative-splicing events exacerbate these difficult assembly problems. While benchmarking transcriptome assemblies is critical, this is also not trivial due to the general lack of true reference transcriptomes. Results: In this study, we first provide a pipeline to generate a simulated benchmark transcriptome and corresponding RNAseq data. Using the simulated benchmarking datasets, we compared the performance of various transcriptome assembly approaches, including both de novo and genome-guided methods. The results showed that assembly performance deteriorates significantly when alternative transcripts (isoforms) exist or, for genome-guided methods, when the reference is not available from the same genome. To improve transcriptome assembly performance by leveraging the overlapping predictions between different assemblies, we present a new consensus-based ensemble transcriptome assembly approach, ConSemble. Conclusions: Without using a reference genome, ConSemble using four de novo assemblers achieved an accuracy up to twice as high as any of the de novo assemblers we compared. When a reference genome is available, ConSemble using four genome-guided assemblies removed many incorrectly assembled contigs with minimal impact on correctly assembled contigs, achieving higher precision and accuracy than individual genome-guided methods. Furthermore, ConSemble using de novo assemblers matched or exceeded the best-performing genome-guided assemblers even when the transcriptomes included isoforms. We thus demonstrated that the ConSemble consensus strategy, for both de novo and genome-guided assemblers, can improve transcriptome assembly. The RNAseq simulation pipeline, the benchmark transcriptome datasets, and the script to perform the ConSemble assembly are all freely available from: http://bioinfolab.unl.edu/emlab/consemble/
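    The consensus idea described above, keeping predictions on which several assemblers overlap, can be illustrated with a toy sketch. This is not the ConSemble script: the function name `consensus_contigs`, the `min_support` threshold, exact string matching between sequences, and the input data are all assumptions made for this example.

```python
from collections import Counter


def consensus_contigs(assemblies, min_support=3):
    """Keep sequences reported by at least `min_support` assemblers.

    `assemblies` maps an assembler name to the list of sequences it
    produced. Matching is by exact string identity, which is a
    simplifying assumption for this sketch.
    """
    support = Counter()
    for contigs in assemblies.values():
        # A set ensures each assembler votes at most once per sequence.
        support.update(set(contigs))
    return sorted(seq for seq, votes in support.items() if votes >= min_support)


# Hypothetical outputs from four unnamed assemblers.
toy_assemblies = {
    "assembler_a": ["ATG", "CCC"],
    "assembler_b": ["ATG", "GGG"],
    "assembler_c": ["ATG", "CCC"],
    "assembler_d": ["ATG", "CCC", "TTT"],
}
print(consensus_contigs(toy_assemblies))
```

    Raising `min_support` trades recall for precision: sequences produced by only one assembler are dropped, which is the behaviour a voting scheme of this kind is meant to provide.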

    De novo transcriptomics and its use in non-model organisms

    The rise of second-generation sequencing enabled the study of non-model organisms. Without the requirement of having a reference genome, de novo transcriptomics allows the study of functional elements of their genomes. That way, the great complexity of non-model organisms can be explored. This thesis gives a comprehensive overview of the de novo transcriptomics experiment workflow from a bioinformatics perspective. Emphasis is placed on both theoretical background and practical approaches. This work also highlights new methods in de novo transcriptomics which may start to dominate in the near future. The practical part of the work presents transXpress, a de novo transcriptome assembly and annotation pipeline. Its use is demonstrated on long pepper (Piper longum), a non-model plant with medicinal potential. Keywords: transcriptomics, de novo transcriptomics, transcriptome, RNA-Seq, non-model organism, assembly. Department of Cell Biology (Katedra buněčné biologie), Faculty of Science (Přírodovědecká fakulta)

    De novo Nd-1 genome assembly reveals genomic diversity of Arabidopsis thaliana and facilitates genome-wide non-canonical splice site analysis across plant species

    Pucker B. De novo Nd-1 genome assembly reveals genomic diversity of Arabidopsis thaliana and facilitates genome-wide non-canonical splice site analysis across plant species. Bielefeld: Universität Bielefeld; 2019

    A Systems Biology approach to understanding and monitoring chemical toxicity in the environment

    Chemicals pose a continuous, everyday hazard to both human health and the environment. Unfortunately, information about the mode of action (MoA) of most of these compounds is limited. Approaches able to elucidate chemical mechanisms of action are needed in order to improve risk assessment. Environmental omics aims to provide tools and methodologies to address these goals. Omics technologies, in combination with systems biology approaches, have the potential to provide a powerful toolbox for understanding a chemical's mode of action and, consequently, the outcomes these compounds trigger. The work presented in this thesis demonstrates the effectiveness of such an approach in the context of environmentally relevant species. More specifically, I focused on characterizing the toxicity mechanisms of single chemicals and chemical classes in zebrafish (Danio rerio) embryos and in a rainbow trout gill cell line, and I demonstrated that the transcriptional state of an in vitro system exposed to a panel of environmentally relevant chemicals can be used as a biosensor to predict toxicity in an in vivo system. I also developed a computational model of ovary development in largemouth bass (Micropterus salmoides) and used it to successfully identify chemical compounds able to affect reproduction. Lastly, I developed a method to identify novel endocrine-disrupting compounds in Daphnia magna, supporting the use of this species for rapid screening in risk assessment. My results demonstrate the potential of systems biology and data-driven science for identifying novel mechanisms of environmental toxicity and for developing a set of biomarkers for monitoring purposes. Further development building on these findings could potentially lead to improvements in risk assessment.