87 research outputs found

    Improving transcriptome assembly through error correction of high-throughput sequence reads

    The study of functional genomics, particularly in non-model organisms, has been dramatically improved over the last few years by the use of transcriptomes and RNAseq. While these studies are potentially extremely powerful, a computationally intensive procedure, the de novo construction of a reference transcriptome, must be completed as a prerequisite to further analyses. An accurate reference is critically important, as all downstream steps, including estimating transcript abundance, depend on it. Though a substantial amount of research has been done on assembly, only recently have the pre-assembly procedures been studied in detail. Specifically, several stand-alone error correction modules have been reported on, and while they have been shown to be effective in reducing errors at the level of sequencing reads, how error correction impacts assembly accuracy is largely unknown. Here, we show, via use of a simulated dataset, that applying error correction to sequencing reads has significant positive effects on assembly accuracy, reducing assembly error by nearly 50%, and therefore should be applied to all datasets. Comment: version 3 added PE reads and an empirical dataset.

    The Oyster River Protocol: a multi-assembler and kmer approach for de novo transcriptome assembly

    Characterizing transcriptomes in non-model organisms has resulted in a massive increase in our understanding of biological phenomena. This boon, largely made possible via high-throughput sequencing, means that studies of functional, evolutionary, and population genomics are now being done by hundreds or even thousands of labs around the world. For many, these studies begin with a de novo transcriptome assembly, which is a technically complicated process involving several discrete steps. The Oyster River Protocol (ORP), described here, implements a standardized and benchmarked set of bioinformatic processes, resulting in an assembly with enhanced qualities over other standard assembly methods. Specifically, ORP-produced assemblies have higher Detonate and TransRate scores and mapping rates, largely because the protocol leverages a multi-assembler and kmer assembly process, thereby bypassing the shortcomings of any one approach. These improvements are important, as previously unassembled transcripts are included in ORP assemblies, resulting in a significant enhancement of the power of downstream analysis. Further, as part of this study, I show that assembly quality is unrelated to the number of reads generated, above 30 million reads. Code Availability: The version-controlled open-source code is available at https://github.com/macmanes-lab/Oyster_River_Protocol. Instructions for software installation and use, and other details are available at http://oyster-river-protocol.rtfd.org/

    De novo genome assembly of Geosmithia morbida, the causal agent of thousand cankers disease

    Geosmithia morbida is a filamentous ascomycete that causes thousand cankers disease in the eastern black walnut tree. This pathogen is commonly found in the western U.S.; however, recently the disease was also detected in several eastern states where the black walnut lumber industry is concentrated. G. morbida is one of two known phytopathogens within the genus Geosmithia, and it is vectored into the host tree via the walnut twig beetle. We present the first de novo draft genome of G. morbida. It is 26.5 Mbp in length and contains less than 1% repetitive elements. The genome possesses an estimated 6,273 genes, 277 of which are predicted to encode proteins with unknown functions. Approximately 31.5% of the proteins in G. morbida are homologous to proteins involved in pathogenicity, and 5.6% of the proteins contain signal peptides that indicate these proteins are secreted. Several studies have investigated the evolution of pathogenicity in pathogens of agricultural crops; forest fungal pathogens are often neglected because research efforts are focused on food crops. G. morbida is one of the few tree phytopathogens to be sequenced, assembled and annotated. The first draft genome of G. morbida serves as a valuable tool for comprehending the underlying molecular and evolutionary mechanisms behind pathogenesis within the Geosmithia genus

    Limited Evidence for Parallel Evolution Among Desert-Adapted Peromyscus Deer Mice

    Warming climate and increasing desertification urge the identification of genes involved in heat and dehydration tolerance to better inform and target biodiversity conservation efforts. Comparisons among extant desert-adapted species can highlight parallel or convergent patterns of genome evolution through the identification of shared signatures of selection. We generate a chromosome-level genome assembly for the canyon mouse (Peromyscus crinitus) and test for a signature of parallel evolution by comparing signatures of selective sweeps across population-level genomic resequencing data from another congeneric desert specialist (Peromyscus eremicus) and a widely distributed habitat generalist (Peromyscus maniculatus), that may be locally adapted to arid conditions. We identify few shared candidate loci involved in desert adaptation and do not find support for a shared pattern of parallel evolution. Instead, we hypothesize divergent molecular mechanisms of desert adaptation among deer mice, potentially tied to species-specific historical demography, which may limit or enhance adaptation. We identify a number of candidate loci experiencing selective sweeps in the P. crinitus genome that are implicated in osmoregulation (Trypsin, Prostasin) and metabolic tuning (Kallikrein, eIF2-alpha kinase GCN2, APPL1/2), which may be important for accommodating hot and dry environmental conditions

    Variation in pigmentation gene expression is associated with distinct aposematic color morphs in the poison frog Dendrobates auratus

    Background: Color and pattern phenotypes have clear implications for survival and reproduction in many species. However, the mechanisms that produce this coloration are still poorly characterized, especially at the genomic level. Here we have taken a transcriptomics-based approach to elucidate the underlying genetic mechanisms affecting color and pattern in a highly polytypic poison frog. We sequenced RNA from the skin from four different color morphs during the final stage of metamorphosis and assembled a de novo transcriptome. We then investigated differential gene expression, with an emphasis on examining candidate color genes from other taxa. Results: Overall, we found differential expression of a suite of genes that control melanogenesis, melanocyte differentiation, and melanocyte proliferation (e.g., tyrp1, lef1, leo1, and mitf) as well as several differentially expressed genes involved in purine synthesis and iridophore development (e.g., arfgap1, arfgap2, airc, and gart). Conclusions: Our results provide evidence that several gene networks known to affect color and pattern in vertebrates play a role in color and pattern variation in this species of poison frog

    Differential gene expression and gene variants drive color and pattern development in divergent color morphs of a mimetic poison frog

    Evolutionary biologists have long investigated the ecological contexts, evolutionary forces, and proximate mechanisms that produce the diversity of animal coloration we see in the natural world. In aposematic species, color and pattern is directly tied to survival and thus understanding the origin of the phenotype has been a focus of both theoretical and empirical inquiry. In order to better understand this diversity, we examined gene expression in skin tissue during development in four different color morphs of the aposematic mimic poison frog, Ranitomeya imitator. We identified a suite of candidate color-related genes a priori and identified the pattern of expression in these genes over time, differences in expression of these genes between the mimetic morphs, and genetic variants that differ between color morphs. We identified several candidate color genes that are differentially expressed over time or across populations, as well as a number of color genes with fixed genetic variants between color morphs. Many of the color genes we discovered in our dataset are involved in the canonical Wnt signaling pathway, including several fixed SNPs between color morphs. Further, many genes in this pathway were differentially expressed at different points in development (e.g., lef1, tyr, tyrp1). Importantly, Wnt signaling pathway genes are overrepresented relative to expression in Xenopus tropicalis. Taken together, this provides evidence that the Wnt signaling pathway is contributing to color pattern production in R. imitator, and is an excellent candidate for producing some of the differences in color pattern between morphs. In addition, we found evidence that sepiapterin reductase is likely important in the production of yellow-green coloration in this adaptive radiation. Finally, two iridophore genes (arfap1, gart) draw a strong parallel to previous work in another dendrobatid, indicating that these genes are also strong candidates for differential color production. 
    We have used high throughput sequencing throughout development to examine the evolution of coloration in a rapid mimetic adaptive radiation and found that these divergent color patterns are likely to be affected by a combination of developmental patterns of gene expression, color morph-specific gene expression, and color morph-specific gene variants. Joyner Open Access Publishing Support Fund

    Is Promiscuity Associated with Enhanced Selection on MHC-DQα in Mice (genus Peromyscus)?

    Reproductive behavior may play an important role in shaping selection on Major Histocompatibility Complex (MHC) genes. For example, the number of sexual partners that an individual has may affect exposure to sexually transmitted pathogens, with more partners leading to greater exposure and, hence, potentially greater selection for variation at MHC loci. To explore this hypothesis, we examined the strength of selection on exon 2 of the MHC-DQα locus in two species of Peromyscus. While the California mouse (P. californicus) is characterized by lifetime social and genetic monogamy, the deer mouse (P. maniculatus) is socially and genetically promiscuous; consistent with these differences in mating behavior, the diversity of bacteria present within the reproductive tracts of females is significantly greater for P. maniculatus. To test the prediction that more reproductive partners and exposure to a greater range of sexually transmitted pathogens are associated with enhanced diversifying selection on genes responsible for immune function, we compared patterns and levels of diversity at the Class II MHC-DQα locus in sympatric populations of P. maniculatus and P. californicus. Using likelihood based analyses, we show that selection is enhanced in the promiscuous P. maniculatus. This study is the first to compare the strength of selection in wild sympatric rodents with known differences in pathogen milieu.