9 research outputs found

    Validation of picogram- and femtogram-input DNA libraries for microscale metagenomics

    © 2016 Rinke et al. High-throughput sequencing libraries are typically limited by the requirement for nanograms to micrograms of input DNA. This bottleneck impedes the microscale analysis of ecosystems and the exploration of low biomass samples. Current methods for amplifying environmental DNA to bypass this bottleneck introduce considerable bias into metagenomic profiles. Here we describe and validate a simple modification of the Illumina Nextera XT DNA library preparation kit which allows creation of shotgun libraries from sub-nanogram amounts of input DNA. Community composition was reproducible down to 100 fg of input DNA based on analysis of a mock community comprising 54 phylogenetically diverse Bacteria and Archaea. The main technical issues with the low input libraries were a greater potential for contamination, limited DNA complexity which has a direct effect on assembly and binning, and an associated higher percentage of read duplicates. We recommend a lower limit of 1 pg (~100-1,000 microbial cells) to ensure community composition fidelity, and the inclusion of negative controls to identify reagent-specific contaminants. Applying the approach to marine surface water, pronounced differences were observed between bacterial community profiles of microliter volume samples, which we attribute to biological variation. This result is consistent with expected microscale patchiness in marine communities. We thus envision that our benchmarked, slightly modified low input DNA protocol will be beneficial for microscale and low biomass metagenomics.

    Functional implications of the emergence of alternative splicing in hnRNP A/B transcripts

    The heterogeneous nuclear ribonucleoproteins (hnRNPs) A/B are a family of RNA-binding proteins that participate in various aspects of nucleic acid metabolism, including mRNA trafficking, telomere maintenance, and splicing. They are both regulators and targets of alternative splicing, and the patterns of alternative splicing of their transcripts have diverged between paralogs and between orthologs in different species. Surprisingly, the extent of this splicing variation and its implications for post-transcriptional regulation have remained largely unexplored. Here, we conducted a detailed analysis of hnRNP A/B sequences and expression patterns across six vertebrates. Alternative exons emerged via the introduction of new splice sites, changes in the strengths of existing splice sites, and the accumulation of auxiliary splicing regulatory motifs. Observed isoform expression patterns could be attributed to the frequency and strength of cis-elements. We found a trend toward increased splicing variation in mammals and identified novel alternatively spliced isoforms in human and chicken. Pulldown and translational assays demonstrated that the inclusion of alternative exons altered the affinity of hnRNP A/B proteins for their cognate nucleic acids and modified protein expression levels. As the hnRNPs A/B regulate several key steps in mRNA processing, the involvement of diverse hnRNP isoforms in multiple cellular contexts and species implies concomitant differences in the transcriptional output of these systems. We conclude that the emergence of alternative splicing in the hnRNPs A/B has contributed to the diversification of their roles in the regulation of alternative splicing and has thus added an unexpected layer of regulatory complexity to transcription in vertebrates.

    uPEPperoni: an online tool for upstream open reading frame location and analysis of transcript conservation

    Background: Several small open reading frames located within the 5' untranslated regions of mRNAs have recently been shown to be translated. In humans, about 50% of mRNAs contain at least one upstream open reading frame, representing a large resource of coding potential. We propose that some upstream open reading frames encode peptides that are functional and contribute to proteome complexity in humans and other organisms. We use the term uPEPs to describe peptides encoded by upstream open reading frames.

    Complex evolutionary relationships among four classes of modular RNA-binding splicing regulators in eukaryotes: the hnRNP, SR, ELAV-Like and CELF proteins

    Alternative RNA splicing in multicellular organisms is regulated by a large group of proteins of mainly unknown origin. To predict the functions of these proteins, classification of their domains at the sequence and structural level is necessary. We have focused on four groups of splicing regulators, the heterogeneous nuclear ribonucleoprotein (hnRNP), serine–arginine (SR), embryonic lethal, abnormal vision (ELAV)-like, and CUG-BP and ETR-like factor (CELF) proteins, that show increasing diversity among metazoa. Sequence and phylogenetic analyses were used to obtain a broader understanding of their evolutionary relationships. Surprisingly, when we characterised sequence similarities across full-length sequences and conserved domains of ten metazoan species, we found some hnRNPs were more closely related to SR, ELAV-like and CELF proteins than to other hnRNPs. Phylogenetic analyses and the distribution of the RRM domains suggest that these proteins diversified before the last common ancestor of the metazoans studied here through domain acquisition and duplication to create genes of mixed evolutionary origin. We propose that these proteins were derived independently rather than through the expansion of a single protein family. Our results highlight inconsistencies in the current classification system for these regulators, which does not adequately reflect their evolutionary relationships, and suggest that a domain-based classification scheme may have more utility.

    Sequencing and assembly of low copy and genic regions of isolated Triticum aestivum chromosome arm 7DS

    The genome of bread wheat (Triticum aestivum) is predicted to be greater than 16 Gbp in size and consist predominantly of repetitive elements, making the sequencing and assembly of this genome a major challenge. We have reduced genome sequence complexity by isolating chromosome arm 7DS and applied second-generation technology and appropriate algorithmic analysis to sequence and assemble low copy and genic regions of this chromosome arm. The assembly represents approximately 40% of the chromosome arm and all known 7DS genes. Comparison of the 7DS assembly with the sequenced genomes of rice (Oryza sativa) and Brachypodium distachyon identified large regions of conservation. The syntenic relationship between wheat, B. distachyon and O. sativa, along with available genetic mapping data, has been used to produce an annotated draft 7DS syntenic build, which is publicly available at http://www.wheatgenome.info. Our results suggest that the sequencing of isolated chromosome arms can provide valuable information of the gene content of wheat and is a step towards whole-genome sequencing and variation discovery in this important crop.
    Paul J. Berkman, Adam Skarshewski, Michał T. Lorenc, Kaitao Lai, Chris Duran, Edmund Y.S. Ling, Jiri Stiller, Lars Smits, Michael Imelfort, Sahana Manoli, Megan McKenzie, Marie Kubalákova, Hana Simková, Jacqueline Batley, Delphine Fleury, Jaroslav Doležel and David Edward

    Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes

    Reference genomes are required to understand the diverse roles of microorganisms in ecology, evolution, human and animal health, but most species remain uncultured. Here we present a sequence composition-independent approach to recover high-quality microbial genomes from deeply sequenced metagenomes. Multiple metagenomes of the same community, which differ in relative population abundances, were used to assemble 31 bacterial genomes, including rare (< 1% relative abundance) species, from an activated sludge bioreactor. Twelve genomes were assembled into complete or near-complete chromosomes. Four belong to the candidate bacterial phylum TM7 and represent the most complete genomes for this phylum to date (relative abundances, 0.06-1.58%). Reanalysis of published metagenomes reveals that differential coverage binning facilitates recovery of more complete and higher fidelity genome bins than other currently used methods, which are primarily based on sequence composition. This approach will be an important addition to the standard metagenome toolbox and greatly improve access to genomes of uncultured microorganisms.