20 research outputs found

    Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes

    BACKGROUND: Genomics studies are being revolutionized by next-generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived specifically to address the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired-end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented on a website, which includes a data graphing tool, all created to help the user rapidly compare the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly further.
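The idea of generating simulated paired-end reads with a chosen spacing can be illustrated with a minimal Python sketch. This is not the Plantagora platform's code: the function names, the error-free reads, and the fixed `insert_size` (taken here as the outer distance spanned by a pair) are all assumptions made for illustration only.

```python
import random


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def simulate_paired_reads(genome, n_pairs, read_len, insert_size, seed=0):
    """Sample error-free read pairs from `genome` (hypothetical helper).

    Each pair comes from one fragment of length `insert_size`: the forward
    read is the fragment's first `read_len` bases, and the reverse read is
    the reverse complement of its last `read_len` bases.
    """
    if not read_len <= insert_size <= len(genome):
        raise ValueError("need read_len <= insert_size <= len(genome)")
    rng = random.Random(seed)  # seeded so that runs are reproducible
    pairs = []
    for _ in range(n_pairs):
        # randint is inclusive, so the fragment always fits in the genome
        start = rng.randint(0, len(genome) - insert_size)
        fragment = genome[start:start + insert_size]
        forward = fragment[:read_len]
        reverse = reverse_complement(fragment[-read_len:])
        pairs.append((start, forward, reverse))
    return pairs


if __name__ == "__main__":
    toy_genome = "ACGT" * 50  # toy sequence, not real plant data
    for start, fwd, rev in simulate_paired_reads(toy_genome, 3, 10, 40):
        print(start, fwd, rev)
```

Varying `insert_size` across runs mimics the different paired-end spacings described above; a realistic simulator would also model sequencing errors and a distribution of fragment lengths, both of which this sketch omits.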

    Comparison of the contributions of the nuclear and cytoplasmic compartments to global gene expression in human cells

    BACKGROUND: In the most general sense, studies involving global analysis of gene expression aim to provide a comprehensive catalog of the components involved in the production of recognizable cellular phenotypes. These studies are often limited by the available technologies. One technology, based on microarrays, categorizes gene expression in terms of the abundance of RNA transcripts, and typically employs RNA prepared from whole cells, where cytoplasmic RNA predominates. RESULTS: Using microarrays comprising oligonucleotide probes that represent either protein-coding transcripts or microRNAs (miRNA), we have studied global transcript accumulation patterns for the HepG2 (human hepatoma) cell line. Through subdividing the total pool of RNA transcripts into samples from nuclei, the cytoplasm, and whole cells, we determined the degree of correlation of these patterns across these different subcellular locations. The transcript and miRNA abundance patterns for the three RNA fractions were largely similar, but with some exceptions: nuclear RNA samples were enriched with respect to the cytoplasm in transcripts encoding proteins associated with specific nuclear functions, such as the cell cycle, mitosis, and transcription. The cytoplasmic RNA fraction also was enriched, when compared to the nucleus, in transcripts for proteins related to specific nuclear functions, including the cell cycle, DNA replication, and DNA repair. Some transcripts related to the ubiquitin cycle, and transcripts for various membrane proteins were sorted into either the nuclear or cytoplasmic fractions. CONCLUSION: Enrichment or compartmentalization of cell cycle and ubiquitin cycle transcripts within the nucleus may be related to the regulation of their expression, by preventing their translation to proteins. In this way, these cellular functions may be tightly controlled by regulating the release of mRNA from the nucleus and thereby the expression of key rate-limiting steps in these pathways. Many miRNA precursors were also enriched in the nuclear samples, with significantly fewer being enriched in the cytoplasm. Studies of mRNA localization will help to clarify the roles RNA processing and transport play in the regulation of cellular function.

    Persistent Infection and Promiscuous Recombination of Multiple Genotypes of an RNA Virus within a Single Host Generate Extensive Diversity

    Recombination and reassortment of viral genomes are major processes contributing to the creation of new, emerging viruses. These processes are especially significant in long-term persistent infections where multiple viral genotypes co-replicate in a single host, generating abundant genotypic variants, some of which may possess novel host-colonizing and pathogenicity traits. In some plants, successive vegetative propagation of infected tissues and introduction of new genotypes of a virus by vector transmission allow viral populations to increase in complexity over hundreds of years, permitting co-replication and subsequent recombination of the multiple viral genotypes. Using a resequencing microarray, we examined a persistent infection by a Citrus tristeza virus (CTV) complex in citrus, a vegetatively propagated, globally important fruit crop, and found that the complex comprised three major and a number of minor genotypes. Subsequent deep sequencing analysis of the viral population confirmed the presence of the three major CTV genotypes and, in addition, revealed that the minor genotypes consisted of an extraordinarily large number of genetic variants generated by promiscuous recombination between the major genotypes. Further analysis provided evidence that some of the recombinants underwent subsequent divergence, further increasing the genotypic complexity. These data demonstrate that persistent infection of multiple viral genotypes within a host organism is sufficient to drive the large-scale production of viral genetic variants that may evolve into new and emerging viruses.

    Production of N-sulfated polysaccharides using yeast-expressed N-deacetylase/N-sulfotransferase-1 (NDST-1)

    N-deacetylase/N-sulfotransferase-1 (NDST-1) is a critical enzyme involved in heparan sulfate/heparin biosynthesis. This dual-function enzyme modifies the GlcNAc-GlcA disaccharide repeating sugar backbone to make N-sulfated heparosan. N-sulfation is an absolute requirement for the subsequent epimerization and O-sulfation steps in heparan sulfate/heparin biosynthesis. We have expressed rat liver (r) NDST-1 in Saccharomyces cerevisiae as a soluble protein. The yeast-expressed enzyme has both N-deacetylase and N-sulfotransferase activities. N-acetyl heparosan, isolated from Escherichia coli K5 polysaccharide, de-N-sulfated heparin and completely desulfated N-acetylated heparan sulfate are all good substrates for the rNDST-1. However, N-desulfated, N-acetylated heparin is a poor substrate. The rNDST-1 was partially purified on heparin Sepharose CL-6B. Purified rNDST-1 requires Mn2+ for its enzymatic activity, can utilize PAPS regenerated in vitro by the PAPS cycle (PAP plus para-nitrophenylsulfate in the presence of arylsulfotransferase IV), and with the addition of exogenous PAPS is capable of producing 60–65% N-sulfated heparosan from E. coli K5 polysaccharide or Pasteurella multocida polysaccharide. Key words: heparan sulfate/K5 polysaccharide/NDST-1/PAPS cycle/yeast

    Comparison of the contributions of the nuclear and cytoplasmic compartments to global gene expression in human cells-0

    Copyright information: Taken from "Comparison of the contributions of the nuclear and cytoplasmic compartments to global gene expression in human cells", http://www.biomedcentral.com/1471-2164/8/340. BMC Genomics 2007;8:340. Published online 25 Sep 2007. PMCID: PMC2048942.

    [...] calculated for the human genomic microarray data by ANOVA and by selection of those with a FDR less than 0.05. These two lists were submitted independently for analysis by GOToolbox [27] to determine the cell component annotation of the transcripts, and to determine whether some of the annotation categories were overrepresented on the lists, using the hypergeometric test with Benjamini and Hochberg FDR calculation. Some of the categories with strong representation among the transcripts are presented here. The transcripts that were placed in each category are identified by gene name or abbreviated TREMBL identifier and color-coded to indicate the ratio of the log, mean, normalized intensity values of the nuclear sample over the cytoplasmic sample. Where the lists for nuclear or cytoplasmic transcripts show overrepresentation in a GO category, the FDR is provided.
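The overrepresentation analysis named in this caption, a hypergeometric test followed by Benjamini and Hochberg FDR adjustment, can be sketched in plain Python. This is a generic illustration of the two standard procedures, not GOToolbox's implementation; the function names and every number in the usage section are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Upper-tail hypergeometric p-value P(X >= hits).

    population: transcripts on the array (background)
    annotated:  background transcripts carrying the GO category
    sample:     transcripts on the list being tested
    hits:       list transcripts carrying the GO category
    """
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(population, sample)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: (annotated in background, hits on the list)
    categories = {"nucleus": (300, 40), "membrane": (500, 12), "ribosome": (80, 9)}
    population, sample = 5000, 200  # hypothetical array and list sizes
    names = list(categories)
    pvals = [hypergeom_enrichment_p(population, a, sample, h)
             for a, h in (categories[n] for n in names)]
    for name, p, q in zip(names, pvals, benjamini_hochberg(pvals)):
        print(f"{name}: p={p:.3g} FDR={q:.3g} significant={q < 0.05}")
```

The 0.05 cutoff in the usage section mirrors the FDR threshold mentioned in the caption; the adjustment is applied across all categories tested for one list, so each list (nuclear or cytoplasmic) would be adjusted separately under this sketch.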