CAMERA: A Community Resource for Metagenomics

Author: Angly
DeLong
Kannan
Larry Smarr
Marvin Frazier
Paul Gilna
Pruitt
Rekha Seshadri
Rusch
Saul A Kravitz
Smarr
Taesombut
Yooseph
Publication venue: Public Library of Science
Publication date: 01/01/2007

The CAMERA (Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis) community database for metagenomic data deposition is an important first step in developing methods for monitoring microbial communities

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

The Diploid Genome Sequence of an Individual Human

Presented here is a genome sequence of an individual human. It was produced from ∼32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels)(1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor, however they involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information
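The variant figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is not from the paper: it simply sums the five event counts stated above (segmental duplications and copy number variation regions are left out because the abstract gives no counts for them) and computes the non-SNP share of events. The dictionary keys and the helper name `non_snp_share` are illustrative.

```python
# Event counts exactly as quoted in the abstract above.
# Segmental duplications and CNV regions are omitted: no counts are given.
counts = {
    "snp": 3_213_401,
    "block_substitution": 53_823,
    "heterozygous_indel": 292_102,
    "homozygous_indel": 559_473,
    "inversion": 90,
}


def non_snp_share(event_counts):
    """Return the fraction of counted events that are not SNPs."""
    total = sum(event_counts.values())
    return (total - event_counts["snp"]) / total


total_events = sum(counts.values())
share = non_snp_share(counts)

print(f"total counted events: {total_events:,}")
print(f"non-SNP share of events: {share:.1%}")
```

Under these assumptions the total is consistent with the "more than 4.1 million" variants reported, and the non-SNP share rounds to the 22% stated in the abstract.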

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Diposit Digital de la Universitat de Barcelona

ScholarBank@NUS

The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific

The world's oceans contain a complex mixture of micro-organisms that are for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand km transect from the North Atlantic through the Panama Canal and ending in the South Pacific yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed “fragment recruitment,” addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed “extreme assembly,” made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference. 
We present novel methods for measuring the genomic similarity between metagenomic samples and show how they may be grouped into several community types. Specific functional adaptations can be identified both within individual ribotypes and across the entire community, including proteorhodopsin spectral tuning and the presence or absence of the phosphate-binding gene PstS

Public Library of Science (PLOS)

Crossref

Repositorio Institucional de la Universidad de Costa Rica

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

CAMERA: A Community Resource for Metagenomics

Author: Frazier Marvin
Gilna Paul
Kravitz Saul A
Seshadri Rekha
Smarr Larry
Publication venue: eScholarship, University of California
Publication date: 01/01/2007

Microbes are responsible for most of the chemical transformations that are crucial to sustaining life on Earth. Their ability to inhabit almost any environmental niche suggests that they possess an incredible diversity of physiological capabilities. However, we have little to no information on a majority of the millions of microbial species that are predicted to exist, mainly because of our inability to culture them in the laboratory. A growing discipline called metagenomics allows us to study these uncultured organisms by deciphering their genetic information from DNA that is extracted directly from their environment, thus effectively bypassing the laboratory culture step. Metagenomics allows us to address the questions “who's there?”, “what are they doing?”, and “how are they doing it?”, offering insights into the evolutionary history as well as previously unrecognized physiological abilities of uncultured communities

eScholarship - University of California

Schematic of Intended Core Functions of the CAMERA Project

Author: Larry Smarr (379158)
Marvin Frazier (38235)
Paul Gilna (53476)
Rekha Seshadri (56912)
Saul A Kravitz (79045)

CBD, Convention on Biological Diversity.

FigShare

CAMERA Fragment Recruitment Viewer

Author: Larry Smarr (379158)
Marvin Frazier (38235)
Paul Gilna (53476)
Rekha Seshadri (56912)
Saul A Kravitz (79045)

This tool graphically displays the results of a BLASTN sequence comparison of an available microbial genome against selected sequence read datasets. The example shown displays the abundance and distribution of Synechococcus spp. genome sequence in the selected sampling sites. The Synechococcus spp. genome coordinates are shown on the x-axis, while the y-axis shows the percent identity scores of the alignment to the selected Sargasso Sea and GOS sequence reads. The viewer incorporates metadata associated with the reads, allowing a user to quickly identify data of interest for further examination. The utility of the plot is to examine the biogeography and genomic variation of abundant microbes when a close reference genome exists.
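The plot described above can be approximated from BLASTN tabular output. The following is a minimal sketch, not the CAMERA viewer's actual code: it assumes the reads were the BLAST queries and the reference genome the subject, and that the hits are in the standard 12-column BLAST+ tabular layout (`-outfmt 6`). The function name `recruitment_points`, the `min_identity` parameter and the example hits are hypothetical.

```python
from typing import Iterable, List, Tuple


def recruitment_points(
    blast_lines: Iterable[str], min_identity: float = 0.0
) -> List[Tuple[str, float, float]]:
    """Turn BLAST tabular hits into (read_id, genome_position, percent_identity).

    Assumed columns (BLAST+ -outfmt 6): qseqid sseqid pident length mismatch
    gapopen qstart qend sstart send evalue bitscore.
    """
    points = []
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        read_id = fields[0]
        pident = float(fields[2])
        sstart, send = int(fields[8]), int(fields[9])
        if pident < min_identity:
            continue
        # x: midpoint of the hit on the genome (works for either strand)
        # y: percent identity of the alignment
        points.append((read_id, (sstart + send) / 2, pident))
    return points


# Made-up hits for illustration only; not real CAMERA or GOS data.
example = [
    "read_1\tgenome\t98.50\t800\t12\t0\t1\t800\t1000\t1799\t0.0\t1400",
    "read_2\tgenome\t72.10\t650\t181\t2\t5\t654\t52000\t51351\t1e-90\t330",
]
points = recruitment_points(example, min_identity=50.0)
for read_id, x, y in points:
    print(read_id, x, y)
```

Each tuple is one dot on the recruitment plot. Joining `read_id` to per-read metadata (for example sampling site) would allow the dots to be coloured by sample, which is the kind of filtering the viewer description mentions.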

FigShare

A Whole-Genome Assembly of Drosophila

Author: Randall A. Bolanos
Art L. Delcher
Clark M. Mobarry
Dan P. Fasulo
Eric L. Anson
Eugene W. Myers
Gerald M. Rubin
Granger G. Sutton
Ian M. Dew
J. Craig Venter
Karin A. Remington
Knut H. J. Reinert
Mark D. Adams
Michael J. Flanigan
Qing Zhang
Saul A. Kravitz
Xiangqun Zheng

We report on the quality of a whole-genome assembly of Drosophila melanogaster and the nature of the computer algorithms that accomplished it. Three independent external data sources essentially agree with and support the assembly’s sequence and ordering of contigs across the euchromatic portion of the genome. In addition, there are isolated contigs that we believe represent nonrepetitive pockets within the heterochromatin of the centromeres. Comparison with a previously sequenced 2.9-megabase region indicates that sequencing accuracy within nonrepetitive segments is greater than 99.99% without manual curation. As such, this initial reconstruction of the Drosophila sequence should be of substantial value to the scientific community. The primary obstacle to determining the sequence of a very large genome is that, with current technology, one can directly determine the sequence of at most a thousan

CiteSeerX

The Diploid Genome Sequence of an Individual Human

Author: Abril Ferrando Josep Francesc, 1970-
Axelrod Nelson
Bafna Vineet
Busam Dana A.
Denisov Gennady
Feuk Lars
Halpern Aaron L.
Huang Jiaqi
Kirkness Ewen F.
Kravitz Saul A.
Levy Samuel
Lin Yuan
Macdonald Jeffrey R.
Ng Pauline C.
Pang Andy Wing Chun
Shago Mary
Stockwell Timothy B.
Sutton Granger
Tsiamouri Alexia
Walenz Brian P.
Publication venue: Public Library of Science (PLoS)

RECERCAT