Genome modeling system: A knowledge management platform for genomics
In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms
A framework for human microbiome research
A variety of microbial communities and their genes (the microbiome) exist throughout the human body, with fundamental roles in human health and disease. The National Institutes of Health (NIH)-funded Human Microbiome Project Consortium has established a population-scale framework to develop metagenomic protocols, resulting in a broad range of quality-controlled resources and data including standardized methods for creating, processing and interpreting distinct types of high-throughput metagenomic data available to the scientific community. Here we present resources from a population of 242 healthy adults sampled at 15 or 18 body sites up to three times, which have generated 5,177 microbial taxonomic profiles from 16S ribosomal RNA genes and over 3.5 terabases of metagenomic sequence so far. In parallel, approximately 800 reference strains isolated from the human body have been sequenced. Collectively, these data represent the largest resource describing the abundance and variety of the human microbiome, while providing a framework for current and future studies
Structure, function and diversity of the healthy human microbiome
Author Posting. © The Authors, 2012. This article is posted here by permission of Nature Publishing Group. The definitive version was published in Nature 486 (2012): 207-214, doi:10.1038/nature11234. Studies of the human microbiome have revealed that even healthy individuals differ remarkably in the microbes that occupy habitats such as the gut, skin and vagina. Much of this diversity remains unexplained, although diet, environment, host genetics and early microbial exposure have all been implicated. Accordingly, to characterize the ecology of human-associated microbial communities, the Human Microbiome Project has analysed the largest cohort and set of distinct, clinically relevant body habitats so far. We found the diversity and abundance of each habitat’s signature microbes to vary widely even among healthy subjects, with strong niche specialization both within and among individuals. The project encountered an estimated 81–99% of the genera, enzyme families and community configurations occupied by the healthy Western microbiome. Metagenomic carriage of metabolic pathways was stable among individuals despite variation in community structure, and ethnic/racial background proved to be one of the strongest associations of both pathways and microbes with clinical metadata. These results thus delineate the range of structural and functional configurations normal in the microbial communities of a healthy population, enabling future characterization of the epidemiology, ecology and translational applications of the human microbiome. This research was supported in
part by National Institutes of Health grants U54HG004969 to B.W.B.; U54HG003273
to R.A.G.; U54HG004973 to R.A.G., S.K.H. and J.F.P.; U54HG003067 to E.S.Lander;
U54AI084844 to K.E.N.; N01AI30071 to R.L.Strausberg; U54HG004968 to G.M.W.;
U01HG004866 to O.R.W.; U54HG003079 to R.K.W.; R01HG005969 to C.H.;
R01HG004872 to R.K.; R01HG004885 to M.P.; R01HG005975 to P.D.S.;
R01HG004908 to Y.Y.; R01HG004900 to M.K.Cho and P. Sankar; R01HG005171 to
D.E.H.; R01HG004853 to A.L.M.; R01HG004856 to R.R.; R01HG004877 to R.R.S. and
R.F.; R01HG005172 to P. Spicer; R01HG004857 to M.P.; R01HG004906 to T.M.S.;
R21HG005811 to E.A.V.; M.J.B. was supported by UH2AR057506; G.A.B. was
supported by UH2AI083263 and UH3AI083263 (G.A.B., C. N. Cornelissen, L. K. Eaves
and J. F. Strauss); S.M.H. was supported by UH3DK083993 (V. B. Young, E. B. Chang,
F. Meyer, T. M. S., M. L. Sogin, J. M. Tiedje); K.P.R. was supported by UH2DK083990 (J.
V.); J.A.S. and H.H.K. were supported by UH2AR057504 and UH3AR057504 (J.A.S.);
DP2OD001500 to K.M.A.; N01HG62088 to the Coriell Institute for Medical Research;
U01DE016937 to F.E.D.; S.K.H. was supported by RC1DE0202098 and
R01DE021574 (S.K.H. and H. Li); J.I. was supported by R21CA139193 (J.I. and
D. S. Michaud); K.P.L. was supported by P30DE020751 (D. J. Smith); Army Research
Office grant W911NF-11-1-0473 to C.H.; National Science Foundation grants NSF
DBI-1053486 to C.H. and NSF IIS-0812111 to M.P.; The Office of Science of the US
Department of Energy under Contract No. DE-AC02-05CH11231 for P.S.C.; LANL
Laboratory-Directed Research and Development grant 20100034DR and the US
Defense Threat Reduction Agency grants B104153I and B084531I to P.S.C.; Research
Foundation - Flanders (FWO) grant to K.F. and J.Raes; R.K. is an HHMI Early Career
Scientist; Gordon & Betty Moore Foundation funding and institutional funding from the
J. David Gladstone Institutes to K.S.P.; A.M.S. was supported by fellowships provided by
the Rackham Graduate School and the NIH Molecular Mechanisms in Microbial
Pathogenesis Training Grant T32AI007528; a Crohn’s and Colitis Foundation of
Canada Grant in Aid of Research to E.A.V.; 2010 IBM Faculty Award to K.C.W.; analysis
of the HMP data was performed using National Energy Research Scientific Computing
resources, the BluBioU Computational Resource at Rice University
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead
A Journey into the City. Migrant Workers' Relation with the Urban Space and Struggle for Existence in Xu Zechen's Early Jingpiao Fiction
In contemporary China, rural-urban migrants constitute a new urban subject with entirely new identity-related issues. This study aims at demonstrating how literature can be a valid field in investigating such evolving subjectivities, through an analysis of Xu Zechen’s early novellas depicting migrants’ vicissitudes in Beijing. Combining a close reading of the texts and a review of the main social problems characterising rural-urban migration in China, this paper focuses on the representation of the identity crisis within the migrant self in Xu’s stories, taking into account the network of meanings employed by the writer to signify the objective and subjective tension between the city and the countryside
Genome remodelling in a basal-like breast cancer metastasis and xenograft
Massively parallel DNA sequencing technologies provide an unprecedented ability to screen entire genomes for genetic changes associated with tumour progression. Here we describe the genomic analyses of four DNA samples from an African-American patient with basal-like breast cancer: peripheral blood, the primary tumour, a brain metastasis and a xenograft derived from the primary tumour. The metastasis contained two de novo mutations and a large deletion not present in the primary tumour, and was significantly enriched for 20 shared mutations. The xenograft retained all primary tumour mutations and displayed a mutation enrichment pattern that resembled the metastasis. Two overlapping large deletions, encompassing CTNNA1, were present in all three tumour samples. The differential mutation frequencies and structural variation patterns in metastasis and xenograft compared with the primary tumour indicate that secondary tumours may arise from a minority of cells within the primary tumour