
    Structure, function and diversity of the healthy human microbiome

    Studies of the human microbiome have revealed that even healthy individuals differ remarkably in the microbes that occupy habitats such as the gut, skin and vagina. Much of this diversity remains unexplained, although diet, environment, host genetics and early microbial exposure have all been implicated. Accordingly, to characterize the ecology of human-associated microbial communities, the Human Microbiome Project has analysed the largest cohort and set of distinct, clinically relevant body habitats so far. We found the diversity and abundance of each habitat’s signature microbes to vary widely even among healthy subjects, with strong niche specialization both within and among individuals. The project encountered an estimated 81–99% of the genera, enzyme families and community configurations occupied by the healthy Western microbiome. Metagenomic carriage of metabolic pathways was stable among individuals despite variation in community structure, and ethnic/racial background proved to be one of the strongest associations of both pathways and microbes with clinical metadata. These results thus delineate the range of structural and functional configurations normal in the microbial communities of a healthy population, enabling future characterization of the epidemiology, ecology and translational applications of the human microbiome.

    “Fecal microbiome in epidemiologic studies” - Letter

    We congratulate Sinha et al. on their recent report (1) comparing fecal sample collection methods for epidemiologic studies of the gut microbiome. These data contribute to the increasing body of literature describing robust methodological frameworks for specimen collection and processing (2, 3). However, their claim that fixation of stool using RNAlater® results in “considerable changes to the microbiome diversity” contrasts with previous findings (2, 3), including those from their earlier reports (4, 5). We have previously demonstrated that self-collected stool stabilized with RNAlater® or other fixatives yields high fidelity and reproducibility in compositional profiling of DNA and RNA from shotgun sequence data, compared to immediately-frozen specimens (3). Additionally, fixation offers several distinct advantages crucial for large-scale population-based studies: a straightforward self-collection procedure; sample stabilization without deep-freezing during shipping, receiving, and processing; and versatility for multiple molecular analyses. The authors’ finding that specimens preserved in RNAlater® had poor correlation with immediately frozen specimens (1) could be explained, for example, by improper fixation resulting from an excess of specimen relative to preservative volume (1–2 g:2.5 ml, compared to the manufacturer-recommended ratio of 1 g:5–10 ml; Thermo Fisher Scientific Inc., Waltham, MA).

    Graphle: Interactive exploration of large, dense graphs

    Background: A wide variety of biological data can be modeled as network structures, including experimental results (e.g. protein-protein interactions), computational predictions (e.g. functional interaction networks), or curated structures (e.g. the Gene Ontology). While several tools exist for visualizing large graphs at a global level or small graphs in detail, previous systems have generally not allowed interactive analysis of dense networks containing thousands of vertices at a level of detail useful for biologists. Investigators often wish to explore specific portions of such networks from a detailed, gene-specific perspective, and balancing this requirement with the networks' large size, complex structure, and rich metadata is a substantial computational challenge.
    Results: Graphle is an online interface to large collections of arbitrary undirected, weighted graphs, each possibly containing tens of thousands of vertices (e.g. genes) and hundreds of millions of edges (e.g. interactions). These are stored on a centralized server and accessed efficiently through an interactive Java applet. The Graphle applet allows a user to examine specific portions of a graph, retrieving the relevant neighborhood around a set of query vertices (genes). This neighborhood can then be refined and modified interactively, and the results can be saved either as publication-quality images or as raw data for further analysis. The Graphle web site currently includes several hundred biological networks representing predicted functional relationships from three heterogeneous data integration systems: S. cerevisiae data from bioPIXIE, E. coli data using MEFIT, and H. sapiens data from HEFalMp.
    Conclusions: Graphle serves as a search and visualization engine for biological networks, which can be managed locally (simplifying collaborative data sharing) and investigated remotely. The Graphle framework is freely downloadable and easily installed on new servers, allowing any lab to quickly set up a Graphle site from which their own biological network data can be shared online.
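The neighborhood retrieval described in the Graphle abstract can be illustrated with a small sketch. The abstract does not say how the "relevant neighborhood" is chosen, so the ranking rule here (keep the `k` non-query vertices with the largest summed edge weight to the query set) is an assumption made for illustration, and the names `build_graph` and `neighborhood` are hypothetical rather than part of Graphle.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected, weighted adjacency map from (u, v, weight) triples."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return adj


def neighborhood(adj, query, k):
    """Return the query vertices plus the k vertices most strongly tied to them.

    Assumed ranking rule (not taken from the abstract): a candidate's score is
    the sum of its edge weights to all query vertices.
    """
    query = set(query)
    scores = defaultdict(float)
    for q in query:
        for nbr, w in adj.get(q, {}).items():
            if nbr not in query:
                scores[nbr] += w
    # Highest score first; ties broken by vertex name for determinism.
    ranked = sorted(scores, key=lambda v: (-scores[v], v))
    keep = query | set(ranked[:k])
    # Induced subgraph on the kept vertices, each undirected edge listed once.
    sub_edges = sorted(
        (u, v, w)
        for u in keep
        for v, w in adj.get(u, {}).items()
        if v in keep and u < v
    )
    return keep, sub_edges


edges = [
    ("A", "B", 0.9),
    ("A", "C", 0.2),
    ("B", "C", 0.8),
    ("B", "D", 0.7),
    ("C", "E", 0.1),
]
adj = build_graph(edges)
vertices, sub_edges = neighborhood(adj, ["A", "B"], k=1)
```

With the query set `A`, `B` and `k=1`, vertex `C` is kept because its summed weight to the query (0.2 + 0.8) exceeds that of `D` (0.7). A production system handling hundreds of millions of edges would need an on-disk or indexed store rather than an in-memory dictionary.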

    Bayesian nonparametric cross-study validation of prediction methods

    We consider comparisons of statistical learning algorithms using multiple data sets, via leave-one-in cross-study validation: each of the algorithms is trained on one data set; the resulting model is then validated on each remaining data set. This poses two statistical challenges that need to be addressed simultaneously. The first is the assessment of study heterogeneity, with the aim of identifying a subset of studies within which algorithm comparisons can be reliably carried out. The second is the comparison of algorithms using the ensemble of data sets. We address both problems by integrating clustering and model comparison. We formulate a Bayesian model for the array of cross-study validation statistics, which defines clusters of studies with similar properties and provides the basis for meaningful algorithm comparison in the presence of study heterogeneity. We illustrate our approach through simulations involving studies with varying severity of systematic errors, and in the context of medical prognosis for patients diagnosed with cancer, using high-throughput measurements of the transcriptional activity of the tumor’s genes.
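The leave-one-in procedure in the abstract above (train on one data set, validate on each remaining one) produces an array of validation statistics, which can be sketched as follows. This covers only the construction of that array for a single algorithm; it does not implement the Bayesian nonparametric clustering model the abstract proposes. The learner (ordinary least squares), the score (negative mean squared error), the simulated studies, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def fit_least_squares(X, y):
    """Fit ordinary least squares with an intercept; return the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Predict with the intercept-plus-slopes coefficient vector."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def neg_mse(y_true, y_pred):
    """Negative mean squared error, so that larger values mean better fit."""
    return -float(np.mean((y_true - y_pred) ** 2))


def cross_study_matrix(studies, fit, predict, score):
    """Z[i, j] = score of the model trained on study i and validated on study j."""
    k = len(studies)
    Z = np.full((k, k), np.nan)  # diagonal stays NaN: no self-validation
    for i, (X_train, y_train) in enumerate(studies):
        model = fit(X_train, y_train)
        for j, (X_val, y_val) in enumerate(studies):
            if i != j:
                Z[i, j] = score(y_val, predict(model, X_val))
    return Z


rng = np.random.default_rng(0)


def make_study(n, shift):
    """Simulate one study; `shift` mimics a study-specific systematic error."""
    X = rng.normal(size=(n, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + shift + rng.normal(scale=0.1, size=n)
    return X, y


# Two comparable studies and one with a large systematic offset.
studies = [make_study(50, 0.0), make_study(60, 0.0), make_study(40, 5.0)]
Z = cross_study_matrix(studies, fit_least_squares, predict, neg_mse)
```

In this simulation the rows and columns involving the third study show much worse scores than the pair of comparable studies, which is the kind of heterogeneity the abstract's clustering step is meant to detect. Comparing several algorithms would add a third axis to the array, one slice per algorithm.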

    Passing Messages between Biological Networks to Refine Predicted Interactions

    Regulatory network reconstruction is a fundamental problem in computational biology. Reconstruction from individual datasets has significant limitations, and researchers increasingly attempt to construct networks from multiple, independent datasets obtained from complementary sources, but methods for this integration are lacking. We developed PANDA (Passing Attributes between Networks for Data Assimilation), a message-passing model using multiple sources of information to predict regulatory relationships, and used it to integrate protein-protein interaction, gene expression, and sequence motif data to reconstruct genome-wide, condition-specific regulatory networks in yeast as a model. The resulting networks were not only more accurate than those produced using individual data sets and other existing methods, but they also captured information regarding specific biological mechanisms and pathways that were missed using other methodologies. PANDA is scalable to higher eukaryotes, applicable to specific tissue or cell type data and conceptually generalizable to include a variety of regulatory, interaction, expression, and other genome-scale data. An implementation of the PANDA algorithm is available at www.sourceforge.net/projects/panda-net.