4,809 research outputs found

    Mixed membership stochastic blockmodels

    Observations consisting of measurements on relationships for pairs of objects arise in many settings, such as protein interaction and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing such data with probabilistic models can be delicate because the simple exchangeability assumptions underlying many boilerplate models no longer hold. In this paper, we describe a latent variable model of such data called the mixed membership stochastic blockmodel. This model extends blockmodels for relational data to ones which capture mixed membership latent relational structure, thus providing an object-specific low-dimensional representation. We develop a general variational inference algorithm for fast approximate posterior inference. We explore applications to social and protein interaction networks. Comment: 46 pages, 14 figures, 3 tables
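    The abstract above does not spell out the model, so the following is only a minimal sketch of the generative process commonly associated with mixed membership stochastic blockmodels, under stated assumptions: each node draws a membership vector from a Dirichlet prior, each ordered pair draws one latent group per endpoint, and the edge is a Bernoulli draw from a block matrix entry. The names `sample_dirichlet`, `sample_mmsb`, `alpha`, and `block_matrix` are hypothetical and do not come from the paper, and the variational inference algorithm is not shown.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_mmsb(n_nodes, alpha, block_matrix, seed=0):
    """Sample a directed graph from an assumed MMSB-style generative process.

    alpha        -- Dirichlet concentration parameters, one per latent group
    block_matrix -- block_matrix[g][h] is the probability of an edge from a
                    node acting in group g to a node acting in group h
    """
    rng = random.Random(seed)
    k = len(alpha)
    # Each node gets its own mixed membership vector over the k groups.
    memberships = [sample_dirichlet(alpha, rng) for _ in range(n_nodes)]
    adjacency = [[0] * n_nodes for _ in range(n_nodes)]
    for p in range(n_nodes):
        for q in range(n_nodes):
            if p == q:
                continue  # no self-loops in this sketch
            # Pair-specific group indicators for sender and receiver.
            z_send = rng.choices(range(k), weights=memberships[p])[0]
            z_recv = rng.choices(range(k), weights=memberships[q])[0]
            # Edge is a Bernoulli draw with the selected block probability.
            adjacency[p][q] = int(rng.random() < block_matrix[z_send][z_recv])
    return memberships, adjacency


if __name__ == "__main__":
    # Illustrative values only: two groups with dense within-group links.
    blocks = [[0.9, 0.05], [0.05, 0.8]]
    pi, graph = sample_mmsb(6, [0.5, 0.5], blocks, seed=42)
    for row in graph:
        print(row)
```

    Because the group indicators are redrawn for every pair, a node can behave as a member of different groups in different relationships, which is the "mixed membership" idea the abstract refers to.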

    MetAssign: probabilistic annotation of metabolites from LC–MS data using a Bayesian clustering approach

    Motivation: The use of liquid chromatography coupled to mass spectrometry (LC–MS) has enabled the high-throughput profiling of the metabolite composition of biological samples. However, the large amount of data obtained can be difficult to analyse and often requires computational processing to understand which metabolites are present in a sample. This paper looks at the dual problem of annotating peaks in a sample with a metabolite, together with putatively annotating whether a metabolite is present in the sample. The starting point of the approach is a Bayesian clustering of peaks into groups, each corresponding to putative adducts and isotopes of a single metabolite. Results: The Bayesian modelling introduced here combines information from the mass-to-charge ratio, retention time and intensity of each peak, together with a model of the inter-peak dependency structure, to increase the accuracy of peak annotation. The results inherently contain a quantitative estimate of confidence in the peak annotations and allow an accurate trade-off between precision and recall. Extensive validation experiments using authentic chemical standards show that this system is able to produce more accurate putative identifications than other state-of-the-art systems, while at the same time giving a probabilistic measure of confidence in the annotations. Availability: The software has been implemented as part of the mzMatch metabolomics analysis pipeline, which is available for download at http://mzmatch.sourceforge.net/

    A word of caution about biological inference - Revisiting cysteine covalent state predictions

    The success of methods for predicting the redox state of cysteine residues from the sequence environment seemed to validate the basic assumption that this state is mainly determined locally. However, the accuracy of predictions on randomized sequences or of non-cysteine residues remained high, suggesting that these predictions rather capture global features of proteins such as subcellular localization, which depends on composition. This illustrates that even high prediction accuracy is insufficient to validate implicit assumptions about a biological phenomenon. Correctly identifying the relevant underlying biochemical reasons for the success of a method is essential to gain proper biological insights and develop more accurate and novel bioinformatics tools. © 2014 The Authors. Published by Elsevier B.V. on behalf of the Federation of European Biochemical Societies. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).

    Decreased microbial co-occurrence network stability and SCFA receptor level correlates with obesity in African-origin women.

    We compared the gut microbial populations in 100 women from rural Ghana and the urban US [50% lean (BMI < 25 kg/m²) and 50% obese (BMI ≥ 30 kg/m²)] to examine the ecological co-occurrence network topology of the gut microbiota as well as the relationship of short chain fatty acids (SCFAs) with obesity. Ghanaians consumed significantly more dietary fiber, had greater microbial alpha-diversity, different beta-diversity, and had a greater concentration of total fecal SCFAs (p-value < 0.002). Lean Ghanaians had significantly greater network density, connectivity and stability than either obese Ghanaians, or lean and obese US participants (false discovery rate (FDR) corrected p-value ≤ 0.01). Bacteroides uniformis was significantly more abundant in lean women, irrespective of country (FDR corrected p < 0.001), while lean Ghanaians had a significantly greater proportion of Ruminococcus callidus, Prevotella copri, and Escherichia coli, and smaller proportions of Lachnospiraceae, Bacteroides and Parabacteroides. Lean Ghanaians had a significantly greater abundance of predicted microbial genes that catalyzed the production of butyric acid via the fermentation of pyruvate or branched amino acids, while obese Ghanaians and US women (irrespective of BMI) had a significantly greater abundance of predicted microbial genes that encoded for enzymes associated with the fermentation of amino acids such as alanine, aspartate, lysine and glutamate. Similar to lean Ghanaian women, mice humanized with stool from the lean Ghanaian participant had a significantly lower abundance of family Lachnospiraceae and genus Bacteroides and Parabacteroides, and were resistant to obesity following 6 weeks of high fat feeding (p-value < 0.01). Obesity-resistant mice also showed increased intestinal transcriptional expression of the free fatty acid (Ffa) receptor Ffa2, in spite of similar fecal SCFA concentrations. We demonstrate that the association between obesity resistance and increased predicted ecological connectivity and stability of the lean Ghanaian microbiota, as well as increased local SCFA receptor level, provides evidence of the importance of robust gut ecologic network in obesity.

    Integrative modelling of cellular assemblies

    A wide variety of experimental techniques can be used for understanding the precise molecular mechanisms underlying the activities of cellular assemblies. The inherent limitations of a single experimental technique often require integration of data from complementary approaches to gain sufficient insights into the assembly structure and function. Here, we review popular computational approaches for integrative modelling of cellular assemblies, including protein complexes and genomic assemblies. We provide recent examples of integrative models generated for such assemblies by different experimental techniques, especially including data from 3D electron microscopy (3D-EM) and chromosome conformation capture experiments, respectively. We highlight general concepts in integrative modelling and discuss the need for careful formulation and merging of different types of information.

    The role of salt bridges, charge density, and subunit flexibility in determining disassembly routes of protein complexes

    Mass spectrometry can be used to characterize multiprotein complexes, defining their subunit stoichiometry and composition following solution disruption and collision-induced dissociation (CID). While CID of protein complexes in the gas phase typically results in the dissociation of unfolded subunits, a second atypical route is possible wherein compact subunits or subcomplexes are ejected without unfolding. Because tertiary structure and subunit interactions may be retained, this is the preferred route for structural investigations. How can we influence which pathway is adopted? By studying properties of a series of homomeric and heteromeric protein complexes and varying their overall charge in solution, we found that low subunit flexibility, higher charge densities, fewer salt bridges, and smaller interfaces are likely to be involved in promoting dissociation routes without unfolding. Manipulating the charge on a protein complex therefore enables us to direct dissociation through structurally informative pathways that mimic those followed in solution

    Small molecule-mediated targeting of microRNAs for drug discovery: experiments, computational techniques, and disease implications

    Small molecules have been providing medical breakthroughs for human diseases for more than a century. Recently, identifying small molecule inhibitors that target microRNAs (miRNAs) has gained importance, despite the challenges posed by labour-intensive screening experiments and the significant efforts required for medicinal chemistry optimization. Numerous experimentally-verified cases have demonstrated the potential of miRNA-targeted small molecule inhibitors for disease treatment. This new approach is grounded in their posttranscriptional regulation of the expression of disease-associated genes. Reversing dysregulated gene expression using this mechanism may help control dysfunctional pathways. Furthermore, the ongoing improvement of algorithms has allowed for the integration of computational strategies built on top of laboratory-based data, facilitating a more precise and rational design and discovery of lead compounds. To complement the use of extensive pharmacogenomics data in prioritising potential drugs, our previous work introduced a computational approach based on only molecular sequences. Moreover, various computational tools for predicting molecular interactions in biological networks using similarity-based inference techniques have been accumulated in established studies. However, there are a limited number of comprehensive reviews covering both computational and experimental drug discovery processes. In this review, we outline a cohesive overview of both biological and computational applications in miRNA-targeted drug discovery, along with their disease implications and clinical significance. Finally, utilizing drug-target interaction (DTI) data from DrugBank, we showcase the effectiveness of deep learning for obtaining the physicochemical characterization of DTIs.

    Detection of regulator genes and eQTLs in gene networks

    Genetic differences between individuals associated with quantitative phenotypic traits, including disease states, are usually found in non-coding genomic regions. These genetic variants are often also associated with differences in expression levels of nearby genes (they are "expression quantitative trait loci" or eQTLs for short) and presumably play a gene regulatory role, affecting the status of molecular networks of interacting genes, proteins and metabolites. Computational systems biology approaches to reconstruct causal gene networks from large-scale omics data have therefore become essential to understand the structure of networks controlled by eQTLs together with other regulatory genes, and to generate detailed hypotheses about the molecular mechanisms that lead from genotype to phenotype. Here we review the main analytical methods and software to identify eQTLs and their associated genes, to reconstruct co-expression networks and modules, to reconstruct causal Bayesian gene and module networks, and to validate predicted networks in silico. Comment: minor revision with typos corrected; review article; 24 pages, 2 figures