    Community Detection on Evolving Graphs

    Clustering is a fundamental step in many information-retrieval and data-mining applications. Detecting clusters in graphs is also a key tool for finding the community structure in social and behavioral networks. In many of these applications, the input graph evolves over time in a continual and decentralized manner, and, to maintain a good clustering, the clustering algorithm needs to repeatedly probe the graph. Furthermore, there are often limitations on the frequency of such probes, either imposed explicitly by the online platform (e.g., in the case of crawling proprietary social networks like Twitter) or implicitly because of resource limitations (e.g., in the case of crawling the web). In this paper, we study a model of clustering on evolving graphs that captures this aspect of the problem. Our model is based on the classical stochastic block model, which has been used to rigorously assess the quality of various static clustering methods. In our model, the algorithm is supposed to reconstruct the planted clustering, given the ability to query for small pieces of local information about the graph at a limited rate. We design and analyze clustering algorithms that work in this model, and show asymptotically tight upper and lower bounds on their accuracy. Finally, we perform simulations, which demonstrate that our main asymptotic results also hold in practice.
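For readers unfamiliar with the stochastic block model mentioned in the abstract, the following is a minimal illustrative sketch of the static, planted-partition variant only. It is not the paper's evolving-graph model or its algorithms. The function names (`sample_sbm`, `intra_fraction`), the round-robin cluster assignment, and the parameters `p_in` and `q_out` are assumptions made for this example.

```python
import random


def sample_sbm(n, k, p_in, q_out, seed=0):
    """Sample an undirected graph from a planted-partition block model.

    Assumption for illustration: node i is planted in cluster i % k.
    Each pair of nodes is joined independently with probability p_in
    when both nodes share a cluster, and q_out otherwise.
    """
    rng = random.Random(seed)
    labels = [i % k for i in range(n)]
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            prob = p_in if labels[i] == labels[j] else q_out
            if rng.random() < prob:
                adj[i].add(j)
                adj[j].add(i)
    return labels, adj


def intra_fraction(labels, adj):
    """Return the fraction of edges whose endpoints share a planted cluster."""
    total = 0
    intra = 0
    for i, neighbours in adj.items():
        for j in neighbours:
            if i < j:  # count each undirected edge once
                total += 1
                intra += labels[i] == labels[j]
    return intra / total if total else 0.0
```

When `p_in` is well above `q_out`, most edges fall inside planted clusters, which is what makes the planted clustering recoverable in principle from the graph alone.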

    Probabilistic models for protein conformational changes

    Proteins are macromolecules that perform multiple functions. They are not rigid molecules; instead, they can change their conformation to perform critical tasks, driven by the binding of small ligands, by assembly into large macromolecular complexes, or by physiological factors. Characterizing protein conformational change and analyzing the transition pathways between conformational states are essential tasks for computational biology. Here we propose probabilistic models to characterize protein conformational change. The first model decomposes protein structure into rigid bodies, whereas the second is a probabilistic network model for the transitions between conformational states. Our first model is a generative process that uses Gaussian mixture models to represent rigid domains, which generate the input structures through spatial transformations. To estimate the model parameters, we use two approaches: a deterministic expectation-maximization algorithm and a stochastic Gibbs sampler. The second model takes an elastic approach that broadens the range of applications of our method. It uses anharmonic springs, defined on molecular distances, that are allowed to break in a stochastic fashion. The functional form of the spring potential is inferred from a statistical analysis of a database of large-scale conformational changes in proteins. In addition, we deploy our model as a web service and deposit a precomputed dataset of rigid domains and a selected dataset of conformational pathways between conformational states. Finally, we employ graph-based algorithms to address the problem with a model-free solution. This work is not limited to biological applications; it can also be applied to robotics and computer vision.
    This thesis is based on the following publications and manuscripts:
    • Thach Nguyen, Michael Habeck. A probabilistic model for detecting rigid domains in protein structures. Bioinformatics, Volume 32, Issue 17, 1 September 2016, Pages i710–i717. https://doi.org/10.1093/bioinformatics/btw442
    • Habeck M, Nguyen T. A probabilistic network model for structural transitions in biomolecules. Proteins. 2018;86:634–643. https://doi.org/10.1002/prot.25490
    • Linh Dang, Thach Nguyen, Michael Habeck, and Stephan Waack. A graph-based algorithm for detecting rigid domains in protein structures. Submitted.
    • Thach Nguyen, Christian Böhm, Michael Habeck. A computational web server for segmenting protein structure into rigid bodies. In preparation.