
    The Role of Normalization in the Belief Propagation Algorithm

    An important class of problems in statistical physics and computer science can be expressed as the computation of marginal probabilities over a Markov random field. The belief propagation algorithm, which is an exact procedure for computing these marginals when the underlying graph is a tree, has gained popularity as an efficient way to approximate them in the more general case. In this paper, we focus on an aspect of the algorithm that has received little attention in the literature: the effect of the normalization of the messages. We show in particular that, for a large class of normalization strategies, it is possible to focus on belief convergence only. Following this, we express the necessary and sufficient conditions for local stability of a fixed point in terms of the graph structure and the belief values at the fixed point. We also make explicit some connections between the normalization constants and the underlying Bethe free energy.
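
    As a minimal illustration of the message normalization the abstract discusses, the sketch below updates a sum-product message on a pairwise Markov random field and divides it by an explicit constant z. The potentials, variable names, and the sum-to-one normalization strategy are assumptions for illustration, not the paper's notation; the point is only that rescaling messages leaves the normalized beliefs unchanged.

    ```python
    # A minimal sketch, assuming a pairwise MRF with (q, q) potentials psi_ij
    # and (q,) local fields phi_i stored as numpy arrays.
    import numpy as np

    def update_message(psi_ij, phi_i, in_msgs):
        """Sum-product message from variable i to neighbor j on edge (i, j).

        in_msgs: messages into i from all neighbors other than j.
        """
        prod = phi_i.copy()
        for m in in_msgs:
            prod *= m
        msg = psi_ij.T @ prod     # sum over the states of i
        z = msg.sum()             # normalization constant
        return msg / z, z         # beliefs are invariant to the choice of z

    def belief(phi_i, in_msgs):
        """Normalized single-variable belief from all incoming messages."""
        b = phi_i.copy()
        for m in in_msgs:
            b *= m
        return b / b.sum()
    ```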

    Local stability of Belief Propagation algorithm with multiple fixed points

    A number of problems in statistical physics and computer science can be expressed as the computation of marginal probabilities over a Markov random field. Belief propagation, an iterative message-passing algorithm, computes such marginals exactly when the underlying graph is a tree, but it has gained popularity as an efficient way to approximate them in the more general case, even though it can exhibit multiple fixed points and is not guaranteed to converge. In this paper, we express a new sufficient condition for local stability of a belief propagation fixed point in terms of the graph structure and the belief values at the fixed point. This gives credence to the usual understanding that belief propagation performs better on sparse graphs. Comment: arXiv admin note: substantial text overlap with arXiv:1101.417
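
    The paper's condition is analytic, but the underlying notion can be probed numerically: a fixed point m* of the message-update map F is locally stable when the spectral radius of the Jacobian of F at m* is below one. The sketch below estimates that radius by finite differences; `bp_update` (one full sweep over a flattened message vector) is an assumed helper, not code from the paper.

    ```python
    # A hedged numerical check, not the paper's analytic criterion.
    import numpy as np

    def spectral_radius_at_fixed_point(bp_update, m_star, eps=1e-6):
        """Finite-difference Jacobian of the BP map at a fixed point m_star."""
        n = m_star.size
        f0 = bp_update(m_star)
        J = np.empty((n, n))
        for k in range(n):
            d = np.zeros(n)
            d[k] = eps
            J[:, k] = (bp_update(m_star + d) - f0) / eps
        return max(abs(np.linalg.eigvals(J)))  # locally stable if < 1
    ```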

    Belief-propagation algorithm and the Ising model on networks with arbitrary distributions of motifs

    We generalize the belief-propagation algorithm to sparse random networks with arbitrary distributions of motifs (triangles, loops, etc.). Each vertex in these networks belongs to a given set of motifs (a generalization of the configuration model). These networks can be treated as sparse uncorrelated hypergraphs in which hyperedges represent motifs. Here a hypergraph is a generalization of a graph in which a hyperedge can connect any number of vertices. These uncorrelated hypergraphs are tree-like (hypertrees), which crucially simplifies the problem and allows us to apply the belief-propagation algorithm to these loopy networks with arbitrary motifs. As natural examples, we consider motifs in the form of finite loops and cliques. We apply the belief-propagation algorithm to the ferromagnetic Ising model on the resulting random networks. We obtain an exact solution of this model on networks with finite loops or cliques as motifs. We find the exact critical temperature of the ferromagnetic phase transition and demonstrate that, as the clustering coefficient and the loop size increase, the critical temperature rises compared to ordinary tree-like complex networks. Our solution also gives the birth point of the giant connected component in these loopy networks. Comment: 9 pages, 4 figures
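
    For contrast with the hypergraph generalization, the sketch below runs ordinary BP for the zero-field ferromagnetic Ising model on a tree-like graph, in the standard cavity-field form. The adjacency-dict layout, initialization, and sweep count are illustrative assumptions, not the paper's setup.

    ```python
    # A sketch of ordinary tree-like BP (the baseline the paper generalizes),
    # assuming `adj` maps each vertex to a list of its neighbors.
    import numpy as np

    def ising_bp(adj, beta, J=1.0, sweeps=200):
        # cavity fields u[(i, j)], with a small bias to break up/down symmetry
        u = {(i, j): 0.01 for i in adj for j in adj[i]}
        for _ in range(sweeps):
            for (i, j) in u:
                s = sum(u[(k, i)] for k in adj[i] if k != j)
                u[(i, j)] = np.arctanh(np.tanh(beta * J) * np.tanh(beta * s)) / beta
        # local magnetizations from the full set of incoming cavity fields
        return {i: np.tanh(beta * sum(u[(k, i)] for k in adj[i])) for i in adj}
    ```

    On an ordinary Bethe lattice with coordination number q, this recursion reproduces the known critical point tanh(beta_c J) = 1/(q - 1), the baseline against which the higher critical temperatures of the motif networks are compared.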

    Clustering by soft-constraint affinity propagation: Applications to gene-expression data

    Motivation: Similarity-measure-based clustering is a crucial problem appearing throughout scientific data analysis. Recently, a powerful new algorithm called affinity propagation (AP), based on message-passing techniques, was proposed by Frey and Dueck [Frey07]. In AP, each cluster is identified by a common exemplar to which all other data points of the same cluster refer, and exemplars have to refer to themselves. Despite its proven power, AP in its present form suffers from a number of drawbacks. The hard constraint of having exactly one exemplar per cluster restricts AP to classes of regularly shaped clusters and leads to suboptimal performance, e.g., in analyzing gene expression data. Results: This limitation can be overcome by relaxing the AP hard constraints. A new parameter controls the importance of the constraints relative to the aim of maximizing the overall similarity, and allows one to interpolate between the simple case where each data point selects its closest neighbor as an exemplar and the original AP. The resulting soft-constraint affinity propagation (SCAP) is more informative and accurate and leads to more stable clustering. Even though a new a priori free parameter is introduced, the overall dependence of the algorithm on external tuning is reduced, as robustness is increased and an optimal strategy for parameter selection emerges more naturally. SCAP is tested on biological benchmark data, including in particular microarray data related to various cancer types. We show that the algorithm efficiently unveils the hierarchical cluster structure present in the data sets. Furthermore, it allows one to extract sparse gene expression signatures for each cluster. Comment: 11 pages, supplementary material: http://isiosf.isi.it/~weigt/scap_supplement.pd
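
    For reference, the sketch below implements the original hard-constraint AP message updates of Frey and Dueck (responsibilities and availabilities) that SCAP relaxes; it is not an implementation of SCAP itself. The damping value and iteration count are illustrative, and the diagonal of the similarity matrix S is assumed to hold the exemplar preferences.

    ```python
    # Original AP updates (not SCAP), in vectorized numpy form.
    import numpy as np

    def affinity_propagation(S, iters=100, damping=0.5):
        n = S.shape[0]
        R = np.zeros((n, n))          # responsibilities r(i, k)
        A = np.zeros((n, n))          # availabilities a(i, k)
        for _ in range(iters):
            # r(i,k) <- s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
            AS = A + S
            idx = AS.argmax(axis=1)
            first = AS[np.arange(n), idx].copy()
            AS[np.arange(n), idx] = -np.inf
            second = AS.max(axis=1)
            Rnew = S - first[:, None]
            Rnew[np.arange(n), idx] = S[np.arange(n), idx] - second
            R = damping * R + (1 - damping) * Rnew
            # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
            Rp = np.maximum(R, 0)
            np.fill_diagonal(Rp, R.diagonal())
            Anew = Rp.sum(axis=0)[None, :] - Rp
            dA = Anew.diagonal().copy()   # a(k,k) is not clipped at zero
            Anew = np.minimum(Anew, 0)
            np.fill_diagonal(Anew, dA)
            A = damping * A + (1 - damping) * Anew
        return np.where((A + R).diagonal() > 0)[0]   # indices chosen as exemplars
    ```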

    Investigation of commuting Hamiltonian in quantum Markov network

    Graphical models have various applications in science and engineering, including physics, bioinformatics, and telecommunications. Using graphical models requires complex computations to evaluate marginal functions, so a number of powerful approximate methods have been developed, including the mean-field approximation and the belief propagation algorithm. Quantum graphical models have recently been developed in the context of quantum information and computation and of quantum statistical physics, made possible by the generalization of classical probability theory to quantum theory. The main goal of this paper is to present a preliminary generalization of Markov networks, a type of graphical model, to the quantum case and to apply it in quantum statistical physics. We investigate Markov networks and the role of commuting Hamiltonian terms in conditional independence, using simple examples from quantum statistical physics. Comment: 11 pages, 8 figures
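
    A toy example of the commuting-terms property the abstract refers to (the operators below are illustrative choices of ours, not the paper's examples): nearest-neighbour ZZ terms of a three-spin chain commute with each other, while an XX term does not commute with an overlapping ZZ term.

    ```python
    # Toy commutator check with explicit Pauli matrices.
    import numpy as np

    I2 = np.eye(2)
    Z = np.diag([1.0, -1.0])
    X = np.array([[0.0, 1.0], [1.0, 0.0]])

    def kron(*ops):
        out = np.array([[1.0]])
        for op in ops:
            out = np.kron(out, op)
        return out

    H12 = kron(Z, Z, I2)   # Z_1 Z_2
    H23 = kron(I2, Z, Z)   # Z_2 Z_3
    G12 = kron(X, X, I2)   # X_1 X_2

    print(np.allclose(H12 @ H23, H23 @ H12))  # True: the ZZ terms commute
    print(np.allclose(G12 @ H23, H23 @ G12))  # False: XX and ZZ overlap on spin 2
    ```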

    Detection of Early-Stage Enterprise Infection by Mining Large-Scale Log Data

    Recent years have seen the rise of more sophisticated attacks, including advanced persistent threats (APTs), which pose severe risks to organizations and governments by targeting confidential proprietary information. Additionally, new malware strains are appearing at a higher rate than ever before. Since many of these malware strains are designed to evade existing security products, the traditional defenses deployed by most enterprises today, e.g., anti-virus, firewalls, and intrusion detection systems, often fail to detect infections at an early stage. We address the problem of detecting early-stage infection in an enterprise setting by proposing a new framework based on belief propagation, inspired by graph theory. Belief propagation can be used either with "seeds" of compromised hosts or malicious domains (provided by the enterprise security operation center, or SOC) or without any seeds. In the latter case we develop a detector of C&C communication particularly tailored to enterprises, which can detect a stealthy compromise of only a single host communicating with the C&C server. We demonstrate that our techniques perform well at detecting enterprise infections. We achieve high accuracy with low false positive and false negative rates on two months of anonymized DNS logs released by Los Alamos National Lab (LANL), which include APT infection attacks simulated by LANL domain experts. We also apply our algorithms to 38 TB of real-world web proxy logs collected at the border of a large enterprise. Through careful manual investigation in collaboration with the enterprise SOC, we show that our techniques identified hundreds of malicious domains overlooked by state-of-the-art security products.
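
    The sketch below conveys the seeded mode in a much-simplified form: suspicion scores diffuse over the bipartite host-domain graph built from the logs. The edge-list input, damping factor alpha, and iteration count are illustrative assumptions; the paper's actual detector (and its seedless C&C mode) is considerably more elaborate.

    ```python
    # A simplified score-propagation sketch, not the paper's exact algorithm.
    from collections import defaultdict

    def propagate_scores(edges, seed_domains, iters=10, alpha=0.5):
        """edges: (host, domain) pairs from DNS or proxy logs."""
        nbrs = defaultdict(set)
        for h, d in edges:
            nbrs[h].add(d)
            nbrs[d].add(h)
        score = {v: 1.0 if v in seed_domains else 0.0 for v in nbrs}
        for _ in range(iters):
            new = {}
            for v in nbrs:
                avg = sum(score[u] for u in nbrs[v]) / len(nbrs[v])
                new[v] = 1.0 if v in seed_domains else alpha * avg + (1 - alpha) * score[v]
            score = new
        return score   # rank hosts and domains by score for analyst triage
    ```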

    Shaping the learning landscape in neural networks around wide flat minima

    Learning in deep neural networks (DNNs) takes place by minimizing a non-convex, high-dimensional loss function, typically by a stochastic gradient descent (SGD) strategy. The learning process is observed to find good minimizers without getting stuck in local critical points, and such minimizers are often satisfactory at avoiding overfitting. How these two features can be kept under control in nonlinear devices composed of millions of tunable connections is a profound and far-reaching open question. In this paper we study basic non-convex one- and two-layer neural network models which learn random patterns, and derive a number of basic geometrical and algorithmic features which suggest some answers. We first show that the error loss function presents few extremely wide flat minima (WFM), which coexist with narrower minima and critical points. We then show that the minimizers of the cross-entropy loss function overlap with the WFM of the error loss. We also show examples of learning devices for which WFM do not exist. From the algorithmic perspective, we derive entropy-driven greedy and message-passing algorithms which focus their search on wide flat regions of minimizers. In the case of SGD with cross-entropy loss, we show that a slow reduction of the norm of the weights along the learning process also leads to WFM. We corroborate the results by a numerical study of the correlations between the volumes of the minimizers, their Hessians and their generalization performance on real data. Comment: 37 pages (16 main text), 10 figures (7 main text)
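
    One way to make the notion of width concrete, under assumptions of our own rather than the paper's protocol: probe how fast the training error grows as the weights are randomly perturbed away from a minimizer. A wide flat minimum keeps its error low over a large perturbation radius. `model_error(w, data)` is an assumed black-box evaluation function.

    ```python
    # Illustrative flatness probe around a minimizer w_star.
    import numpy as np

    def flatness_profile(w_star, model_error, data, radii, samples=20, rng=None):
        """Mean error under random perturbations of each given norm."""
        rng = rng if rng is not None else np.random.default_rng(0)
        profile = []
        for r in radii:
            errs = []
            for _ in range(samples):
                d = rng.normal(size=w_star.shape)
                d *= r / np.linalg.norm(d)          # scale perturbation to norm r
                errs.append(model_error(w_star + d, data))
            profile.append(float(np.mean(errs)))
        return profile                               # error as a function of radius
    ```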

    Cluster Variation Method in Statistical Physics and Probabilistic Graphical Models

    The cluster variation method (CVM) is a hierarchy of approximate variational techniques for discrete (Ising-like) models in equilibrium statistical mechanics, improving on the mean-field approximation and the Bethe-Peierls approximation, which can be regarded as the lowest level of the CVM. In recent years it has been applied both in statistical physics and to inference and optimization problems formulated in terms of probabilistic graphical models. The foundations of the CVM are briefly reviewed, and the relations with similar techniques are discussed. The main properties of the method are considered, with emphasis on its exactness for particular models and on its asymptotic properties. The problem of the minimization of the variational free energy, which arises in the CVM, is also addressed, and recent results about both provably convergent and message-passing algorithms are discussed. Comment: 36 pages, 17 figures
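
    For concreteness, the lowest CVM level corresponds to the Bethe free energy. In the standard factor-graph form (after Yedidia, Freeman and Weiss), with factor beliefs b_a, variable beliefs b_i, factor potentials f_a, and variable degrees d_i:

    ```latex
    % Bethe free energy, the lowest level of the CVM hierarchy.
    F_{\mathrm{Bethe}}[b] =
        \sum_{a}\sum_{x_a} b_a(x_a)\,\ln\frac{b_a(x_a)}{f_a(x_a)}
        \;-\; \sum_{i}(d_i - 1)\sum_{x_i} b_i(x_i)\,\ln b_i(x_i).
    ```

    Its stationary points under the local marginalization and normalization constraints correspond to fixed points of belief propagation, which is why message-passing algorithms appear naturally in the CVM minimization problem.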