156 research outputs found

    Probabilistic Bag-Of-Hyperlinks Model for Entity Linking

    Full text link
    Many fundamental problems in natural language processing rely on determining what entities appear in a given text. Commonly referenced as entity linking, this step is a fundamental component of many NLP tasks such as text understanding, automatic summarization, semantic search or machine translation. Name ambiguity, word polysemy, context dependencies and a heavy-tailed distribution of entities contribute to the complexity of this problem. We here propose a probabilistic approach that makes use of an effective graphical model to perform collective entity disambiguation. Input mentions (i.e.,~linkable token spans) are disambiguated jointly across an entire document by combining a document-level prior of entity co-occurrences with local information captured from mentions and their surrounding context. The model is based on simple sufficient statistics extracted from data, thus relying on few parameters to be learned. Our method does not require extensive feature engineering, nor an expensive training procedure. We use loopy belief propagation to perform approximate inference. The low complexity of our model makes this step sufficiently fast for real-time usage. We demonstrate the accuracy of our approach on a wide range of benchmark datasets, showing that it matches, and in many cases outperforms, existing state-of-the-art methods

    Role of the Tracy-Widom distribution in the finite-size fluctuations of the critical temperature of the Sherrington-Kirkpatrick spin glass

    Full text link
    We investigate the finite-size fluctuations due to quenched disorder of the critical temperature of the Sherrington-Kirkpatrick spin glass. In order to accomplish this task, we perform a finite-size analysis of the spectrum of the susceptibility matrix obtained via the Plefka expansion. By exploiting results from random matrix theory, we obtain that the fluctuations of the critical temperature are described by the Tracy-Widom distribution with a non-trivial scaling exponent 2/3

    Inference by replication in densely connected systems

    Get PDF
    An efficient Bayesian inference method for problems that can be mapped onto dense graphs is presented. The approach is based on message passing where messages are averaged over a large number of replicated variable systems exposed to the same evidential nodes. An assumption about the symmetry of the solutions is required for carrying out the averages; here we extend the previous derivation based on a replica symmetric (RS) like structure to include a more complex one-step replica symmetry breaking (1RSB)-like ansatz. To demonstrate the potential of the approach it is employed for studying critical properties of the Ising linear perceptron and for multiuser detection in Code Division Multiple Access (CDMA) under different noise models. Results obtained under the RS assumption in the non-critical regime give rise to a highly efficient signal detection algorithm in the context of CDMA; while in the critical regime one observes a first order transition line that ends in a continuous phase transition point. Finite size effects are also observed. While the 1RSB ansatz is not required for the original problems, it was applied to the CDMA signal detection problem with a more complex noise model that exhibits RSB behaviour, resulting in an improvement in performance.Comment: 47 pages, 7 figure

    Intrinsic limitations of inverse inference in the pairwise Ising spin glass

    Full text link
    We analyze the limits inherent to the inverse reconstruction of a pairwise Ising spin glass based on susceptibility propagation. We establish the conditions under which the susceptibility propagation algorithm is able to reconstruct the characteristics of the network given first- and second-order local observables, evaluate eventual errors due to various types of noise in the originally observed data, and discuss the scaling of the problem with the number of degrees of freedom

    Belief-propagation algorithm and the Ising model on networks with arbitrary distributions of motifs

    Full text link
    We generalize the belief-propagation algorithm to sparse random networks with arbitrary distributions of motifs (triangles, loops, etc.). Each vertex in these networks belongs to a given set of motifs (generalization of the configuration model). These networks can be treated as sparse uncorrelated hypergraphs in which hyperedges represent motifs. Here a hypergraph is a generalization of a graph, where a hyperedge can connect any number of vertices. These uncorrelated hypergraphs are tree-like (hypertrees), which crucially simplify the problem and allow us to apply the belief-propagation algorithm to these loopy networks with arbitrary motifs. As natural examples, we consider motifs in the form of finite loops and cliques. We apply the belief-propagation algorithm to the ferromagnetic Ising model on the resulting random networks. We obtain an exact solution of this model on networks with finite loops or cliques as motifs. We find an exact critical temperature of the ferromagnetic phase transition and demonstrate that with increasing the clustering coefficient and the loop size, the critical temperature increases compared to ordinary tree-like complex networks. Our solution also gives the birth point of the giant connected component in these loopy networks.Comment: 9 pages, 4 figure

    Bethe-Peierls approximation and the inverse Ising model

    Full text link
    We apply the Bethe-Peierls approximation to the problem of the inverse Ising model and show how the linear response relation leads to a simple method to reconstruct couplings and fields of the Ising model. This reconstruction is exact on tree graphs, yet its computational expense is comparable to other mean-field methods. We compare the performance of this method to the independent-pair, naive mean- field, Thouless-Anderson-Palmer approximations, the Sessak-Monasson expansion, and susceptibility propagation in the Cayley tree, SK-model and random graph with fixed connectivity. At low temperatures, Bethe reconstruction outperforms all these methods, while at high temperatures it is comparable to the best method available so far (Sessak-Monasson). The relationship between Bethe reconstruction and other mean- field methods is discussed

    Statistical Mechanics of maximal independent sets

    Full text link
    The graph theoretic concept of maximal independent set arises in several practical problems in computer science as well as in game theory. A maximal independent set is defined by the set of occupied nodes that satisfy some packing and covering constraints. It is known that finding minimum and maximum-density maximal independent sets are hard optimization problems. In this paper, we use cavity method of statistical physics and Monte Carlo simulations to study the corresponding constraint satisfaction problem on random graphs. We obtain the entropy of maximal independent sets within the replica symmetric and one-step replica symmetry breaking frameworks, shedding light on the metric structure of the landscape of solutions and suggesting a class of possible algorithms. This is of particular relevance for the application to the study of strategic interactions in social and economic networks, where maximal independent sets correspond to pure Nash equilibria of a graphical game of public goods allocation

    Improved message passing for inference in densely connected systems

    Get PDF
    An improved inference method for densely connected systems is presented. The approach is based on passing condensed messages between variables, representing macroscopic averages of microscopic messages. We extend previous work that showed promising results in cases where the solution space is contiguous to cases where fragmentation occurs. We apply the method to the signal detection problem of Code Division Multiple Access (CDMA) for demonstrating its potential. A highly efficient practical algorithm is also derived on the basis of insight gained from the analysis

    The number of matchings in random graphs

    Full text link
    We study matchings on sparse random graphs by means of the cavity method. We first show how the method reproduces several known results about maximum and perfect matchings in regular and Erdos-Renyi random graphs. Our main new result is the computation of the entropy, i.e. the leading order of the logarithm of the number of solutions, of matchings with a given size. We derive both an algorithm to compute this entropy for an arbitrary graph with a girth that diverges in the large size limit, and an analytic result for the entropy in regular and Erdos-Renyi random graph ensembles.Comment: 17 pages, 6 figures, to be published in Journal of Statistical Mechanic
    • …
    corecore