Probabilistic Bag-Of-Hyperlinks Model for Entity Linking
Many fundamental problems in natural language processing rely on determining
what entities appear in a given text. Commonly referred to as entity linking,
this step is a fundamental component of many NLP tasks such as text
understanding, automatic summarization, semantic search or machine translation.
Name ambiguity, word polysemy, context dependencies and a heavy-tailed
distribution of entities contribute to the complexity of this problem.
Here we propose a probabilistic approach that uses an effective
graphical model to perform collective entity disambiguation. Input mentions
(i.e., linkable token spans) are disambiguated jointly across an entire
document by combining a document-level prior of entity co-occurrences with
local information captured from mentions and their surrounding context. The
model is based on simple sufficient statistics extracted from data, thus
relying on few parameters to be learned.
Our method does not require extensive feature engineering, nor an expensive
training procedure. We use loopy belief propagation to perform approximate
inference. The low complexity of our model makes this step sufficiently fast
for real-time usage. We demonstrate the accuracy of our approach on a wide
range of benchmark datasets, showing that it matches, and in many cases
outperforms, existing state-of-the-art methods.
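The joint disambiguation step described above can be illustrated with a minimal sketch of sum-product loopy belief propagation on a fully connected pairwise model. This is not the authors' implementation: the function name `loopy_bp`, the representation of local scores as `unary` and of co-occurrence scores as `pairwise`, and the fixed iteration count are all assumptions made for illustration.

```python
import numpy as np

def loopy_bp(unary, pairwise, n_iters=20):
    """Sum-product loopy BP on a fully connected pairwise model.

    unary[i]       : 1-D float array, local score of each candidate of mention i
    pairwise[i][j] : 2-D array (candidates of i x candidates of j), for i != j
    Returns one normalized approximate marginal per mention.
    All names and the data layout are hypothetical.
    """
    n = len(unary)
    # msgs[(i, j)] is the message sent from mention i to mention j
    msgs = {(i, j): np.ones(len(unary[j])) / len(unary[j])
            for i in range(n) for j in range(n) if i != j}
    for _ in range(n_iters):
        new = {}
        for (i, j) in msgs:
            # local evidence times all incoming messages except the one from j
            belief = np.asarray(unary[i], dtype=float).copy()
            for k in range(n):
                if k != i and k != j:
                    belief = belief * msgs[(k, i)]
            # marginalize over the candidates of mention i
            m = np.asarray(pairwise[i][j], dtype=float).T @ belief
            new[(i, j)] = m / m.sum()
        msgs = new
    marginals = []
    for i in range(n):
        b = np.asarray(unary[i], dtype=float).copy()
        for k in range(n):
            if k != i:
                b = b * msgs[(k, i)]
        marginals.append(b / b.sum())
    return marginals
```

With only two mentions the graph has no loops, so the returned marginals are exact; with more mentions they are approximations, as is usual for loopy belief propagation.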
Role of the Tracy-Widom distribution in the finite-size fluctuations of the critical temperature of the Sherrington-Kirkpatrick spin glass
We investigate the finite-size fluctuations due to quenched disorder of the
critical temperature of the Sherrington-Kirkpatrick spin glass. In order to
accomplish this task, we perform a finite-size analysis of the spectrum of the
susceptibility matrix obtained via the Plefka expansion. By exploiting results
from random matrix theory, we find that the fluctuations of the critical
temperature are described by the Tracy-Widom distribution with a non-trivial
scaling exponent 2/3.
Inference by replication in densely connected systems
An efficient Bayesian inference method for problems that can be mapped onto
dense graphs is presented. The approach is based on message passing where
messages are averaged over a large number of replicated variable systems
exposed to the same evidential nodes. An assumption about the symmetry of the
solutions is required for carrying out the averages; here we extend the
previous derivation based on a replica symmetric (RS) like structure to include
a more complex one-step replica symmetry breaking (1RSB)-like ansatz. To
demonstrate the potential of the approach it is employed for studying critical
properties of the Ising linear perceptron and for multiuser detection in Code
Division Multiple Access (CDMA) under different noise models. Results obtained
under the RS assumption in the non-critical regime give rise to a highly
efficient signal detection algorithm in the context of CDMA; while in the
critical regime one observes a first order transition line that ends in a
continuous phase transition point. Finite size effects are also observed. While
the 1RSB ansatz is not required for the original problems, it was applied to
the CDMA signal detection problem with a more complex noise model that exhibits
RSB behaviour, resulting in an improvement in performance.
Comment: 47 pages, 7 figures
Intrinsic limitations of inverse inference in the pairwise Ising spin glass
We analyze the limits inherent to the inverse reconstruction of a pairwise
Ising spin glass based on susceptibility propagation. We establish the
conditions under which the susceptibility propagation algorithm is able to
reconstruct the characteristics of the network given first- and second-order
local observables, evaluate possible errors due to various types of noise in
the originally observed data, and discuss the scaling of the problem with the
number of degrees of freedom.
Belief-propagation algorithm and the Ising model on networks with arbitrary distributions of motifs
We generalize the belief-propagation algorithm to sparse random networks with
arbitrary distributions of motifs (triangles, loops, etc.). Each vertex in
these networks belongs to a given set of motifs (generalization of the
configuration model). These networks can be treated as sparse uncorrelated
hypergraphs in which hyperedges represent motifs. Here a hypergraph is a
generalization of a graph, where a hyperedge can connect any number of
vertices. These uncorrelated hypergraphs are tree-like (hypertrees), which
crucially simplifies the problem and allows us to apply the belief-propagation
algorithm to these loopy networks with arbitrary motifs. As natural examples,
we consider motifs in the form of finite loops and cliques. We apply the
belief-propagation algorithm to the ferromagnetic Ising model on the resulting
random networks. We obtain an exact solution of this model on networks with
finite loops or cliques as motifs. We find the exact critical temperature of the
ferromagnetic phase transition and demonstrate that, as the clustering
coefficient and the loop size increase, the critical temperature increases
compared to ordinary tree-like complex networks. Our solution also gives the
birth point of the giant connected component in these loopy networks.
Comment: 9 pages, 4 figures
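For orientation, the "ordinary tree-like" baseline mentioned in the abstract can be sketched for the simplest case, a random regular graph with degree `q` and uniform coupling `J`, where belief propagation reduces to a single cavity-field recursion. This covers only the loop-free baseline, not the motif generalization of the paper; the function names, the initial field, and the iteration count are assumptions.

```python
import math

def bethe_critical_temperature(q, J=1.0):
    """Ferromagnetic critical temperature on a tree-like graph of degree q.

    Follows from the condition (q - 1) * tanh(J / T_c) = 1.
    """
    return J / math.atanh(1.0 / (q - 1))

def cavity_magnetization(q, T, J=1.0, n_iters=2000):
    """Iterate the zero-field cavity recursion and return the magnetization."""
    beta = 1.0 / T
    h = 0.1  # small symmetry-breaking initial cavity field (assumption)
    for _ in range(n_iters):
        # message passed along one edge, given the cavity field h
        u = math.atanh(math.tanh(beta * J) * math.tanh(beta * h)) / beta
        # cavity field collects messages from the q - 1 other neighbours
        h = (q - 1) * u
    # the full local field collects messages from all q neighbours
    u = math.atanh(math.tanh(beta * J) * math.tanh(beta * h)) / beta
    return math.tanh(beta * q * u)
```

Below the temperature returned by `bethe_critical_temperature` the recursion settles on a non-zero field; above it the field decays to zero.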
Bethe-Peierls approximation and the inverse Ising model
We apply the Bethe-Peierls approximation to the problem of the inverse Ising
model and show how the linear response relation leads to a simple method to
reconstruct couplings and fields of the Ising model. This reconstruction is
exact on tree graphs, yet its computational expense is comparable to other
mean-field methods. We compare the performance of this method to the
independent-pair, naive mean-field, and Thouless-Anderson-Palmer approximations,
the Sessak-Monasson expansion, and susceptibility propagation on the Cayley
tree, the SK model, and random graphs with fixed connectivity. At low temperatures,
Bethe reconstruction outperforms all these methods, while at high temperatures
it is comparable to the best method available so far (Sessak-Monasson). The
relationship between Bethe reconstruction and other mean-field methods is
discussed.
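As a point of reference, the naive mean-field baseline named in the abstract has a compact closed form: off-diagonal couplings are estimated as minus the inverse of the connected correlation matrix. The sketch below shows that baseline only, not the Bethe reconstruction itself; the function names and the samples-by-spins layout of the input are assumptions.

```python
import numpy as np

def connected_correlations(samples):
    """Magnetizations m_i and connected correlations C_ij from +/-1 samples.

    samples: array of shape (n_samples, n_spins); this layout is an assumption.
    """
    s = np.asarray(samples, dtype=float)
    m = s.mean(axis=0)
    C = s.T @ s / s.shape[0] - np.outer(m, m)
    return m, C

def naive_mean_field_couplings(C):
    """Naive mean-field estimate J_ij = -(C^{-1})_ij for i != j."""
    C = np.asarray(C, dtype=float)
    J = -np.linalg.inv(C)
    np.fill_diagonal(J, 0.0)  # self-couplings are not part of the model
    return J
```

The matrix inversion requires `C` to be non-singular, which in practice means using enough samples that no spin is frozen.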
Statistical Mechanics of maximal independent sets
The graph theoretic concept of maximal independent set arises in several
practical problems in computer science as well as in game theory. A maximal
independent set is defined by the set of occupied nodes that satisfy some
packing and covering constraints. It is known that finding minimum and
maximum-density maximal independent sets are hard optimization problems. In
this paper, we use the cavity method of statistical physics and Monte Carlo
simulations to study the corresponding constraint satisfaction problem on
random graphs. We obtain the entropy of maximal independent sets within the
replica symmetric and one-step replica symmetry breaking frameworks, shedding
light on the metric structure of the landscape of solutions and suggesting a
class of possible algorithms. This is of particular relevance for the
application to the study of strategic interactions in social and economic
networks, where maximal independent sets correspond to pure Nash equilibria of
a graphical game of public goods allocation.
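The packing and covering constraints mentioned above can be made concrete with a small brute-force check, usable only on very small graphs. This illustrates the definition and is unrelated to the cavity-method analysis in the paper; the function names are hypothetical.

```python
from itertools import combinations

def is_maximal_independent_set(nodes, edges, occupied):
    """Check the packing and covering constraints for a set of occupied nodes."""
    occupied = set(occupied)
    neighbours = {v: set() for v in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # packing: no two occupied nodes are adjacent
    packing = all(not (neighbours[v] & occupied) for v in occupied)
    # covering: every empty node has at least one occupied neighbour
    covering = all(neighbours[v] & occupied
                   for v in nodes if v not in occupied)
    return packing and covering

def enumerate_maximal_independent_sets(nodes, edges):
    """Brute-force enumeration over all subsets (exponential in graph size)."""
    nodes = list(nodes)
    found = []
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            if is_maximal_independent_set(nodes, edges, subset):
                found.append(set(subset))
    return found
```

On the three-node path 0-1-2, for example, the only maximal independent sets are {1} and {0, 2}, which already shows that such sets can differ in density.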
Improved message passing for inference in densely connected systems
An improved inference method for densely connected systems is presented. The
approach is based on passing condensed messages between variables, representing
macroscopic averages of microscopic messages. We extend previous work that
showed promising results in cases where the solution space is contiguous to
cases where fragmentation occurs. We apply the method to the signal detection
problem of Code Division Multiple Access (CDMA) to demonstrate its
potential. A highly efficient practical algorithm is also derived on the basis
of insight gained from the analysis.
The number of matchings in random graphs
We study matchings on sparse random graphs by means of the cavity method. We
first show how the method reproduces several known results about maximum and
perfect matchings in regular and Erdős-Rényi random graphs. Our main new result
is the computation of the entropy, i.e. the leading order of the logarithm of
the number of solutions, of matchings with a given size. We derive both an
algorithm to compute this entropy for an arbitrary graph with a girth that
diverges in the large size limit, and an analytic result for the entropy in
regular and Erdős-Rényi random graph ensembles.
Comment: 17 pages, 6 figures, to be published in Journal of Statistical
Mechanics
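The quantity whose logarithm defines the entropy above, the number of matchings of a given size, can be computed exactly on very small graphs by exhaustive recursion. The sketch below only illustrates what is being counted; it is not the cavity-method algorithm of the paper, and the function name is hypothetical.

```python
def count_matchings_by_size(edges):
    """Return {size: number of matchings containing exactly that many edges}.

    Exhaustive recursion over edges; the cost is exponential in the edge count.
    """
    edges = list(edges)
    counts = {}

    def recurse(index, used, size):
        if index == len(edges):
            counts[size] = counts.get(size, 0) + 1
            return
        # option 1: leave the current edge out of the matching
        recurse(index + 1, used, size)
        # option 2: include it if both endpoints are still unmatched
        a, b = edges[index]
        if a not in used and b not in used:
            recurse(index + 1, used | {a, b}, size + 1)

    recurse(0, frozenset(), 0)
    return counts
```

For the four-node path with edges (0, 1), (1, 2), (2, 3), this gives one empty matching, three matchings of size one, and one of size two.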