Probabilistic Bag-Of-Hyperlinks Model for Entity Linking
Many fundamental problems in natural language processing rely on determining
what entities appear in a given text. Commonly referenced as entity linking,
this step is a fundamental component of many NLP tasks such as text
understanding, automatic summarization, semantic search or machine translation.
Name ambiguity, word polysemy, context dependencies and a heavy-tailed
distribution of entities contribute to the complexity of this problem.
We here propose a probabilistic approach that makes use of an effective
graphical model to perform collective entity disambiguation. Input mentions
(i.e., linkable token spans) are disambiguated jointly across an entire
document by combining a document-level prior of entity co-occurrences with
local information captured from mentions and their surrounding context. The
model is based on simple sufficient statistics extracted from data, thus
relying on few parameters to be learned.
Our method does not require extensive feature engineering, nor an expensive
training procedure. We use loopy belief propagation to perform approximate
inference. The low complexity of our model makes this step sufficiently fast
for real-time usage. We demonstrate the accuracy of our approach on a wide
range of benchmark datasets, showing that it matches, and in many cases
outperforms, existing state-of-the-art methods.
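The inference step named in this abstract, loopy belief propagation over mentions with local scores and pairwise co-occurrence terms, can be illustrated with a minimal sketch. This is not the authors' model: the function name `loopy_bp`, the damping scheme, and the potentials passed in are all assumptions chosen for illustration, and the potentials are assumed to be strictly positive.

```python
import numpy as np

def loopy_bp(unary, pairwise, n_iters=50, damping=0.5):
    """Sum-product loopy BP on a pairwise model (illustrative sketch).

    unary:    list of 1-D arrays; unary[i][a] > 0 is a hypothetical local
              score of candidate entity a for mention i.
    pairwise: dict {(i, j): 2-D array} with i < j; entry [a, b] > 0 is a
              hypothetical co-occurrence score of candidates a and b.
    Returns one normalized belief vector per mention.
    """
    unary = [np.asarray(u, dtype=float) for u in unary]
    # Store each potential in both directions: pot[(i, j)][a, b].
    pot = {}
    for (i, j), psi in pairwise.items():
        psi = np.asarray(psi, dtype=float)
        pot[(i, j)] = psi
        pot[(j, i)] = psi.T
    neighbors = {i: [] for i in range(len(unary))}
    for (i, j) in pot:
        neighbors[i].append(j)
    # msg[(i, j)] is a distribution over the candidates of mention j.
    msg = {(i, j): np.full(len(unary[j]), 1.0 / len(unary[j]))
           for (i, j) in pot}
    for _ in range(n_iters):
        new_msg = {}
        for (i, j) in pot:
            # Local score at i times all incoming messages except j's.
            h = unary[i].copy()
            for k in neighbors[i]:
                if k != j:
                    h *= msg[(k, i)]
            out = h @ pot[(i, j)]          # sum over candidates of i
            out /= out.sum()
            # Damped update to help convergence on loopy graphs.
            new_msg[(i, j)] = damping * msg[(i, j)] + (1 - damping) * out
        msg = new_msg
    beliefs = []
    for i in range(len(unary)):
        b = unary[i].copy()
        for k in neighbors[i]:
            b *= msg[(k, i)]
        beliefs.append(b / b.sum())
    return beliefs
```

On a graph without loops (for example, two mentions joined by one edge) the beliefs coincide with the exact marginals; on loopy graphs they are only approximations, which is why the sketch uses damping.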
Dynamical replica analysis of disordered Ising spin systems on finitely connected random graphs
We study the dynamics of macroscopic observables such as the magnetization
and the energy per degree of freedom in Ising spin models on random graphs of
finite connectivity, with random bonds and/or heterogeneous degree
distributions. To do so we generalize existing implementations of dynamical
replica theory and cavity field techniques to systems with strongly disordered
and locally tree-like interactions. We illustrate our results via application
to the dynamics of e.g. spin-glasses on random graphs and of the
overlap in finite connectivity Sourlas codes. All results are tested against
Monte Carlo simulations.
Comment: 4 pages, 14 .eps files
Phase Transitions and Computational Difficulty in Random Constraint Satisfaction Problems
We review the understanding of random constraint satisfaction problems,
focusing on the q-coloring of large random graphs, that has been achieved using
the cavity method of the physicists. We also discuss the properties of the
phase diagram in temperature, the connections with the glass transition
phenomenology in physics, and the related algorithmic issues.
Comment: 10 pages, Proceedings of the International Workshop on
Statistical-Mechanical Informatics 2007, Kyoto (Japan) September 16-19, 2007
Mean-Field Equations for Spin Models with Orthogonal Interaction Matrices
We study the metastable states in Ising spin models with orthogonal
interaction matrices. We focus on three realizations of this model, the random
case and two non-random cases, i.e. the fully-frustrated model on an infinite
dimensional hypercube and the so-called sine-model. We use the mean-field (or
TAP) equations which we derive by resumming the high-temperature expansion
of the Gibbs free energy. In some special non-random cases, we can find the
absolute minimum of the free energy. For the random case we compute the average
number of solutions to the TAP equations. We find that the
configurational entropy (or complexity) is extensive in the range
T_{\mbox{\tiny RSB}}. Finally we present an apparently
unrelated replica calculation which reproduces the analytical expression for
the total number of TAP solutions.
Comment: 22+3 pages, section 5 slightly modified, 1 Ref added, LaTeX and
uuencoded figures now independent of each other (easier to print). Postscript
available http://chimera.roma1.infn.it/index_papers_complex.htm
A survey on independence-based Markov networks learning
This work reports the most relevant technical aspects in the problem of
learning the Markov network structure from data. This problem has become
increasingly important in machine learning and in many of its application
fields. Markov networks, together with Bayesian networks, are
probabilistic graphical models, a widely used formalism for handling
probability distributions in intelligent systems. Learning graphical models
from data has been extensively applied for the case of Bayesian networks, but
for Markov networks learning it is not tractable in practice. However, this
situation is changing with time, given the exponential growth of computer
capacity, the plethora of available digital data, and research on new
learning technologies. This work focuses on a technology called
independence-based learning, which allows the learning of the independence
structure of those networks from data in an efficient and sound manner,
whenever the dataset is sufficiently large and the data is a representative
sample of the target distribution. In the analysis of such technology, this
work surveys the current state-of-the-art algorithms for learning Markov
network structure, discussing their current limitations, and proposing a series
of open problems where future work may produce some advances in the area in
terms of quality and efficiency. The paper concludes by opening a discussion
about how to develop a general formalism for improving the quality of the
structures learned when data is scarce.
Comment: 35 pages, 1 figure
Region graph partition function expansion and approximate free energy landscapes: Theory and some numerical results
Graphical models for finite-dimensional spin glasses and real-world
combinatorial optimization and satisfaction problems usually have an abundant
number of short loops. The cluster variation method and its extension, the
region graph method, are theoretical approaches for treating the complicated
short-loop-induced local correlations. For graphical models represented by
non-redundant or redundant region graphs, approximate free energy landscapes
are constructed in this paper through the mathematical framework of region
graph partition function expansion. Several free energy functionals are
obtained, each of which uses a set of probability distribution functions or
functionals as order parameters. These probability distribution
functions/functionals are required to satisfy the region graph
belief-propagation equation or the region graph survey-propagation equation to
ensure vanishing correction contributions of region subgraphs with dangling
edges. As a simple application of the general theory, we perform region graph
belief-propagation simulations on the square-lattice ferromagnetic Ising model
and the Edwards-Anderson model. Considerable improvements over the conventional
Bethe-Peierls approximation are achieved. Collective domains of different sizes
in the disordered and frustrated square lattice are identified by the
message-passing procedure. Such collective domains and the frustrations among
them are responsible for the low-temperature glass-like dynamical behaviors of
the system.
Comment: 30 pages, 11 figures. More discussion on redundant region graphs. To
be published by Journal of Statistical Physics
Probabilistic Reconstruction in Compressed Sensing: Algorithms, Phase Diagrams, and Threshold Achieving Matrices
Compressed sensing is a signal processing method that acquires data directly
in a compressed form. This allows one to make fewer measurements than were
considered necessary to record a signal, enabling faster or more precise
measurement protocols in a wide range of applications. Using an
interdisciplinary approach, we have recently proposed in [arXiv:1109.4424] a
strategy that allows compressed sensing to be performed at acquisition rates
approaching the theoretical optimal limits. In this paper, we give a more
thorough presentation of our approach, and introduce many new results. We
present the probabilistic approach to reconstruction and discuss its optimality
and robustness. We detail the derivation of the message passing algorithm for
reconstruction and expectation maximization learning of signal-model
parameters. We further develop the asymptotic analysis of the corresponding
phase diagrams with and without measurement noise, for different distributions
of signals, and discuss the best possible reconstruction performances
regardless of the algorithm. We also present new efficient seeding matrices,
test them on synthetic data and analyze their performance asymptotically.
Comment: 42 pages, 37 figures, 3 appendixes
Is it possible to improve residents breaking bad news skills? A randomised study assessing the efficacy of a communication skills training program
Critical phenomena in complex networks
The combination of the compactness of networks, featuring small diameters,
and their complex architectures results in a variety of critical effects
dramatically different from those in cooperative systems on lattices. In the
last few years, researchers have made important steps toward understanding the
qualitatively new critical phenomena in complex networks. We review the
results, concepts, and methods of this rapidly developing field. Here we mostly
consider two closely related classes of these critical phenomena, namely
structural phase transitions in the network architectures and transitions in
cooperative models on networks as substrates. We also discuss systems where a
network and interacting agents on it influence each other. We overview a wide
range of critical phenomena in equilibrium and growing networks including the
birth of the giant connected component, percolation, k-core percolation,
phenomena near epidemic thresholds, condensation transitions, critical
phenomena in spin models placed on networks, synchronization, and
self-organized criticality effects in interacting systems on networks. We also
discuss strong finite size effects in these systems and highlight open problems
and perspectives.
Comment: Review article, 79 pages, 43 figures, 1 table, 508 references,
extended
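One of the phenomena listed in this abstract, the birth of the giant connected component, can be seen numerically with a small sketch. The random-graph construction used here (a fixed number of uniformly random edges, allowing repeats and self-loops) and the function name `largest_component_fraction` are assumptions made for illustration, not taken from the review.

```python
import random

def largest_component_fraction(n, mean_degree, seed=0):
    """Fraction of nodes in the largest component of a random graph.

    Draws int(mean_degree * n / 2) uniformly random edges on n nodes
    (repeats and self-loops allowed, an illustrative simplification)
    and tracks components with a union-find structure.
    """
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    n_edges = int(mean_degree * n / 2)
    for _ in range(n_edges):
        a, b = rng.randrange(n), rng.randrange(n)
        ra, rb = find(a), find(b)
        if ra != rb:
            # Union by size: attach the smaller tree to the larger one.
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
    return max(size[find(v)] for v in range(n)) / n


if __name__ == "__main__":
    for c in (0.5, 1.0, 1.5, 2.0, 4.0):
        print(c, largest_component_fraction(20000, c))
```

Scanning the mean degree across 1 should show the largest component growing from a vanishing fraction of the nodes to a finite one, which is the transition the abstract refers to.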
Networking - A Statistical Physics Perspective
Efficient networking has a substantial economic and societal impact in a
broad range of areas including transportation systems, wired and wireless
communications and a range of Internet applications. As transportation and
communication networks become increasingly complex, the ever-increasing
demand for congestion control, higher traffic capacity, quality of service,
robustness and reduced energy consumption requires new tools and methods to meet
these conflicting requirements. The new methodology should serve for gaining
better understanding of the properties of networking systems at the macroscopic
level, as well as for the development of new principled optimization and
management algorithms at the microscopic level. Methods of statistical physics
seem best placed to provide new approaches as they have been developed
specifically to deal with non-linear large scale systems. This paper aims at
presenting an overview of tools and methods that have been developed within the
statistical physics community and that can be readily applied to address the
emerging problems in networking. These include diffusion processes, methods
from disordered systems and polymer physics, and probabilistic inference, which
have direct relevance to network routing, file and frequency distribution, the
exploration of network structures and vulnerability, and various other
practical networking applications.
Comment: (Review article) 71 pages, 14 figures