177 research outputs found

    Probabilistic Bag-Of-Hyperlinks Model for Entity Linking

    Full text link
    Many fundamental problems in natural language processing rely on determining what entities appear in a given text. Commonly referenced as entity linking, this step is a fundamental component of many NLP tasks such as text understanding, automatic summarization, semantic search or machine translation. Name ambiguity, word polysemy, context dependencies and a heavy-tailed distribution of entities contribute to the complexity of this problem. We here propose a probabilistic approach that makes use of an effective graphical model to perform collective entity disambiguation. Input mentions (i.e.,~linkable token spans) are disambiguated jointly across an entire document by combining a document-level prior of entity co-occurrences with local information captured from mentions and their surrounding context. The model is based on simple sufficient statistics extracted from data, thus relying on few parameters to be learned. Our method does not require extensive feature engineering, nor an expensive training procedure. We use loopy belief propagation to perform approximate inference. The low complexity of our model makes this step sufficiently fast for real-time usage. We demonstrate the accuracy of our approach on a wide range of benchmark datasets, showing that it matches, and in many cases outperforms, existing state-of-the-art methods

    Dynamical replica analysis of disordered Ising spin systems on finitely connected random graphs

    Full text link
    We study the dynamics of macroscopic observables such as the magnetization and the energy per degree of freedom in Ising spin models on random graphs of finite connectivity, with random bonds and/or heterogeneous degree distributions. To do so we generalize existing implementations of dynamical replica theory and cavity field techniques to systems with strongly disordered and locally tree-like interactions. We illustrate our results via application to the dynamics of e.g. ±J\pm J spin-glasses on random graphs and of the overlap in finite connectivity Sourlas codes. All results are tested against Monte Carlo simulations.Comment: 4 pages, 14 .eps file

    Phase Transitions and Computational Difficulty in Random Constraint Satisfaction Problems

    Full text link
    We review the understanding of the random constraint satisfaction problems, focusing on the q-coloring of large random graphs, that has been achieved using the cavity method of the physicists. We also discuss the properties of the phase diagram in temperature, the connections with the glass transition phenomenology in physics, and the related algorithmic issues.Comment: 10 pages, Proceedings of the International Workshop on Statistical-Mechanical Informatics 2007, Kyoto (Japan) September 16-19, 200

    Mean-Field Equations for Spin Models with Orthogonal Interaction Matrices

    Full text link
    We study the metastable states in Ising spin models with orthogonal interaction matrices. We focus on three realizations of this model, the random case and two non-random cases, i.e.\ the fully-frustrated model on an infinite dimensional hypercube and the so-called sine-model. We use the mean-field (or {\sc tap}) equations which we derive by resuming the high-temperature expansion of the Gibbs free energy. In some special non-random cases, we can find the absolute minimum of the free energy. For the random case we compute the average number of solutions to the {\sc tap} equations. We find that the configurational entropy (or complexity) is extensive in the range T_{\mbox{\tiny RSB}}. Finally we present an apparently unrelated replica calculation which reproduces the analytical expression for the total number of {\sc tap} solutions.Comment: 22+3 pages, section 5 slightly modified, 1 Ref added, LaTeX and uuencoded figures now independent of each other (easier to print). Postscript available http://chimera.roma1.infn.it/index_papers_complex.htm

    A survey on independence-based Markov networks learning

    Full text link
    This work reports the most relevant technical aspects in the problem of learning the \emph{Markov network structure} from data. Such problem has become increasingly important in machine learning, and many other application fields of machine learning. Markov networks, together with Bayesian networks, are probabilistic graphical models, a widely used formalism for handling probability distributions in intelligent systems. Learning graphical models from data have been extensively applied for the case of Bayesian networks, but for Markov networks learning it is not tractable in practice. However, this situation is changing with time, given the exponential growth of computers capacity, the plethora of available digital data, and the researching on new learning technologies. This work stresses on a technology called independence-based learning, which allows the learning of the independence structure of those networks from data in an efficient and sound manner, whenever the dataset is sufficiently large, and data is a representative sampling of the target distribution. In the analysis of such technology, this work surveys the current state-of-the-art algorithms for learning Markov networks structure, discussing its current limitations, and proposing a series of open problems where future works may produce some advances in the area in terms of quality and efficiency. The paper concludes by opening a discussion about how to develop a general formalism for improving the quality of the structures learned, when data is scarce.Comment: 35 pages, 1 figur

    Region graph partition function expansion and approximate free energy landscapes: Theory and some numerical results

    Full text link
    Graphical models for finite-dimensional spin glasses and real-world combinatorial optimization and satisfaction problems usually have an abundant number of short loops. The cluster variation method and its extension, the region graph method, are theoretical approaches for treating the complicated short-loop-induced local correlations. For graphical models represented by non-redundant or redundant region graphs, approximate free energy landscapes are constructed in this paper through the mathematical framework of region graph partition function expansion. Several free energy functionals are obtained, each of which use a set of probability distribution functions or functionals as order parameters. These probability distribution function/functionals are required to satisfy the region graph belief-propagation equation or the region graph survey-propagation equation to ensure vanishing correction contributions of region subgraphs with dangling edges. As a simple application of the general theory, we perform region graph belief-propagation simulations on the square-lattice ferromagnetic Ising model and the Edwards-Anderson model. Considerable improvements over the conventional Bethe-Peierls approximation are achieved. Collective domains of different sizes in the disordered and frustrated square lattice are identified by the message-passing procedure. Such collective domains and the frustrations among them are responsible for the low-temperature glass-like dynamical behaviors of the system.Comment: 30 pages, 11 figures. More discussion on redundant region graphs. To be published by Journal of Statistical Physic

    Probabilistic Reconstruction in Compressed Sensing: Algorithms, Phase Diagrams, and Threshold Achieving Matrices

    Full text link
    Compressed sensing is a signal processing method that acquires data directly in a compressed form. This allows one to make less measurements than what was considered necessary to record a signal, enabling faster or more precise measurement protocols in a wide range of applications. Using an interdisciplinary approach, we have recently proposed in [arXiv:1109.4424] a strategy that allows compressed sensing to be performed at acquisition rates approaching to the theoretical optimal limits. In this paper, we give a more thorough presentation of our approach, and introduce many new results. We present the probabilistic approach to reconstruction and discuss its optimality and robustness. We detail the derivation of the message passing algorithm for reconstruction and expectation max- imization learning of signal-model parameters. We further develop the asymptotic analysis of the corresponding phase diagrams with and without measurement noise, for different distribution of signals, and discuss the best possible reconstruction performances regardless of the algorithm. We also present new efficient seeding matrices, test them on synthetic data and analyze their performance asymptotically.Comment: 42 pages, 37 figures, 3 appendixe

    Critical phenomena in complex networks

    Full text link
    The combination of the compactness of networks, featuring small diameters, and their complex architectures results in a variety of critical effects dramatically different from those in cooperative systems on lattices. In the last few years, researchers have made important steps toward understanding the qualitatively new critical phenomena in complex networks. We review the results, concepts, and methods of this rapidly developing field. Here we mostly consider two closely related classes of these critical phenomena, namely structural phase transitions in the network architectures and transitions in cooperative models on networks as substrates. We also discuss systems where a network and interacting agents on it influence each other. We overview a wide range of critical phenomena in equilibrium and growing networks including the birth of the giant connected component, percolation, k-core percolation, phenomena near epidemic thresholds, condensation transitions, critical phenomena in spin models placed on networks, synchronization, and self-organized criticality effects in interacting systems on networks. We also discuss strong finite size effects in these systems and highlight open problems and perspectives.Comment: Review article, 79 pages, 43 figures, 1 table, 508 references, extende

    Networking - A Statistical Physics Perspective

    Get PDF
    Efficient networking has a substantial economic and societal impact in a broad range of areas including transportation systems, wired and wireless communications and a range of Internet applications. As transportation and communication networks become increasingly more complex, the ever increasing demand for congestion control, higher traffic capacity, quality of service, robustness and reduced energy consumption require new tools and methods to meet these conflicting requirements. The new methodology should serve for gaining better understanding of the properties of networking systems at the macroscopic level, as well as for the development of new principled optimization and management algorithms at the microscopic level. Methods of statistical physics seem best placed to provide new approaches as they have been developed specifically to deal with non-linear large scale systems. This paper aims at presenting an overview of tools and methods that have been developed within the statistical physics community and that can be readily applied to address the emerging problems in networking. These include diffusion processes, methods from disordered systems and polymer physics, probabilistic inference, which have direct relevance to network routing, file and frequency distribution, the exploration of network structures and vulnerability, and various other practical networking applications.Comment: (Review article) 71 pages, 14 figure
    • …
    corecore