4,630,122 research outputs found

    Bayesian Information Extraction Network

    Full text link
    Dynamic Bayesian networks (DBNs) offer an elegant way to integrate various aspects of language in one model. Many existing algorithms developed for learning and inference in DBNs are applicable to probabilistic language modeling. To demonstrate the potential of DBNs for natural language processing, we employ a DBN in an information extraction task. We show how to assemble wealth of emerging linguistic instruments for shallow parsing, syntactic and semantic tagging, morphological decomposition, named entity recognition etc. in order to incrementally build a robust information extraction system. Our method outperforms previously published results on an established benchmark domain.Comment: 6 page

    Pairwise Network Information and Nonlinear Correlations

    Full text link
    Reconstructing the structural connectivity between interacting units from observed activity is a challenge across many different disciplines. The fundamental first step is to establish whether or to what extent the interactions between the units can be considered pairwise and, thus, can be modeled as an interaction network with simple links corresponding to pairwise interactions. In principle this can be determined by comparing the maximum entropy given the bivariate probability distributions to the true joint entropy. In many practical cases this is not an option since the bivariate distributions needed may not be reliably estimated, or the optimization is too computationally expensive. Here we present an approach that allows one to use mutual informations as a proxy for the bivariate distributions. This has the advantage of being less computationally expensive and easier to estimate. We achieve this by introducing a novel entropy maximization scheme that is based on conditioning on entropies and mutual informations. This renders our approach typically superior to other methods based on linear approximations. The advantages of the proposed method are documented using oscillator networks and a resting-state human brain network as generic relevant examples

    Lecture Notes on Network Information Theory

    Full text link
    These lecture notes have been converted to a book titled Network Information Theory published recently by Cambridge University Press. This book provides a significantly expanded exposition of the material in the lecture notes as well as problems and bibliographic notes at the end of each chapter. The authors are currently preparing a set of slides based on the book that will be posted in the second half of 2012. More information about the book can be found at http://www.cambridge.org/9781107008731/. The previous (and obsolete) version of the lecture notes can be found at http://arxiv.org/abs/1001.3404v4/

    Network Information Flow with Correlated Sources

    Full text link
    In this paper, we consider a network communications problem in which multiple correlated sources must be delivered to a single data collector node, over a network of noisy independent point-to-point channels. We prove that perfect reconstruction of all the sources at the sink is possible if and only if, for all partitions of the network nodes into two subsets S and S^c such that the sink is always in S^c, we have that H(U_S|U_{S^c}) < \sum_{i\in S,j\in S^c} C_{ij}. Our main finding is that in this setup a general source/channel separation theorem holds, and that Shannon information behaves as a classical network flow, identical in nature to the flow of water in pipes. At first glance, it might seem surprising that separation holds in a fairly general network situation like the one we study. A closer look, however, reveals that the reason for this is that our model allows only for independent point-to-point channels between pairs of nodes, and not multiple-access and/or broadcast channels, for which separation is well known not to hold. This ``information as flow'' view provides an algorithmic interpretation for our results, among which perhaps the most important one is the optimality of implementing codes using a layered protocol stack.Comment: Final version, to appear in the IEEE Transactions on Information Theory -- contains (very) minor changes based on the last round of review

    LINE: Large-scale Information Network Embedding

    Full text link
    This paper studies the problem of embedding very large information networks into low-dimensional vector spaces, which is useful in many tasks such as visualization, node classification, and link prediction. Most existing graph embedding methods do not scale for real world information networks which usually contain millions of nodes. In this paper, we propose a novel network embedding method called the "LINE," which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and global network structures. An edge-sampling algorithm is proposed that addresses the limitation of the classical stochastic gradient descent and improves both the effectiveness and the efficiency of the inference. Empirical experiments prove the effectiveness of the LINE on a variety of real-world information networks, including language networks, social networks, and citation networks. The algorithm is very efficient, which is able to learn the embedding of a network with millions of vertices and billions of edges in a few hours on a typical single machine. The source code of the LINE is available online.Comment: WWW 201

    Improving information filtering via network manipulation

    Get PDF
    Recommender system is a very promising way to address the problem of overabundant information for online users. Though the information filtering for the online commercial systems received much attention recently, almost all of the previous works are dedicated to design new algorithms and consider the user-item bipartite networks as given and constant information. However, many problems for recommender systems such as the cold-start problem (i.e. low recommendation accuracy for the small degree items) are actually due to the limitation of the underlying user-item bipartite networks. In this letter, we propose a strategy to enhance the performance of the already existing recommendation algorithms by directly manipulating the user-item bipartite networks, namely adding some virtual connections to the networks. Numerical analyses on two benchmark data sets, MovieLens and Netflix, show that our method can remarkably improve the recommendation performance. Specifically, it not only improve the recommendations accuracy (especially for the small degree items), but also help the recommender systems generate more diverse and novel recommendations.Comment: 6 pages, 5 figure
    corecore