5,729 research outputs found
From Relational Data to Graphs: Inferring Significant Links using Generalized Hypergeometric Ensembles
The inference of network topologies from relational data is an important
problem in data analysis. Exemplary applications include the reconstruction of
social ties from data on human interactions, the inference of gene
co-expression networks from DNA microarray data, or the learning of semantic
relationships based on co-occurrences of words in documents. Solving these
problems requires techniques to infer significant links in noisy relational
data. In this short paper, we propose a new statistical modeling framework to
address this challenge. It builds on generalized hypergeometric ensembles, a
class of generative stochastic models that give rise to analytically tractable
probability spaces of directed, multi-edge graphs. We show how this framework
can be used to assess the significance of links in noisy relational data. We
illustrate our method in two data sets capturing spatio-temporal proximity
relations between actors in a social system. The results show that our
analytical framework provides a new approach to infer significant links from
relational data, with interesting perspectives for the mining of data on social
systems.Comment: 10 pages, 8 figures, accepted at SocInfo201
A Generalised Quantifier Theory of Natural Language in Categorical Compositional Distributional Semantics with Bialgebras
Categorical compositional distributional semantics is a model of natural
language; it combines the statistical vector space models of words with the
compositional models of grammar. We formalise in this model the generalised
quantifier theory of natural language, due to Barwise and Cooper. The
underlying setting is a compact closed category with bialgebras. We start from
a generative grammar formalisation and develop an abstract categorical
compositional semantics for it, then instantiate the abstract setting to sets
and relations and to finite dimensional vector spaces and linear maps. We prove
the equivalence of the relational instantiation to the truth theoretic
semantics of generalised quantifiers. The vector space instantiation formalises
the statistical usages of words and enables us to, for the first time, reason
about quantified phrases and sentences compositionally in distributional
semantics
Language and scientific explanation: Where does semantics fit in?
This book discusses the two main construals of the explanatory goals of semantic theories. The first, externalist conception, understands semantic theories in terms of a hermeneutic and interpretive explanatory project. The second, internalist conception, understands semantic theories in terms of the psychological mechanisms in virtue of which meanings are generated. It is argued that a fruitful scientific explanation is one that aims to uncover the underlying mechanisms in virtue of which the observable phenomena are made possible, and that a scientific semantics should be doing just that. If this is the case, then a scientific semantics is unlikely to be externalist, for reasons having to do with the subject matter and form of externalist theories. It is argued that semantics construed hermeneutically is nevertheless a valuable explanatory project
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics.Comment: 96 pages, 14 figures, 333 reference
- …