Ranking relations using analogies in biological and information networks
Analogical reasoning depends fundamentally on the ability to learn and
generalize about relations between objects. We develop an approach to
relational learning which, given a query set of pairs of objects,
measures how well other pairs A:B fit in with that set. Our work
addresses the following question: is the relation between objects A and B
analogous to those relations found in the query set? Such questions are
particularly relevant in information retrieval, where an investigator might
want to search for analogous pairs of objects that match the query set of
interest. There are many ways in which objects can be related, making the task
of measuring analogies very challenging. Our approach combines a similarity
measure on function spaces with Bayesian analysis to produce a ranking. It
requires data containing features of the objects of interest and a link matrix
specifying which relationships exist; no further attributes of such
relationships are necessary. We illustrate the potential of our method on text
analysis and information networks. An application on discovering functional
interactions between pairs of proteins is discussed in detail, where we show
that our approach can work in practice even if only a small set of protein pairs is
provided.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS321 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results
The classical setting of community detection consists of networks exhibiting
a clustered structure. To more accurately model real systems we consider a
class of networks (i) whose edges may carry labels and (ii) which may lack a
clustered structure. Specifically, we assume that nodes possess latent
attributes drawn from a general compact space and edges between two nodes are
randomly generated and labeled according to some unknown distribution as a
function of their latent attributes. Our goal is then to infer the edge label
distributions from a partially observed network. We propose a computationally
efficient spectral algorithm and show that it allows for asymptotically correct
inference even when the average node degree is as low as logarithmic in the
total number of nodes. Conversely, if the average node degree is below a
specific constant threshold, we show that no algorithm can achieve better
inference than guessing without using the observations. As a byproduct of our
analysis, we show that our model provides a general procedure to construct
random graph models with a spectrum asymptotic to a pre-specified eigenvalue
distribution such as a power-law distribution.
Comment: 17 pages
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, along with a host of
more specialized professional network communities, have intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics.
Comment: 96 pages, 14 figures, 333 references
Co-evolution of Selection and Influence in Social Networks
Many networks are complex dynamical systems, where both attributes of nodes
and topology of the network (link structure) can change with time. We propose a
model of co-evolving networks where both node attributes and network
structure evolve under mutual influence. Specifically, we consider a mixed
membership stochastic blockmodel, where the probability of observing a link
between two nodes depends on their current membership vectors, while those
membership vectors themselves evolve in the presence of a link between the
nodes. Thus, the network is shaped by the interaction of stochastic processes
describing the nodes, while the processes themselves are influenced by the
changing network structure. We derive an efficient variational inference
procedure for our model, and validate the model on both synthetic and
real-world data.
Comment: In Proc. of the Twenty-Fifth Conference on Artificial Intelligence
(AAAI-11)
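As a rough illustration of how link probability can depend on membership vectors, the sketch below uses the standard mixed membership stochastic blockmodel form, in which the marginal probability of a link between nodes i and j is pi_i^T B pi_j. The function name `link_probability`, the block matrix values, and the example membership vectors are assumptions made for illustration; the abstract does not give them, and the co-evolution of memberships over time is not modeled here.

```python
def link_probability(pi_i, pi_j, block):
    """Marginal link probability pi_i^T B pi_j.

    pi_i, pi_j: membership vectors (non-negative, each summing to 1).
    block:      K x K matrix of group-to-group link probabilities.
    """
    return sum(
        pi_i[g] * block[g][h] * pi_j[h]
        for g in range(len(pi_i))
        for h in range(len(pi_j))
    )


# Hypothetical two-group example: links are dense within groups, sparse across.
block = [[0.9, 0.1],
         [0.1, 0.8]]
pi_a = [1.0, 0.0]   # node a belongs entirely to group 0
pi_b = [0.5, 0.5]   # node b splits its membership evenly

p_ab = link_probability(pi_a, pi_b, block)
```

In a co-evolving setting, one would additionally update `pi_a` and `pi_b` whenever a link is present, which is the part this sketch leaves out.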
Integral projection models for species with complex demography
Matrix projection models occupy a central role in population and conservation biology. Matrix models divide a population into discrete classes, even if the structuring trait exhibits continuous variation (e.g., body size). The integral projection model (IPM) avoids discrete classes and potential artifacts from arbitrary class divisions, facilitates parsimonious modeling based on smooth relationships between individual state and demographic performance, and can be implemented with standard matrix software. Here, we extend the IPM to species with complex demographic attributes, including dormant and active life stages, cross-classification by several attributes (e.g., size, age, and condition), and changes between discrete and continuous structure over the life cycle. We present a general model encompassing these cases, numerical methods, and theoretical results, including stable population growth and sensitivity/elasticity analysis for density-independent models, local stability analysis in density-dependent models, and optimal/evolutionarily stable strategy life-history analysis. Our presentation centers on an IPM for the thistle Onopordum illyricum based on a 6-year field study. Flowering and death probabilities are size and age dependent, and individuals also vary in a latent attribute affecting survival, but a predictively accurate IPM is completely parameterized by fitting a few regression equations. The online edition of the American Naturalist includes a zip archive of R scripts illustrating our suggested methods.
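To make the remark about standard matrix software concrete, here is a minimal sketch of one common way to discretize an IPM: the midpoint rule turns the kernel K(y, x) into a matrix, and the dominant eigenvalue of that matrix approximates the long-run population growth rate. This is not the authors' R code. The helper names `ipm_matrix` and `dominant_eigenvalue`, the mesh size, and the toy kernel are all assumptions, and the toy kernel is chosen only because its growth rate is easy to check by hand, not because it is biologically meaningful.

```python
def ipm_matrix(kernel, lower, upper, m):
    """Midpoint-rule discretization of n(y, t+1) = integral of K(y, x) n(x, t) dx."""
    h = (upper - lower) / m                              # width of each size class
    mesh = [lower + (i + 0.5) * h for i in range(m)]     # class midpoints
    return [[h * kernel(y, x) for x in mesh] for y in mesh]


def dominant_eigenvalue(matrix, iterations=200):
    """Power iteration; assumes a non-negative matrix with a positive growth rate."""
    m = len(matrix)
    v = [1.0 / m] * m                 # start from a uniform distribution
    lam = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(m)) for row in matrix]
        lam = sum(w)                  # v sums to 1, so sum(w) is the one-step growth
        v = [x / lam for x in w]      # renormalize to a distribution
    return lam


def toy_kernel(y, x):
    """Hypothetical kernel K(y, x) = 3y on the size interval [0, 1]."""
    return 3.0 * y


growth_rate = dominant_eigenvalue(ipm_matrix(toy_kernel, 0.0, 1.0, 50))
```

For this toy kernel the exact growth rate is the integral of 3x over [0, 1], which is 1.5, so `growth_rate` gives a quick sanity check on the discretization. A real application would replace `toy_kernel` with survival, growth, and fecundity functions fitted by regression.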
Topics in social network analysis and network science
This chapter introduces statistical methods used in the analysis of social
networks and in the rapidly evolving parallel field of network science.
Although several instances of social network analysis in health services
research have appeared recently, the majority involve only the most basic
methods and thus scratch the surface of what might be accomplished.
Cutting-edge methods using relevant examples and illustrations in health
services research are provided.