Compressive Network Analysis
Modern data acquisition routinely produces massive amounts of network data.
Though many methods and models have been proposed to analyze such data, the
study of network data is largely disconnected from the classical theory of
statistical learning and signal processing. In this paper, we present a new
framework for modeling network data, which connects two seemingly different
areas: network data analysis and compressed sensing. From a nonparametric
perspective, we model an observed network using a large dictionary. In
particular, we consider the network clique detection problem and show
connections between our formulation and a new algebraic tool, namely Radon
basis pursuit in homogeneous spaces. Such a connection allows us to identify
rigorous recovery conditions for clique detection problems. Though this paper
is mainly conceptual, we also develop practical approximation algorithms for
solving empirical problems and demonstrate their usefulness on real-world
datasets.
On a stronger reconstruction notion for monoids and clones
Motivated by reconstruction results by Rubin, we introduce a new
reconstruction notion for permutation groups, transformation monoids and
clones, called automatic action compatibility, which entails automatic
homeomorphicity. We further give a characterization of automatic
homeomorphicity for transformation monoids on arbitrary carriers with a dense
group of invertibles having automatic homeomorphicity. We then show how to lift
automatic action compatibility from groups to monoids and from monoids to
clones under fairly weak assumptions. We finally employ these theorems to get
automatic action compatibility results for monoids and clones over several
well-known countable structures, including the strictly ordered rationals, the
directed and undirected version of the random graph, the random tournament and
bipartite graph, the generic strictly ordered set, and the directed and
undirected versions of the universal homogeneous Henson graphs.
Comment: 29 pp; Changes v1-->v2::typos corr.|L3.5+pf extended|Rem3.7 added|C.
Pech found out that arg of L5.3-v1 solved Probl2-v1|L5.3, C5.4, Probl2 of v1
removed|C5.2, R5.4 new, contain parts of pf of L5.3-v1|L5.2-v1 is now
L5.3,merged with concl of C5.4-v1,L5.3-v2 extends C5.4-v1|abstract, intro
updated|ref[24] added|part of L5.3-v1 is L2.1(e)-v2, another part merged with
pf of L5.2-v1 => L5.3-v
Information Recovery from Pairwise Measurements
A variety of information processing tasks in practice involve recovering
objects from single-shot graph-based measurements, particularly those taken
over the edges of some measurement graph. This paper concerns the
situation where each object takes a value in a group of distinct values,
and where one is interested in recovering all these values based on observations
of certain pairwise relations over the graph. The imperfection of
measurements presents two major challenges for information recovery: 1)
a (dominant) portion of measurements are corrupted; 2) a significant
fraction of pairs are unobservable, i.e. the measurement graph can be highly
sparse.
Under a natural random outlier model, we characterize the minimax recovery
rate, that is, the critical threshold of non-corruption rate
below which exact information recovery is infeasible. This accommodates a very
general class of pairwise relations. For various homogeneous random graph
models (e.g. Erdős-Rényi random graphs, random geometric graphs, small world
graphs), the minimax recovery rate depends almost exclusively on the edge
sparsity of the measurement graph, irrespective of other graphical
metrics. This fundamental limit decays with the group size at a square root
rate before entering a connectivity-limited regime. Under the Erdős-Rényi
random graph, a tractable combinatorial algorithm is proposed to approach the
limit for large group sizes, while order-optimal recovery is
enabled by semidefinite programs in the small group-size regime.
The extended (and most updated) version of this work can be found at
(http://arxiv.org/abs/1504.01369).
Comment: This version is no longer updated -- please find the latest version
at (arXiv:1504.01369
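
As a rough illustration of the setting this abstract describes, the following minimal Python sketch simulates pairwise difference measurements modulo a group size M on a complete measurement graph, replaces a fraction of them with uniform random outliers, and recovers every value relative to object 0 by majority vote over two-hop paths. The voting estimator, the function names (simulate, recover_offsets) and the parameter choices are assumptions made for illustration only; this is not the combinatorial algorithm or the semidefinite program proposed in the paper.

```python
import random
from collections import Counter


def simulate(n, M, p, rng):
    """Draw n hidden values in Z_M and noisy pairwise differences.

    Each observation y[i][j] equals (x[i] - x[j]) mod M with probability p
    and is replaced by a uniform random element of Z_M otherwise.
    """
    x = [rng.randrange(M) for _ in range(n)]
    y = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                d = (x[i] - x[j]) % M
            else:
                d = rng.randrange(M)  # outlier (may coincide with the truth)
            y[i][j] = d
            y[j][i] = (-d) % M  # keep the observations antisymmetric
    return x, y


def recover_offsets(y, M):
    """Estimate (x[i] - x[0]) mod M for every i by majority vote.

    Votes come from the direct edge (i, 0) and from every two-hop path
    i -> j -> 0, using x[i] - x[0] = (x[i] - x[j]) + (x[j] - x[0]).
    """
    n = len(y)
    estimate = [0] * n
    for i in range(1, n):
        votes = Counter()
        votes[y[i][0]] += 1
        for j in range(1, n):
            if j != i:
                votes[(y[i][j] + y[j][0]) % M] += 1
        estimate[i] = votes.most_common(1)[0][0]
    return estimate
```

Because pairwise differences do not change when all values are shifted by the same group element, the sketch can only recover the values up to a global offset, which is why it reports them relative to object 0.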
Classifying pairs with trees for supervised biological network inference
Networks are ubiquitous in biology, and computational approaches to their
inference have been widely investigated. In particular, supervised machine
learning methods can be used to complete a partially known network by
integrating various measurements. Two main supervised frameworks have been
proposed: the local approach, which trains a separate model for each network
node, and the global approach, which trains a single model over pairs of nodes.
Here, we systematically investigate, theoretically and empirically, the
exploitation of tree-based ensemble methods in the context of these two
approaches for biological network inference. We first formalize the problem of
network inference as classification of pairs, unifying in the process
homogeneous and bipartite graphs and discussing two main sampling schemes. We
then present the global and the local approaches, extending the latter for the
prediction of interactions between two unseen network nodes, and discuss their
specializations to tree-based ensemble methods, highlighting their
interpretability and drawing links with clustering techniques. Extensive
computational experiments with these methods on various biological networks
clearly show that they are competitive with existing methods.
Comment: 22 page
A Tensor Approach to Learning Mixed Membership Community Models
Community detection is the task of detecting hidden communities from observed
interactions. Guaranteed community detection has so far been mostly limited to
models with non-overlapping communities such as the stochastic block model. In
this paper, we remove this restriction, and provide guaranteed community
detection for a family of probabilistic network models with overlapping
communities, termed the mixed membership Dirichlet model, first introduced
by Airoldi et al. This model allows for nodes to have fractional memberships in
multiple communities and assumes that the community memberships are drawn from
a Dirichlet distribution. Moreover, it contains the stochastic block model as a
special case. We propose a unified approach to learning these models via a
tensor spectral decomposition method. Our estimator is based on a low-order
moment tensor of the observed network, consisting of 3-star counts. Our
learning method is fast and is based on simple linear algebraic operations,
e.g. singular value decomposition and tensor power iterations. We provide
guaranteed recovery of community memberships and model parameters and present a
careful finite sample analysis of our learning method. As an important special
case, our results match the best known scaling requirements for the
(homogeneous) stochastic block model.
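
The tensor power iteration mentioned in this abstract can be sketched on a toy example. The Python code below builds a small symmetric third-order tensor with a known orthogonal decomposition and recovers its components by repeated power iteration and deflation. It is a sketch under strong assumptions: the tensor is synthetic and already orthogonally decomposable, so the estimation of the moment tensor from 3-star counts and any whitening step of the actual method are omitted, and all function names are hypothetical.

```python
import math
import random


def rank_one_sum(weights, vectors):
    """Build T = sum_k w_k * (v_k outer v_k outer v_k) as a nested list."""
    d = len(vectors[0])
    T = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for w, v in zip(weights, vectors):
        for i in range(d):
            for j in range(d):
                for k in range(d):
                    T[i][j][k] += w * v[i] * v[j] * v[k]
    return T


def contract(T, u):
    """Return the vector T(I, u, u)."""
    d = len(u)
    return [sum(T[i][j][k] * u[j] * u[k] for j in range(d) for k in range(d))
            for i in range(d)]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def power_iteration(T, rng, iters=100):
    """Iterate u <- T(I, u, u) / ||T(I, u, u)|| from a random unit vector."""
    u = normalize([rng.gauss(0.0, 1.0) for _ in range(len(T))])
    for _ in range(iters):
        u = normalize(contract(T, u))
    eigenvalue = sum(a * b for a, b in zip(u, contract(T, u)))  # T(u, u, u)
    return eigenvalue, u


def decompose(T, rank, seed=0, restarts=5):
    """Extract `rank` (eigenvalue, eigenvector) pairs with deflation."""
    rng = random.Random(seed)
    d = len(T)
    T = [[row[:] for row in mat] for mat in T]  # work on a copy
    components = []
    for _ in range(rank):
        # Keep the restart with the largest eigenvalue estimate.
        lam, u = max((power_iteration(T, rng) for _ in range(restarts)),
                     key=lambda pair: pair[0])
        components.append((lam, u))
        # Deflate: subtract the recovered rank-one term.
        for i in range(d):
            for j in range(d):
                for k in range(d):
                    T[i][j][k] -= lam * u[i] * u[j] * u[k]
    return components
```

For example, calling decompose(rank_one_sum(weights, vectors), len(weights)) on orthonormal vectors with positive weights should return eigenvalues close to the weights, in some order, up to numerical error.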
The random graph
Erdős and Rényi showed the paradoxical result that there is a unique
(and highly symmetric) countably infinite random graph. This graph, and its
automorphism group, form the subject of the present survey.
Comment: Revised chapter for new edition of book "The Mathematics of Paul
Erdős
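
One concrete way to look at this graph is through an explicit construction on the non-negative integers, usually attributed to Rado and not spelled out in the abstract above: for i < j, the vertices i and j are adjacent exactly when bit i of j is 1. The Python sketch below implements this adjacency rule together with a witness for the extension property, which says that for any disjoint finite vertex sets U and V some vertex is adjacent to everything in U and to nothing in V. The function names are hypothetical.

```python
def adjacent(i, j):
    """Vertices are non-negative integers; i < j are adjacent iff bit i of j is set."""
    if i == j:
        return False
    lo, hi = min(i, j), max(i, j)
    return (hi >> lo) & 1 == 1


def extension_witness(U, V):
    """Return a vertex adjacent to every vertex in U and to none in V.

    U and V must be disjoint finite sets of non-negative integers.
    """
    if U & V:
        raise ValueError("U and V must be disjoint")
    # Pick a bit position above every vertex involved, so the witness is
    # larger than all of them and its low bits alone decide adjacency.
    m = max(U | V, default=-1) + 1
    return sum(1 << u for u in U) + (1 << m)
```

The extension property is what pins the graph down: any two countable graphs satisfying it are isomorphic, by a back-and-forth argument.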