
    Compressive Network Analysis

    Modern data acquisition routinely produces massive amounts of network data. Though many methods and models have been proposed to analyze such data, research on network data is largely disconnected from the classical theory of statistical learning and signal processing. In this paper, we present a new framework for modeling network data, which connects two seemingly different areas: network data analysis and compressed sensing. From a nonparametric perspective, we model an observed network using a large dictionary. In particular, we consider the network clique detection problem and show connections between our formulation and a new algebraic tool, namely Radon basis pursuit in homogeneous spaces. Such a connection allows us to identify rigorous recovery conditions for clique detection problems. Though this paper is mainly conceptual, we also develop practical approximation algorithms for solving empirical problems and demonstrate their usefulness on real-world datasets.

    On a stronger reconstruction notion for monoids and clones

    Motivated by reconstruction results by Rubin, we introduce a new reconstruction notion for permutation groups, transformation monoids and clones, called automatic action compatibility, which entails automatic homeomorphicity. We further give a characterization of automatic homeomorphicity for transformation monoids on arbitrary carriers with a dense group of invertibles having automatic homeomorphicity. We then show how to lift automatic action compatibility from groups to monoids and from monoids to clones under fairly weak assumptions. We finally employ these theorems to get automatic action compatibility results for monoids and clones over several well-known countable structures, including the strictly ordered rationals, the directed and undirected version of the random graph, the random tournament and bipartite graph, the generic strictly ordered set, and the directed and undirected versions of the universal homogeneous Henson graphs. Comment: 29 pp; Changes v1-->v2::typos corr.|L3.5+pf extended|Rem3.7 added|C. Pech found out that arg of L5.3-v1 solved Probl2-v1|L5.3, C5.4, Probl2 of v1 removed|C5.2, R5.4 new, contain parts of pf of L5.3-v1|L5.2-v1 is now L5.3,merged with concl of C5.4-v1,L5.3-v2 extends C5.4-v1|abstract, intro updated|ref[24] added|part of L5.3-v1 is L2.1(e)-v2, another part merged with pf of L5.2-v1 => L5.3-v

    Information Recovery from Pairwise Measurements

    A variety of information processing tasks in practice involve recovering $n$ objects from single-shot graph-based measurements, particularly those taken over the edges of some measurement graph $\mathcal{G}$. This paper concerns the situation where each object takes value over a group of $M$ different values, and where one is interested in recovering all these values based on observations of certain pairwise relations over $\mathcal{G}$. The imperfection of measurements presents two major challenges for information recovery: 1) inaccuracy: a (dominant) portion $1-p$ of measurements are corrupted; 2) incompleteness: a significant fraction of pairs are unobservable, i.e. $\mathcal{G}$ can be highly sparse. Under a natural random outlier model, we characterize the minimax recovery rate, that is, the critical threshold of non-corruption rate $p$ below which exact information recovery is infeasible. This accommodates a very general class of pairwise relations. For various homogeneous random graph models (e.g. Erdős-Rényi random graphs, random geometric graphs, small world graphs), the minimax recovery rate depends almost exclusively on the edge sparsity of the measurement graph $\mathcal{G}$ irrespective of other graphical metrics. This fundamental limit decays with the group size $M$ at a square root rate before entering a connectivity-limited regime. Under the Erdős-Rényi random graph, a tractable combinatorial algorithm is proposed to approach the limit for large $M$ ($M=n^{\Omega(1)}$), while order-optimal recovery is enabled by semidefinite programs in the small $M$ regime. The extended (and most updated) version of this work can be found at (http://arxiv.org/abs/1504.01369). Comment: This version is no longer updated; please find the latest version at (arXiv:1504.01369)
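
The measurement model in this abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it assumes the group is the integers modulo $M$ and the pairwise relation is the difference $x_i - x_j$, and it uses a naive propagation baseline that is only exact when no measurement is corrupted. The function names `simulate_measurements` and `propagate` are hypothetical.

```python
import random
from collections import deque

def simulate_measurements(x, edges, M, p, rng):
    """Random outlier model (assumed form): each edge (i, j) yields
    (x[i] - x[j]) mod M with probability p, else a uniform value in Z_M."""
    y = {}
    for i, j in edges:
        if rng.random() < p:
            y[(i, j)] = (x[i] - x[j]) % M   # clean measurement
        else:
            y[(i, j)] = rng.randrange(M)    # corrupted measurement
    return y

def propagate(y, n, M):
    """Naive baseline: fix x_hat[0] = 0 and propagate differences by BFS.
    Recovers x up to a global shift only if every used edge is clean."""
    nbrs = {i: [] for i in range(n)}
    for (i, j), v in y.items():
        nbrs[i].append((j, (-v) % M))  # x_j = x_i - y_ij
        nbrs[j].append((i, v))         # x_i = x_j + y_ij
    x_hat = {0: 0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w, delta in nbrs[u]:
            if w not in x_hat:
                x_hat[w] = (x_hat[u] + delta) % M
                queue.append(w)
    # Nodes unreachable from node 0 stay None (incomplete measurement graph).
    return [x_hat.get(i) for i in range(n)]
```

With $p = 1$ and a connected measurement graph, `propagate` returns $x$ up to a global shift; once a dominant portion of measurements is corrupted, a single bad edge on the propagation path breaks the estimate, which is the regime the abstract's recovery threshold addresses.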

    Classifying pairs with trees for supervised biological network inference

    Networks are ubiquitous in biology and computational approaches have been largely investigated for their inference. In particular, supervised machine learning methods can be used to complete a partially known network by integrating various measurements. Two main supervised frameworks have been proposed: the local approach, which trains a separate model for each network node, and the global approach, which trains a single model over pairs of nodes. Here, we systematically investigate, theoretically and empirically, the exploitation of tree-based ensemble methods in the context of these two approaches for biological network inference. We first formalize the problem of network inference as classification of pairs, unifying in the process homogeneous and bipartite graphs and discussing two main sampling schemes. We then present the global and the local approaches, extending the latter for the prediction of interactions between two unseen network nodes, and discuss their specializations to tree-based ensemble methods, highlighting their interpretability and drawing links with clustering techniques. Extensive computational experiments are carried out with these methods on various biological networks that clearly highlight that these methods are competitive with existing methods. Comment: 22 pages
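
The difference between the global and local frameworks lies in how the learning sets are assembled before any classifier is trained. The following is a minimal sketch of that step only, under assumptions not stated in the abstract: node features are plain lists, known pairs are stored in a dict mapping `(u, v)` to a 0/1 label, and the function names are hypothetical. The tree-based learners themselves are omitted, and symmetry handling for homogeneous graphs in the global set is left out for brevity.

```python
def global_learning_set(features, known_pairs):
    """Global approach: a single dataset over pairs. Each input is the
    concatenation of the two nodes' feature vectors."""
    items = sorted(known_pairs.items())
    X = [features[u] + features[v] for (u, v), _ in items]
    y = [label for _, label in items]
    return X, y

def local_learning_sets(features, known_pairs):
    """Local approach: one dataset per node. For node a, each input is the
    feature vector of a partner b and the output says whether (a, b) interacts."""
    sets = {}
    for (u, v), label in sorted(known_pairs.items()):
        for a, b in ((u, v), (v, u)):
            X, y = sets.setdefault(a, ([], []))
            X.append(features[b])
            y.append(label)
    return sets
```

A single classifier would then be fit on the output of `global_learning_set`, whereas the local approach fits one classifier per key of `local_learning_sets`.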

    A Tensor Approach to Learning Mixed Membership Community Models

    Community detection is the task of detecting hidden communities from observed interactions. Guaranteed community detection has so far been mostly limited to models with non-overlapping communities such as the stochastic block model. In this paper, we remove this restriction, and provide guaranteed community detection for a family of probabilistic network models with overlapping communities, termed the mixed membership Dirichlet model, first introduced by Airoldi et al. This model allows for nodes to have fractional memberships in multiple communities and assumes that the community memberships are drawn from a Dirichlet distribution. Moreover, it contains the stochastic block model as a special case. We propose a unified approach to learning these models via a tensor spectral decomposition method. Our estimator is based on a low-order moment tensor of the observed network, consisting of 3-star counts. Our learning method is fast and is based on simple linear algebraic operations, e.g. singular value decomposition and tensor power iterations. We provide guaranteed recovery of community memberships and model parameters and present a careful finite sample analysis of our learning method. As an important special case, our results match the best known scaling requirements for the (homogeneous) stochastic block model.
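
The tensor power iteration mentioned in this abstract can be sketched on a toy input. The code below is not the paper's estimator: it skips the 3-star moment computation and the SVD-based whitening, and instead assumes a synthetic symmetric tensor $T = \sum_k w_k\, v_k \otimes v_k \otimes v_k$ with orthonormal $v_k$ and positive $w_k$. All function names are hypothetical.

```python
import math

def build_tensor(weights, vectors):
    """Synthetic input: T = sum_k w_k * (v_k outer v_k outer v_k)."""
    d = len(vectors[0])
    T = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for w, v in zip(weights, vectors):
        for a in range(d):
            for b in range(d):
                for c in range(d):
                    T[a][b][c] += w * v[a] * v[b] * v[c]
    return T

def apply_tensor(T, u):
    """Return the vector T(I, u, u): entry a is sum_{b,c} T[a][b][c] u[b] u[c]."""
    d = len(u)
    return [sum(T[a][b][c] * u[b] * u[c] for b in range(d) for c in range(d))
            for a in range(d)]

def normalize(u):
    norm = math.sqrt(sum(t * t for t in u))
    return [t / norm for t in u]

def power_iteration(T, u0, iters=50):
    """Repeat u <- T(I, u, u) / ||T(I, u, u)||, then read off the
    eigenvalue estimate lam = T(u, u, u)."""
    u = normalize(u0)
    for _ in range(iters):
        u = normalize(apply_tensor(T, u))
    lam = sum(a * b for a, b in zip(apply_tensor(T, u), u))
    return lam, u
```

Under these assumptions the iteration converges to the component $v_k$ that maximizes $w_k \lvert\langle u_0, v_k\rangle\rvert$ for the chosen start $u_0$; in practice several random restarts and deflation would be needed to extract all components.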

    The random graph

    Erdős and Rényi showed the paradoxical result that there is a unique (and highly symmetric) countably infinite random graph. This graph, and its automorphism group, form the subject of the present survey. Comment: Revised chapter for new edition of book "The Mathematics of Paul Erdős"