
    On the construction of sparse matrices from expander graphs

    We revisit the asymptotic analysis of the probabilistic construction of adjacency matrices of expander graphs proposed in [4]. With better bounds, we derive a new, reduced sample complexity for the number of nonzeros per column of these matrices, precisely $d = \mathcal{O}\left(\log_s(N/s)\right)$, as opposed to the standard $d = \mathcal{O}\left(\log(N/s)\right)$. This gives insight into why small $d$ performed well in numerical experiments involving such matrices. Furthermore, we derive quantitative sampling theorems for our constructions, which show our construction outperforming the existing state of the art. We also use our results to compare the performance of sparse recovery algorithms where these matrices are used for linear sketching.
    Comment: 28 pages, 4 figures
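A minimal sketch of what such a matrix can look like, assuming the common construction in which each of the N columns receives d ones in rows drawn uniformly at random without replacement. The row count `n`, the constant `c`, and both function names are illustrative assumptions and do not come from the abstract; the abstract does not spell out the sampling scheme.

```python
import math
import random

def column_degree(N, s, c=1.0):
    """Hypothetical helper: d = ceil(c * log_s(N/s)) with an assumed constant c."""
    return max(1, math.ceil(c * math.log(N / s, s)))

def sparse_binary_matrix(n, N, d, seed=0):
    """Return, for each of N columns, the sorted list of d distinct rows holding a 1."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n), d)) for _ in range(N)]

# Assumed example sizes: N columns, sparsity s, n rows.
N, s, n = 5000, 20, 400
d_new = column_degree(N, s)                  # logarithm to base s
d_std = max(1, math.ceil(math.log(N / s)))   # natural logarithm, same constant
cols = sparse_binary_matrix(n, N, d_new)
print(d_new, d_std, cols[0])
```

With these assumed sizes the base-s rule asks for fewer nonzeros per column than the natural-log rule, which is the direction of the comparison stated in the abstract; the constants hidden in the O-notation are not modelled here.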

    Probabilistic Inductive Classes of Graphs

    Models of complex networks are generally defined as graph stochastic processes in which edges and vertices are added or deleted over time to simulate the evolution of networks. Here, we define a unifying framework, probabilistic inductive classes of graphs, for formalizing and studying the evolution of complex networks. Our definition of a probabilistic inductive class of graphs (PICG) extends the standard notion of an inductive class of graphs (ICG) by imposing a probability space. A PICG is given by: (1) a class B of initial graphs, the basis of the PICG, (2) a class R of generating rules, each with a distinguished left element to which the rule is applied to obtain the right element, (3) a probability distribution specifying how the initial graph is chosen from class B, (4) a probability distribution specifying how the rules from class R are applied, and, finally, (5) a probability distribution specifying how the left elements for every rule in class R are chosen. We point out that many of the existing models of growing networks can be cast as PICGs. We present how the well-known model of growing networks, the preferential attachment model, can be studied as a PICG. As an illustration, we present results regarding the size, order, and degree sequence for PICG models of connected and 2-connected graphs.
    Comment: 15 pages, 6 figures
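The five ingredients listed in the abstract can be sketched for a simple preferential attachment process. This is an illustration under assumptions, not the paper's formalization: the basis is assumed to hold a single one-edge graph, the rule class a single "attach a new vertex" rule, and the function name `grow_picg` is hypothetical.

```python
import random

def grow_picg(steps, seed=0):
    """Grow a tree by degree-proportional attachment; returns the edge list."""
    rng = random.Random(seed)
    # (1) and (3): the assumed basis B holds one graph, a single edge,
    # so the choice of initial graph is trivial.
    edges = [(0, 1)]
    for _ in range(steps):
        # (2) and (4): the assumed class R holds one rule, applied at every step:
        # attach a new vertex to the chosen left element.
        # (5): the left element is an existing vertex chosen with probability
        # proportional to its degree; a uniformly random endpoint of a
        # uniformly random edge has exactly that distribution.
        u = rng.choice(rng.choice(edges))
        v = len(edges) + 1  # label of the new vertex
        edges.append((u, v))
    return edges

edges = grow_picg(steps=50)
vertices = {x for e in edges for x in e}
degree = {x: 0 for x in vertices}
for a, b in edges:
    degree[a] += 1
    degree[b] += 1
print(len(vertices), len(edges), sorted(degree.values(), reverse=True)[:5])
```

The printed values are the order, the size, and the top of the degree sequence, the three quantities the abstract names for its PICG models.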

    Query-Driven Sampling for Collective Entity Resolution

    Probabilistic databases play a preeminent role in the processing and management of uncertain data. Recently, many database research efforts have integrated probabilistic models into databases to support tasks such as information extraction and labeling. Many of these efforts are based on batch-oriented inference, which inhibits a real-time workflow. One important task is entity resolution (ER). ER is the process of determining records (mentions) in a database that correspond to the same real-world entity. Traditional pairwise ER methods can lead to inconsistencies and low accuracy due to localized decisions. Leading ER systems solve this problem by collectively resolving all records using a probabilistic graphical model and Markov chain Monte Carlo (MCMC) inference. However, for large datasets this is an extremely expensive process. One key observation is that such an exhaustive ER process incurs a huge up-front cost, which is wasteful in practice because most users are interested in only a small subset of entities. In this paper, we advocate pay-as-you-go entity resolution by developing a number of query-driven collective ER techniques. We introduce two classes of SQL queries that involve ER operators: selection-driven ER and join-driven ER. We implement novel variations of the MCMC Metropolis-Hastings algorithm to generate biased samples, and selectivity-based scheduling algorithms to support the two classes of ER queries. Finally, we show that query-driven ER algorithms can converge and return results within minutes over a database populated with the extraction from a newswire dataset containing 71 million mentions.

    Critical random graphs: limiting constructions and distributional properties

    We consider the Erdős-Rényi random graph $G(n,p)$ inside the critical window, where $p = 1/n + \lambda n^{-4/3}$ for some $\lambda \in \mathbb{R}$. We proved in a previous paper (arXiv:0903.4730) that considering the connected components of $G(n,p)$ as a sequence of metric spaces with the graph distance rescaled by $n^{-1/3}$ and letting $n$ go to infinity yields a non-trivial sequence of limit metric spaces $C = (C_1, C_2, \ldots)$. These limit metric spaces can be constructed from certain random real trees with vertex-identifications. For a single such metric space, we give here two equivalent constructions, both of which are in terms of more standard probabilistic objects. The first is a global construction using Dirichlet random variables and Aldous' Brownian continuum random tree. The second is a recursive construction from an inhomogeneous Poisson point process on $\mathbb{R}_+$. These constructions allow us to characterize the distributions of the masses and lengths in the constituent parts of a limit component when it is decomposed according to its cycle structure. In particular, this strengthens results of Łuczak, Pittel and Wierman by providing precise distributional convergence for the lengths of paths between kernel vertices and the length of a shortest cycle, within any fixed limit component.
    Comment: 30 pages, 4 figures
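A small simulation sketch of the finite-n object described above: it samples G(n,p) with p taken from the critical-window formula in the abstract and lists the component sizes in decreasing order. The function names, the union-find bookkeeping, and the values of n and lambda are assumptions chosen for illustration; the limit constructions of the paper are not implemented.

```python
import random

def critical_p(n, lam):
    """Edge probability inside the critical window: 1/n + lam * n^(-4/3)."""
    return 1.0 / n + lam * n ** (-4.0 / 3.0)

def component_sizes(n, p, seed=0):
    """Sample G(n, p) and return its component sizes, largest first."""
    rng = random.Random(seed)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:  # each pair is an edge independently
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv

    counts = {}
    for x in range(n):
        r = find(x)
        counts[r] = counts.get(r, 0) + 1
    return sorted(counts.values(), reverse=True)

n, lam = 600, 1.0  # assumed example values
p = critical_p(n, lam)
sizes = component_sizes(n, p)
print(p, sizes[:5])
```

The quadratic loop over vertex pairs keeps the sketch dependency-free; for much larger n a sparse sampling scheme would be preferable.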

    Critical random graphs: limiting constructions and distributional properties

    We consider the Erdős-Rényi random graph $G(n, p)$ inside the critical window, where $p = 1/n + \lambda n^{-4/3}$ for some $\lambda \in \mathbb{R}$. We proved in Addario-Berry et al. [2009+] that considering the connected components of $G(n, p)$ as a sequence of metric spaces with the graph distance rescaled by $n^{-1/3}$ and letting $n \to \infty$ yields a non-trivial sequence of limit metric spaces $C = (C_1, C_2, \ldots)$. These limit metric spaces can be constructed from certain random real trees with vertex-identifications. For a single such metric space, we give here two equivalent constructions, both of which are in terms of more standard probabilistic objects. The first is a global construction using Dirichlet random variables and Aldous' Brownian continuum random tree. The second is a recursive construction from an inhomogeneous Poisson point process on $\mathbb{R}_+$. These constructions allow us to characterize the distributions of the masses and lengths in the constituent parts of a limit component when it is decomposed according to its cycle structure. In particular, this strengthens results of Łuczak et al. [1994] by providing precise distributional convergence for the lengths of paths between kernel vertices and the length of a shortest cycle, within any fixed limit component.

    Random enriched trees with applications to random graphs

    We establish limit theorems that describe the asymptotic local and global geometric behaviour of random enriched trees considered up to symmetry. We apply these general results to random unlabelled weighted rooted graphs and uniform random unlabelled $k$-trees that are rooted at a $k$-clique of distinguishable vertices. For both models we establish a Gromov-Hausdorff scaling limit, a Benjamini-Schramm limit, and a local weak limit that describes the asymptotic shape near the fixed root.