
    Smoothing graphons for modelling exchangeable relational data

    Exchangeable relational data can be modelled appropriately within graphon theory. Most Bayesian methods for modelling exchangeable relational data can be placed in this framework, as they exploit different forms of graphons. However, the graphons adopted by existing Bayesian methods are either piecewise-constant functions, which are insufficiently flexible to model relational data accurately, or complicated continuous functions, which incur heavy computational costs for inference. In this work, we overcome both shortcomings by smoothing piecewise-constant graphons, which permits continuous intensity values for describing relations without impractically increasing computational costs. In particular, we focus on the Bayesian Stochastic Block Model (SBM) and demonstrate how to adapt its piecewise-constant graphon to a smoothed version. We first propose the Integrated Smoothing Graphon (ISG), which introduces one smoothing parameter to the SBM graphon to generate continuous relational intensity values. We then develop the Latent Feature Smoothing Graphon (LFSG), which improves on the ISG by introducing auxiliary hidden labels that decompose the calculation of the ISG intensity and enable efficient inference. Experimental results on real-world data sets validate the advantages of applying smoothing strategies to the Stochastic Block Model, demonstrating that smoothed graphons can greatly improve AUC and precision for link prediction without increasing computational complexity.
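    A piecewise-constant SBM graphon assigns every pair of latent coordinates the interaction probability of their two blocks, so the intensity surface has hard jumps at block boundaries. The sketch below contrasts that hard graphon with a toy softened version in which block membership becomes a soft weighting controlled by a single smoothing parameter tau; this is only an illustration of the smoothing idea, not the authors' ISG or LFSG construction, and all names are hypothetical.

    ```python
    import numpy as np

    def sbm_graphon(x, y, breaks, B):
        """Piecewise-constant SBM graphon: look up the block of each coordinate."""
        i = np.searchsorted(breaks, x, side="right")
        j = np.searchsorted(breaks, y, side="right")
        return B[i, j]

    def smoothed_graphon(x, y, centers, B, tau=0.1):
        """Toy smoothed graphon: soft block weights decay with distance from
        each block's centre, yielding a continuous intensity surface."""
        wx = np.exp(-np.abs(x - centers) / tau)
        wy = np.exp(-np.abs(y - centers) / tau)
        wx, wy = wx / wx.sum(), wy / wy.sum()
        return float(wx @ B @ wy)  # blend of block intensities

    # Two blocks on [0, 1] split at 0.5, with intra/inter connection probabilities.
    B = np.array([[0.8, 0.1],
                  [0.1, 0.6]])
    print(sbm_graphon(0.49, 0.51, breaks=np.array([0.5]), B=B))      # hard jump: 0.1
    print(smoothed_graphon(0.49, 0.51, centers=np.array([0.25, 0.75]), B=B))
    ```

    Near the block boundary the hard graphon jumps between 0.8 and 0.1, while the softened version returns an intermediate intensity that varies continuously with the coordinates.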

    The Graph Pencil Method: Mapping Subgraph Densities to Stochastic Block Models

    In this work, we describe a method that determines an exact map from a finite set of subgraph densities to the parameters of a stochastic block model (SBM) matching these densities. Given a number K of blocks, the subgraph densities of a finite number of stars and bistars uniquely determine a single element of the class of all degree-separated stochastic block models with K blocks. Our method makes it possible to translate estimates of these subgraph densities into model parameters, and hence to use subgraph densities directly for inference. The computational overhead is negligible: computing the translation map is polynomial in K but independent of the graph size once the subgraph densities are given.
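    For intuition, the forward direction of this map has a simple closed form. By standard graphon theory, the homomorphism density of a k-star in an SBM with block proportions pi and connectivity matrix B is t(S_k) = sum_i pi_i * d_i**k, where d_i = sum_j B_ij * pi_j is the degree function. The sketch below computes this parameters-to-densities map (the direction the graph pencil method inverts); function names are illustrative, not the authors' code.

    ```python
    import numpy as np

    def star_densities(pi, B, k_max):
        """Homomorphism densities t(S_1), ..., t(S_k_max) of k-stars for an SBM
        with block proportions pi and connectivity matrix B, using the standard
        identity t(S_k) = sum_i pi_i * d_i**k with degree function d = B @ pi."""
        d = B @ pi                                  # expected degree of each block
        return [float(pi @ d**k) for k in range(1, k_max + 1)]

    pi = np.array([0.5, 0.5])
    B = np.array([[0.8, 0.1],
                  [0.1, 0.6]])
    print(star_densities(pi, B, k_max=3))           # t(S_1) is the edge density, 0.4
    ```

    Loosely, matching star densities is then a moment problem in the degrees d_i with weights pi_i, which is the sense in which a matrix-pencil (generalised eigenvalue) computation can recover the parameters, with bistar densities supplying the extra information needed to separate B from the degree function.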

    Statistical Network Analysis: Beyond Block Models

    Network data represent connections between units of analysis and lead to many interesting research questions with diverse applications. In this thesis, we focus on inferring the structure underlying an observed network, which can be thought of as a noisy random realization of the unobserved true structure. Different applications focus on different types of underlying structure; one question of broad interest is finding a community structure, with communities typically defined as groups of nodes that share similar connectivity patterns. One common and widely used model for describing a community structure in a network is the stochastic block model. This model has attracted a lot of attention because of its tractable theoretical properties, but it is also well known to oversimplify the structure observed in real-world networks and often does not fit the data well. Thus there has been a recent push to expand the stochastic block model in various ways to bring it closer to what we observe in the real world, and this thesis makes several contributions to this effort.
    PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/133476/1/yzhanghf_1.pd
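    The stochastic block model referenced throughout assigns each node to a community and connects each pair of nodes independently with a probability determined only by their community pair. A minimal sampler, with illustrative parameter names, shows the generative process:

    ```python
    import numpy as np

    def sample_sbm(n, pi, B, seed=None):
        """Draw an undirected, loop-free network from a stochastic block model:
        assign nodes to communities with proportions pi, then connect each pair
        independently with probability B[community_i, community_j]."""
        rng = np.random.default_rng(seed)
        z = rng.choice(len(pi), size=n, p=pi)        # community labels
        P = B[np.ix_(z, z)]                          # pairwise edge probabilities
        A = np.triu((rng.random((n, n)) < P).astype(int), 1)
        return A + A.T, z

    B = np.array([[0.30, 0.05],
                  [0.05, 0.25]])
    A, z = sample_sbm(200, pi=[0.6, 0.4], B=B, seed=1)
    ```

    The oversimplification the thesis addresses is visible here: every node in a community is statistically identical, which real networks rarely satisfy.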

    Differential Privacy, Property Testing, and Perturbations

    Controlling the dissemination of information about ourselves has become a minefield in the modern age. We release data about ourselves every day and don't always fully understand what information it contains. Seemingly innocuous pieces of data can often be combined to reveal more sensitive information about ourselves than we intended. Differential privacy has developed as a technique to prevent this type of privacy leakage. It borrows ideas from information theory to inject enough uncertainty into the data that sensitive information is provably absent from the privatised data. Current research in differential privacy walks the fine line between removing sensitive information while allowing non-sensitive information to be released. At its heart, this thesis is about the study of information. Many of the results can be formulated as asking one or both of the following questions: does the data you have contain enough information to learn what you would like to learn, and how can I alter the data to ensure you cannot discern sensitive information? We often approach the former question from both directions: information-theoretic lower bounds on recovery and algorithmic upper bounds. We begin with an information-theoretic lower bound for graphon estimation. This explores the fundamental limits of how much information about the underlying population is contained in a finite sample of data. We then move on to exploring the connection between information-theoretic results and privacy in the context of linear inverse problems. We find that there is a discrepancy between how the inverse problems community and the privacy community view good recovery of information. Next, we explore black-box testing for privacy. We argue that the amount of information required to verify the privacy guarantee of an algorithm, without access to the internals of the algorithm, is lower bounded by the amount of information required to break the privacy guarantee. Finally, we explore a setting where imposing privacy is a help rather than a hindrance: online linear optimisation. We argue that private algorithms have the right kind of stability guarantee to ensure low regret for online linear optimisation.
    PhD, Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/143940/1/amcm_1.pd
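    The noise-injection idea described above is the standard route to differential privacy: perturb a query answer with noise calibrated to the query's sensitivity so that no single record's influence is discernible. A minimal sketch of the classical Laplace mechanism (a textbook construction, not one of the thesis's own results):

    ```python
    import numpy as np

    def laplace_mechanism(true_answer, sensitivity, epsilon, seed=None):
        """Release true_answer plus Laplace(sensitivity / epsilon) noise.
        This satisfies epsilon-differential privacy when `sensitivity` bounds
        how much any single individual's record can change the answer."""
        rng = np.random.default_rng(seed)
        return true_answer + rng.laplace(scale=sensitivity / epsilon)

    # A counting query changes by at most 1 when one record changes: sensitivity 1.
    private_count = laplace_mechanism(true_answer=1234, sensitivity=1.0, epsilon=0.5)
    print(private_count)
    ```

    Smaller epsilon means more noise and stronger privacy, which is exactly the fine line between removing sensitive information and preserving useful information that the thesis studies.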

    Statistical and computational rates in high rank tensor estimation

    Higher-order tensor datasets arise commonly in recommendation systems, neuroimaging, and social networks. Here we develop provable methods for estimating a possibly high rank signal tensor from noisy observations. We consider a generative latent variable tensor model that incorporates both high rank and low rank models, including, but not limited to, simple hypergraphon models, single index models, low-rank CP models, and low-rank Tucker models. Comprehensive results are developed on both the statistical and computational limits of signal tensor estimation. We find that high-dimensional latent variable tensors are of log-rank, a fact that explains the pervasiveness of low-rank tensors in applications. Furthermore, we propose a polynomial-time spectral algorithm that achieves the computationally optimal rate. We show that the statistical-computational gap emerges only for latent variable tensors of order 3 or higher. Numerical experiments and two real data applications are presented to demonstrate the practical merits of our methods.
    Comment: 38 pages, 8 figures
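    Spectral estimators of this kind typically unfold the noisy tensor into a matrix, truncate its SVD, and fold back. The sketch below illustrates that generic recipe only; it is not the paper's specific algorithm, and the rank choice is illustrative.

    ```python
    import numpy as np

    def spectral_denoise(Y, rank):
        """Denoise an order-3 tensor via truncated SVD of its mode-1 unfolding;
        the log-rank phenomenon suggests a small `rank` can already capture
        most of a latent variable tensor's signal."""
        d1, d2, d3 = Y.shape
        M = Y.reshape(d1, d2 * d3)                    # mode-1 unfolding
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        M_hat = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # keep top singular directions
        return M_hat.reshape(d1, d2, d3)

    rng = np.random.default_rng(0)
    signal = np.einsum('i,j,k->ijk', rng.random(20), rng.random(20), rng.random(20))
    Y = signal + 0.1 * rng.standard_normal(signal.shape)
    X_hat = spectral_denoise(Y, rank=1)
    ```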

    Advancements in latent space network modelling

    The ubiquity of relational data has motivated an extensive literature on network analysis, and over the last two decades the latent space approach has become a popular network modelling framework. In this approach, the nodes of a network are represented in a low-dimensional latent space and the probability of an interaction occurring is modelled as a function of the associated latent coordinates. This thesis focuses on computational and modelling aspects of the latent space approach, and we present two main contributions. First, we consider estimation of temporally evolving latent space networks in which interactions among a fixed population are observed through time. The latent coordinates of each node evolve over time, and this presents a natural setting for the application of sequential Monte Carlo (SMC) methods. This facilitates online inference, which allows estimation for dynamic networks in which the number of observations in time is large. Since the performance of SMC methods degrades as the dimension of the latent state space increases, we draw on the high-dimensional SMC literature to allow estimation of networks with a larger number of nodes. Second, we develop a latent space model for network data in which interactions occur between sets of nodes; as a motivating example, we consider a coauthorship network in which it is typical for more than two authors to contribute to an article. This type of data can be represented as a hypergraph, and we extend the latent space framework to this setting. Modelling the nodes in a latent space provides a convenient visualisation of the data and allows properties to be imposed on the hypergraph relationships. We develop a parsimonious model with a computationally convenient likelihood. Furthermore, we theoretically characterise the degree distribution of our model and further explore its properties via simulation.
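    The core of any such model is the link function from latent coordinates to tie probabilities. A minimal sketch in the spirit of distance-style latent space models (the exact form used in the thesis is not specified here, and alpha is an illustrative intercept):

    ```python
    import numpy as np

    def link_probability(u, v, alpha=1.0):
        """Distance-style latent space model: the log-odds of an edge fall off
        with the Euclidean distance between the nodes' latent coordinates,
        logit P(edge) = alpha - ||u - v||."""
        eta = alpha - np.linalg.norm(np.asarray(u) - np.asarray(v))
        return 1.0 / (1.0 + np.exp(-eta))

    print(link_probability([0.1, 0.2], [0.5, -0.3]))   # nearby pairs score higher
    ```

    In the dynamic setting described above, the coordinates u and v become time-indexed states, which is what makes sequential Monte Carlo a natural inference tool.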