43 research outputs found

    Bayesian stochastic blockmodeling

    This chapter provides a self-contained introduction to the use of Bayesian inference to extract large-scale modular structures from network data, based on the stochastic blockmodel (SBM), as well as its degree-corrected and overlapping generalizations. We focus on nonparametric formulations that allow their inference in a manner that prevents overfitting and enables model selection. We discuss aspects of the choice of priors, in particular how to avoid underfitting via increased Bayesian hierarchies, and we contrast the task of sampling network partitions from the posterior distribution with finding the single point estimate that maximizes it, while describing efficient algorithms to perform either one. We also show how inferring the SBM can be used to predict missing and spurious links, and shed light on the fundamental limitations of the detectability of modular structures in networks. Comment: 44 pages, 16 figures. Code is freely available as part of graph-tool at https://graph-tool.skewed.de . See also the HOWTO at https://graph-tool.skewed.de/static/doc/demos/inference/inference.htm

    Anomalous Edge Detection in Edge Exchangeable Social Network Models

    This paper studies the detection of anomalous edges in directed graphs that model social networks. We exploit edge exchangeability as a criterion for distinguishing anomalous edges from normal edges. We then present an anomaly detector based on conformal prediction theory; this detector has a guaranteed upper bound on its false positive rate. In numerical experiments, we show that the proposed algorithm achieves superior performance to baseline methods.
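The false-positive guarantee mentioned in this abstract is the standard property of conformal p-values under exchangeability. The following is a minimal sketch of that generic idea, not the paper's detector: the function names, the nonconformity scores, and the threshold `alpha` are all hypothetical, and how scores are computed for edges is left unspecified.

```python
def conformal_p_value(calibration_scores, test_score):
    """Conformal p-value of a test score against calibration scores.

    Scores are nonconformity scores (higher = more unusual). If the test
    score is exchangeable with the calibration scores, then
    P(p_value <= alpha) <= alpha for every alpha in (0, 1).
    """
    n = len(calibration_scores)
    # Count calibration scores at least as extreme as the test score.
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    # The +1 terms count the test point itself.
    return (at_least + 1) / (n + 1)


def is_anomalous(calibration_scores, test_score, alpha=0.1):
    """Flag the test point when its conformal p-value is at most alpha."""
    return conformal_p_value(calibration_scores, test_score) <= alpha
```

With `n` calibration scores the smallest attainable p-value is `1 / (n + 1)`, so a threshold `alpha` below that value can never flag anything; the calibration set has to be large enough for the chosen `alpha`.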

    Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks

    Empirical evidence suggests that heavy-tailed degree distributions occurring in many real networks are well-approximated by power laws with exponents $\eta$ that may take values either less than or greater than two. Models based on various forms of exchangeability are able to capture power laws with $\eta < 2$, and admit tractable inference algorithms; we draw on previous results to show that $\eta > 2$ cannot be generated by the forms of exchangeability used in existing random graph models. Preferential attachment models generate power law exponents greater than two, but have been of limited use as statistical models due to the inherent difficulty of performing inference in non-exchangeable models. Motivated by this gap, we design and implement inference algorithms for a recently proposed class of models that generates $\eta$ of all possible values. We show that although they are not exchangeable, these models have probabilistic structure amenable to inference. Our methods make a large class of previously intractable models useful for statistical inference. Comment: Accepted for publication in the proceedings of Conference on Uncertainty in Artificial Intelligence (UAI) 201

    Loan maturity aggregation in interbank lending networks obscures mesoscale structure and economic functions

    Since the 2007-2009 financial crisis, substantial academic effort has been dedicated to improving our understanding of interbank lending networks (ILNs). Because of data limitations or by choice, the literature largely lacks multiple loan maturities. We employ a complete interbank loan contract dataset to investigate whether maturity details are informative of the network structure. Applying the layered stochastic block model of Peixoto (2015) and other tools from network science to a time series of bilateral loans with multiple maturity layers in the Russian ILN, we find that collapsing all such layers consistently obscures mesoscale structure. The optimal maturity granularity lies between completely collapsing and completely separating the maturity layers and depends on the development phase of the interbank market, with a more developed market requiring more layers for optimal description. Closer inspection of the inferred maturity bins associated with the optimal maturity granularity reveals specific economic functions, from liquidity intermediation to financing. Collapsing a network with multiple underlying maturity layers or extracting one such layer, common in economic research, is therefore not only an incomplete representation of the ILN's mesoscale structure, but also conceals existing economic functions. This holds important insights and opportunities for theoretical and empirical studies on interbank market functioning, contagion, stability, and on the desirable level of regulatory data disclosure.

    Modelling Populations of Interaction Networks via Distance Metrics

    Network data arises through observation of relational information between a collection of entities. Recent work in the literature has independently considered when (i) one observes a sample of networks, connectome data in neuroscience being a ubiquitous example, and (ii) the units of observation within a network are edges or paths, such as emails between people or a series of page visits to a website by a user, often referred to as interaction network data. The intersection of these two cases, however, is yet to be considered. In this paper, we propose a new Bayesian modelling framework to analyse such data. Given a practitioner-specified distance metric between observations, we define families of models through location and scale parameters, akin to a Gaussian distribution, with subsequent inference of model parameters providing reasoned statistical summaries for this non-standard data structure. To facilitate inference, we propose specialised Markov chain Monte Carlo (MCMC) schemes capable of sampling from doubly-intractable posterior distributions over discrete and multi-dimensional parameter spaces. Through simulation studies we confirm the efficacy of our methodology and inference scheme, and we illustrate its application via an example analysis of a location-based social network (LBSN) data set.