Sampling Triples from Restricted Networks Using MCMC Strategy

By Mahmudur Rahman and Mohammad Al Hasan

Abstract

In large networks, connected triples are useful for solving various tasks, including link prediction, community detection, and spam filtering. Existing works in this direction are mostly concerned with the exact or approximate counting of connected triples that are closed (i.e., triangles). The task of triple sampling, by contrast, has not been explored in depth, although sampling is a more fundamental task than counting and is useful for solving various other tasks, including counting. In recent years, some triple sampling methods based on direct sampling have been proposed, solely for the purpose of triangle count approximation. They sample only from a uniform distribution and are not effective for sampling triples from an arbitrary user-defined distribution. In this work we present two indirect triple sampling methods that are based on the Markov chain Monte Carlo (MCMC) sampling strategy. Both methods are highly efficient compared to a direct sampling-based method, specifically for the task of sampling from a non-uniform probability distribution. Another significant advantage of the proposed methods is that they can sample triples from networks that have restricted access, on which a direct sampling-based method is simply not applicable.
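To make the general idea concrete, here is a minimal Python sketch of a Metropolis-Hastings random walk over connected triples. It is not taken from the paper and should not be read as either of the authors' two methods. All names (`neighbors`, `adjacent_states`, `mh_triple_sampler`, `weight`) are hypothetical. The sketch assumes the network can only be queried through a `neighbors(v)` call that returns a vertex's neighbor set (a stand-in for restricted access), that two triples are adjacent when they share two vertices, that the resulting state space is connected, and that `weight` is a positive, unnormalized user-defined target distribution.

```python
import random


def is_connected_triple(neighbors, triple):
    """True if the three distinct vertices induce a connected subgraph."""
    if len(set(triple)) != 3:
        return False
    a, b, c = triple
    edges = (b in neighbors(a)) + (c in neighbors(b)) + (c in neighbors(a))
    return edges >= 2  # 2 edges: open triple, 3 edges: triangle


def adjacent_states(neighbors, triple):
    """Connected triples obtained by swapping one vertex of `triple`."""
    states = []
    for x in triple:
        p, q = triple - {x}  # the two vertices that are kept
        if q in neighbors(p):
            # p-q is an edge, so the new vertex may attach to either one.
            candidates = neighbors(p) | neighbors(q)
        else:
            # p-q is not an edge, so the new vertex must join both.
            candidates = neighbors(p) & neighbors(q)
        for y in candidates - triple:
            states.append(frozenset((p, q, y)))
    return states


def mh_triple_sampler(neighbors, start, weight, steps, rng=None):
    """Run a Metropolis-Hastings walk whose target is proportional to weight."""
    rng = rng or random.Random()
    current = frozenset(start)
    current_adj = adjacent_states(neighbors, current)
    samples = []
    for _ in range(steps):
        if current_adj:  # an isolated state simply stays where it is
            proposal = rng.choice(current_adj)
            proposal_adj = adjacent_states(neighbors, proposal)
            # Correct for the uniform proposal over differently sized
            # neighborhoods: q(S -> S') = 1 / len(current_adj).
            ratio = (weight(proposal) * len(current_adj)) / (
                weight(current) * len(proposal_adj)
            )
            if rng.random() < min(1.0, ratio):
                current, current_adj = proposal, proposal_adj
        samples.append(current)
    return samples
```

Called as `mh_triple_sampler(adj.__getitem__, (0, 1, 2), weight, 1000)` on a dictionary `adj` of neighbor sets, the walk only ever looks up neighbors of vertices it has already reached, which is why a scheme of this kind can run against a crawl-style interface. Early samples depend on the starting triple, so in practice an initial portion of the chain would be discarded.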

Topics: approximate triangle counting, markov chain monte carlo sampling, triple sampling
Publisher: ACM
Year: 2014
DOI identifier: 10.1145/2661829.2662075
OAI identifier: oai:scholarworks.iupui.edu:1805/7786
Provided by: IUPUIScholarWorks
