15 research outputs found

    Injecting Uncertainty in Graphs for Identity Obfuscation

    Full text link
    Data collected nowadays by social-networking applications create fascinating opportunities for building novel services, as well as expanding our understanding about social structures and their dynamics. Unfortunately, publishing social-network graphs is considered an ill-advised practice due to privacy concerns. To alleviate this problem, several anonymization methods have been proposed, aiming at reducing the risk of a privacy breach on the published data, while still allowing to analyze them and draw relevant conclusions. In this paper we introduce a new anonymization approach that is based on injecting uncertainty in social graphs and publishing the resulting uncertain graphs. While existing approaches obfuscate graph data by adding or removing edges entirely, we propose using a finer-grained perturbation that adds or removes edges partially: this way we can achieve the same desired level of obfuscation with smaller changes in the data, thus maintaining higher utility. Our experiments on real-world networks confirm that at the same level of identity obfuscation our method provides higher usefulness than existing randomized methods that publish standard graphs.Comment: VLDB201

    Mining Maximal Cliques from an Uncertain Graph

    Get PDF
    We consider mining dense substructures (maximal cliques) from an uncertain graph, which is a probability distribution on a set of deterministic graphs. For parameter 0 < {\alpha} < 1, we present a precise definition of an {\alpha}-maximal clique in an uncertain graph. We present matching upper and lower bounds on the number of {\alpha}-maximal cliques possible within an uncertain graph. We present an algorithm to enumerate {\alpha}-maximal cliques in an uncertain graph whose worst-case runtime is near-optimal, and an experimental evaluation showing the practical utility of the algorithm.Comment: ICDE 201

    Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases

    Full text link
    Many studies have been conducted on seeking the efficient solution for subgraph similarity search over certain (deterministic) graphs due to its wide application in many fields, including bioinformatics, social network analysis, and Resource Description Framework (RDF) data management. All these works assume that the underlying data are certain. However, in reality, graphs are often noisy and uncertain due to various factors, such as errors in data extraction, inconsistencies in data integration, and privacy preserving purposes. Therefore, in this paper, we study subgraph similarity search on large probabilistic graph databases. Different from previous works assuming that edges in an uncertain graph are independent of each other, we study the uncertain graphs where edges' occurrences are correlated. We formally prove that subgraph similarity search over probabilistic graphs is #P-complete, thus, we employ a filter-and-verify framework to speed up the search. In the filtering phase,we develop tight lower and upper bounds of subgraph similarity probability based on a probabilistic matrix index, PMI. PMI is composed of discriminative subgraph features associated with tight lower and upper bounds of subgraph isomorphism probability. Based on PMI, we can sort out a large number of probabilistic graphs and maximize the pruning capability. During the verification phase, we develop an efficient sampling algorithm to validate the remaining candidates. The efficiency of our proposed solutions has been verified through extensive experiments.Comment: VLDB201

    Discovering frequent subgraphs over uncertain graph databases under probabilistic semantics

    No full text

    Enumeration of Maximal Cliques from an Uncertain Graph

    Get PDF
    We consider the enumeration of dense substructures (maximal cliques) from an uncertain graph. For parameter 0 ;a ;1, we define the notion of an a-maximal clique in an uncertain graph. We present matching upper and lower bounds on the number of a-maximal cliques possible within a (uncertain) graph. We present an algorithm to enumerate a-maximal cliques whose worst-case runtime is near-optimal, and an experimental evaluation showing the practical utility of the algorithm

    An Indexing Framework for Queries on Probabilistic Graphs

    Get PDF
    postprin
    corecore