148,831 research outputs found
Anonymizing Social Graphs via Uncertainty Semantics
Rather than anonymizing social graphs by generalizing them to super
nodes/edges or adding/removing nodes and edges to satisfy given privacy
parameters, recent methods exploit the semantics of uncertain graphs to achieve
privacy protection of participating entities and their relationship. These
techniques anonymize a deterministic graph by converting it into an uncertain
form. In this paper, we propose a generalized obfuscation model based on
uncertain adjacency matrices that keep expected node degrees equal to those in
the unanonymized graph. We analyze two recently proposed schemes and show their
fitting into the model. We also point out disadvantages in each method and
present several elegant techniques to fill the gap between them. Finally, to
support fair comparisons, we develop a new tradeoff quantifying framework by
leveraging the concept of incorrectness in location privacy research.
Experiments on large social graphs demonstrate the effectiveness of our
schemes
Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases
Many studies have been conducted on seeking the efficient solution for
subgraph similarity search over certain (deterministic) graphs due to its wide
application in many fields, including bioinformatics, social network analysis,
and Resource Description Framework (RDF) data management. All these works
assume that the underlying data are certain. However, in reality, graphs are
often noisy and uncertain due to various factors, such as errors in data
extraction, inconsistencies in data integration, and privacy preserving
purposes. Therefore, in this paper, we study subgraph similarity search on
large probabilistic graph databases. Different from previous works assuming
that edges in an uncertain graph are independent of each other, we study the
uncertain graphs where edges' occurrences are correlated. We formally prove
that subgraph similarity search over probabilistic graphs is #P-complete, thus,
we employ a filter-and-verify framework to speed up the search. In the
filtering phase,we develop tight lower and upper bounds of subgraph similarity
probability based on a probabilistic matrix index, PMI. PMI is composed of
discriminative subgraph features associated with tight lower and upper bounds
of subgraph isomorphism probability. Based on PMI, we can sort out a large
number of probabilistic graphs and maximize the pruning capability. During the
verification phase, we develop an efficient sampling algorithm to validate the
remaining candidates. The efficiency of our proposed solutions has been
verified through extensive experiments.Comment: VLDB201
Mining Maximal Cliques from an Uncertain Graph
We consider mining dense substructures (maximal cliques) from an uncertain
graph, which is a probability distribution on a set of deterministic graphs.
For parameter 0 < {\alpha} < 1, we present a precise definition of an
{\alpha}-maximal clique in an uncertain graph. We present matching upper and
lower bounds on the number of {\alpha}-maximal cliques possible within an
uncertain graph. We present an algorithm to enumerate {\alpha}-maximal cliques
in an uncertain graph whose worst-case runtime is near-optimal, and an
experimental evaluation showing the practical utility of the algorithm.Comment: ICDE 201
FREQUENT PATTERN MINING ON UNCERTAIN GRAPHS
Weakness is typical for a wide extent of affirmed applications, which unavoidably applies to diagram information. Specialist flawed charts are seen in bio-informatics, social affiliations, and so forth This paper rouses the issue of ordinary sub framework mining on single sketchy outlines, and researches two exceptional - probabilistic and expected - semantics to the degree help definitions. Diagram information are poor upon shortcomings in different applications because of inadequacy and imprecision of information. Mining unsure diagram information is semantically not actually identical to and computationally more testing than mining precise chart information. This paper assesses the issue of mining reformist sub chart plans from defective outline information. The reformist sub chart arrangement mining issue is formalized by masterminding another measure called anticipated assistance. A normal mining tally is proposed to discover a concluded arrangement of standard sub diagram designs by permitting a blunder strength on the regular sponsorships of the found sub chart plans. The check utilizes a beneficial measure calculation to pick if a sub diagram model can be yield or not. The savvy and exploratory outcomes show that the assessment is useful, precise and adaptable for enormous crude chart information bases
The Advantage of Evidential Attributes in Social Networks
Nowadays, there are many approaches designed for the task of detecting
communities in social networks. Among them, some methods only consider the
topological graph structure, while others take use of both the graph structure
and the node attributes. In real-world networks, there are many uncertain and
noisy attributes in the graph. In this paper, we will present how we detect
communities in graphs with uncertain attributes in the first step. The
numerical, probabilistic as well as evidential attributes are generated
according to the graph structure. In the second step, some noise will be added
to the attributes. We perform experiments on graphs with different types of
attributes and compare the detection results in terms of the Normalized Mutual
Information (NMI) values. The experimental results show that the clustering
with evidential attributes gives better results comparing to those with
probabilistic and numerical attributes. This illustrates the advantages of
evidential attributes.Comment: 20th International Conference on Information Fusion, Jul 2017, Xi'an,
Chin
Risk-Averse Matchings over Uncertain Graph Databases
A large number of applications such as querying sensor networks, and
analyzing protein-protein interaction (PPI) networks, rely on mining uncertain
graph and hypergraph databases. In this work we study the following problem:
given an uncertain, weighted (hyper)graph, how can we efficiently find a
(hyper)matching with high expected reward, and low risk?
This problem naturally arises in the context of several important
applications, such as online dating, kidney exchanges, and team formation. We
introduce a novel formulation for finding matchings with maximum expected
reward and bounded risk under a general model of uncertain weighted
(hyper)graphs that we introduce in this work. Our model generalizes
probabilistic models used in prior work, and captures both continuous and
discrete probability distributions, thus allowing to handle privacy related
applications that inject appropriately distributed noise to (hyper)edge
weights. Given that our optimization problem is NP-hard, we turn our attention
to designing efficient approximation algorithms. For the case of uncertain
weighted graphs, we provide a -approximation algorithm, and a
-approximation algorithm with near optimal run time. For the case
of uncertain weighted hypergraphs, we provide a
-approximation algorithm, where is the rank of the
hypergraph (i.e., any hyperedge includes at most nodes), that runs in
almost (modulo log factors) linear time.
We complement our theoretical results by testing our approximation algorithms
on a wide variety of synthetic experiments, where we observe in a controlled
setting interesting findings on the trade-off between reward, and risk. We also
provide an application of our formulation for providing recommendations of
teams that are likely to collaborate, and have high impact.Comment: 25 page
A Maximum Variance Approach for Graph Anonymization
Best Paper AwardInternational audienceUncertain graphs, a form of uncertain data, have recently attracted a lot of attention as they can represent inherent uncertainty in collected data. The uncertain graphs pose challenges to conventional data processing techniques and open new research directions. Going in the reserve direction, this paper focuses on the problem of anonymizing a deterministic graph by converting it into an uncertain form. The paper first analyzes drawbacks in a recent uncertainty-based anonymization scheme and then proposes Maximum Variance, a novel approach that provides better tradeoff between privacy and utility. Towards a fair com-parison between the anonymization schemes on graphs, the second con-tribution of this paper is to describe a quantifying framework for graph anonymization by assessing privacy and utility scores of typical schemes in a unified space. The extensive experiments show the effectiveness and efficiency of Maximum Variance on three large real graphs
- …