Mining Brain Networks using Multiple Side Views for Neurological Disorder Identification
Mining discriminative subgraph patterns from graph data has attracted great
interest in recent years. It has a wide variety of applications in disease
diagnosis, neuroimaging, etc. Most research on subgraph mining focuses on the
graph representation alone. However, in many real-world applications, the side
information is available along with the graph data. For example, for
neurological disorder identification, in addition to the brain networks derived
from neuroimaging data, hundreds of clinical, immunologic, serologic and
cognitive measures may also be documented for each subject. These measures
compose multiple side views encoding a tremendous amount of supplemental
information for diagnostic purposes, yet are often ignored. In this paper, we
study the problem of discriminative subgraph selection using multiple side
views and propose a novel solution to find an optimal set of subgraph features
for graph classification by exploring a plurality of side views. We derive a
feature evaluation criterion, named gSide, to estimate the usefulness of
subgraph patterns based upon side views. Then we develop a branch-and-bound
algorithm, called gMSV, to efficiently search for optimal subgraph features by
integrating the subgraph mining process and the procedure of discriminative
feature selection. Empirical studies on graph classification tasks for
neurological disorders using brain networks demonstrate that subgraph patterns
selected by the multi-side-view guided subgraph selection approach can
effectively boost graph classification performances and are relevant to disease
diagnosis. Comment: in Proceedings of IEEE International Conference on Data Mining (ICDM)
201
Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases
Many studies have sought efficient solutions for
subgraph similarity search over certain (deterministic) graphs due to its wide
application in many fields, including bioinformatics, social network analysis,
and Resource Description Framework (RDF) data management. All these works
assume that the underlying data are certain. However, in reality, graphs are
often noisy and uncertain due to various factors, such as errors in data
extraction, inconsistencies in data integration, and privacy-preserving
purposes. Therefore, in this paper, we study subgraph similarity search on
large probabilistic graph databases. Unlike previous works, which assume
that edges in an uncertain graph are independent of each other, we study
uncertain graphs where edges' occurrences are correlated. We formally prove
that subgraph similarity search over probabilistic graphs is #P-complete; thus,
we employ a filter-and-verify framework to speed up the search. In the
filtering phase, we develop tight lower and upper bounds of subgraph similarity
probability based on a probabilistic matrix index, PMI. PMI is composed of
discriminative subgraph features associated with tight lower and upper bounds
of subgraph isomorphism probability. Based on PMI, we can filter out a large
number of probabilistic graphs and maximize the pruning capability. During the
verification phase, we develop an efficient sampling algorithm to validate the
remaining candidates. The efficiency of our proposed solutions has been
verified through extensive experiments. Comment: VLDB201
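As a rough illustration of sampling-based verification, the sketch below estimates the probability that a pattern occurs in an uncertain graph by drawing possible worlds. It is plain Monte Carlo estimation and not the paper's algorithm: it assumes independent edges (the paper studies correlated ones), uses a triangle as a stand-in pattern, and all names (`sample_world`, `has_triangle`, `estimate_probability`) are hypothetical.

```python
import random
from itertools import combinations


def sample_world(edge_probs, rng):
    """Draw one deterministic graph: keep each edge with its own probability.

    Assumption: edges are independent, which is a simplification.
    """
    return {edge for edge, p in edge_probs.items() if rng.random() < p}


def has_triangle(edges):
    """Stand-in pattern test: does the sampled graph contain a triangle?"""
    present = {frozenset(edge) for edge in edges}
    nodes = sorted({v for edge in edges for v in edge})
    return any(
        all(frozenset(pair) in present for pair in combinations(triple, 2))
        for triple in combinations(nodes, 3)
    )


def estimate_probability(edge_probs, predicate, n_samples=10000, seed=0):
    """Fraction of sampled worlds in which the predicate holds."""
    rng = random.Random(seed)
    hits = sum(
        predicate(sample_world(edge_probs, rng)) for _ in range(n_samples)
    )
    return hits / n_samples


# Hypothetical uncertain graph on three nodes.
edge_probs = {("a", "b"): 0.9, ("b", "c"): 0.8, ("a", "c"): 0.5}
estimate = estimate_probability(edge_probs, has_triangle)
```

Under the independence assumption the exact answer here is 0.9 × 0.8 × 0.5 = 0.36, so the estimate should land close to that value. Real subgraph similarity verification would replace `has_triangle` with a much more expensive matching test.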
Anonymizing Social Graphs via Uncertainty Semantics
Rather than anonymizing social graphs by generalizing them to super
nodes/edges or adding/removing nodes and edges to satisfy given privacy
parameters, recent methods exploit the semantics of uncertain graphs to achieve
privacy protection of participating entities and their relationships. These
techniques anonymize a deterministic graph by converting it into an uncertain
form. In this paper, we propose a generalized obfuscation model based on
uncertain adjacency matrices that keep expected node degrees equal to those in
the unanonymized graph. We analyze two recently proposed schemes and show how
they fit into the model. We also point out disadvantages in each method and
present several elegant techniques to fill the gap between them. Finally, to
support fair comparisons, we develop a new tradeoff quantifying framework by
leveraging the concept of incorrectness in location privacy research.
Experiments on large social graphs demonstrate the effectiveness of our
schemes.
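The degree constraint mentioned in this abstract can be made concrete with a small sketch. By linearity of expectation, a node's expected degree in an uncertain graph is the sum of the probabilities of its incident edges, that is, a row sum of the uncertain adjacency matrix. The matrices and function names below are hypothetical and only illustrate the constraint; they are not the paper's obfuscation scheme.

```python
import math


def expected_degrees(prob_matrix):
    """Expected degree of node i = sum of edge probabilities in row i."""
    return [sum(row) for row in prob_matrix]


def preserves_expected_degrees(adjacency, prob_matrix, tol=1e-9):
    """Check that each expected degree matches the original degree."""
    original = [sum(row) for row in adjacency]
    expected = expected_degrees(prob_matrix)
    return all(
        math.isclose(d, e, abs_tol=tol) for d, e in zip(original, expected)
    )


# Hypothetical original graph: the 4-cycle 0-1-2-3-0, every degree is 2.
adjacency = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

# Hypothetical uncertain version: cycle edges kept with probability 0.75,
# the two diagonals added with probability 0.5, so each row sums to 2.
uncertain = [
    [0.0, 0.75, 0.5, 0.75],
    [0.75, 0.0, 0.75, 0.5],
    [0.5, 0.75, 0.0, 0.75],
    [0.75, 0.5, 0.75, 0.0],
]

ok = preserves_expected_degrees(adjacency, uncertain)
```

Here each original edge loses 0.25 of probability mass and each non-edge gains 0.5, yet every node's expected degree stays at 2.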
Node Classification in Uncertain Graphs
In many real applications that use and analyze networked data, the links in
the network graph may be erroneous, or derived from probabilistic techniques.
In such cases, the node classification problem can be challenging, since the
unreliability of the links may affect the final results of the classification
process. If the information about link reliability is not used explicitly, the
classification accuracy in the underlying network may be affected adversely. In
this paper, we focus on situations that require the analysis of the uncertainty
that is present in the graph structure. We study the novel problem of node
classification in uncertain graphs, by treating uncertainty as a first-class
citizen. We propose two techniques based on a Bayes model and automatic
parameter selection, and show that the incorporation of uncertainty in the
classification process as a first-class citizen is beneficial. We
experimentally evaluate the proposed approach using different real data sets,
and study the behavior of the algorithms under different conditions. The
results demonstrate the effectiveness and efficiency of our approach.
Injecting Uncertainty in Graphs for Identity Obfuscation
Data collected nowadays by social-networking applications create fascinating
opportunities for building novel services, as well as expanding our
understanding about social structures and their dynamics. Unfortunately,
publishing social-network graphs is considered an ill-advised practice due to
privacy concerns. To alleviate this problem, several anonymization methods have
been proposed, aiming at reducing the risk of a privacy breach on the published
data, while still allowing them to be analyzed and relevant conclusions to be drawn. In
this paper we introduce a new anonymization approach that is based on injecting
uncertainty in social graphs and publishing the resulting uncertain graphs.
While existing approaches obfuscate graph data by adding or removing edges
entirely, we propose using a finer-grained perturbation that adds or removes
edges partially: this way we can achieve the same desired level of obfuscation
with smaller changes in the data, thus maintaining higher utility. Our
experiments on real-world networks confirm that at the same level of identity
obfuscation our method provides higher usefulness than existing randomized
methods that publish standard graphs. Comment: VLDB201