Exploring Communities in Large Profiled Graphs
Given a graph and a query vertex, the community search (CS) problem
aims to efficiently find a subgraph of that graph whose vertices are closely
related to the query vertex. Communities are prevalent in social and biological networks, and can be
used in product advertisement and social event recommendation. In this paper,
we study profiled community search (PCS), where CS is performed on a profiled
graph. This is a graph in which each vertex has labels arranged in a
hierarchical manner. Extensive experiments show that PCS can identify
communities with themes that are common to their vertices, and is more
effective than existing CS approaches. As a naive solution for PCS is highly
expensive, we have also developed a tree index, which facilitates efficient and
online solutions for PCS.
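The notion of a profiled graph can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the paper's algorithm or its tree index: each label is stored as its path from the root of the hierarchy, a community's theme is taken to be the labels (with ancestors) shared by all of its vertices, and the function names, the toy data, and the plain connectivity check are all hypothetical.

```python
from collections import deque
from typing import Dict, Set, Tuple

# A label is identified by its path from the root of the hierarchy,
# e.g. ("CS", "DB") is the child "DB" of the top-level label "CS".
Label = Tuple[str, ...]


def with_ancestors(labels: Set[Label]) -> Set[Label]:
    """Close a label set under ancestors, so a profile forms a subtree."""
    return {label[:i] for label in labels for i in range(1, len(label) + 1)}


def common_theme(profiles: Dict[str, Set[Label]], community: Set[str]) -> Set[Label]:
    """Labels (including ancestors) shared by every vertex in the community."""
    closed = [with_ancestors(profiles[v]) for v in community]
    return set.intersection(*closed) if closed else set()


def is_connected(adjacency: Dict[str, Set[str]], community: Set[str]) -> bool:
    """Breadth-first check that the community induces a connected subgraph."""
    if not community:
        return False
    start = next(iter(community))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adjacency.get(u, set()):
            if w in community and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == set(community)


# Hypothetical toy data: an undirected graph plus one profile per vertex.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
profiles = {
    "a": {("CS", "DB", "Graphs")},
    "b": {("CS", "DB", "Indexing")},
    "c": {("CS", "DB"), ("CS", "ML")},
    "d": {("Bio",)},
}

theme = common_theme(profiles, {"a", "b", "c"})
```

In this toy example the vertices a, b, and c form a connected group whose shared theme is the path from "CS" down to "DB", whereas adding d would leave no common label at all.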
Graph Summarization
The continuous and rapid growth of highly interconnected datasets, which are
both voluminous and complex, calls for the development of adequate processing
and analytical techniques. One method for condensing and simplifying such
datasets is graph summarization. It denotes a series of application-specific
algorithms designed to transform graphs into more compact representations while
preserving structural patterns, query answers, or specific property
distributions. As this problem is common to several areas studying graph
topologies, different approaches, such as clustering, compression, sampling, or
influence detection, have been proposed, primarily based on statistical and
optimization methods. Our chapter pinpoints the main graph
summarization methods, with particular emphasis on the most recent approaches
and novel research trends on this topic that previous surveys have not yet covered.
Comment: To appear in the Encyclopedia of Big Data Technologies
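As an illustration of what a more compact representation can look like, the sketch below shows one simple grouping-based form of summarization: vertices are mapped to supernodes, and the edges between groups are collapsed into weighted superedges. This is an assumed minimal example, not a method taken from the chapter; the function name, the node-to-group mapping, and the toy graph are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def summarize(
    edges: Iterable[Tuple[int, int]],
    group_of: Dict[int, str],
) -> Dict[Tuple[str, str], int]:
    """Collapse an undirected edge list into superedge counts between groups."""
    superedges: Counter = Counter()
    for u, v in edges:
        # Sort the pair so that (A, B) and (B, A) count as the same superedge.
        key = tuple(sorted((group_of[u], group_of[v])))
        superedges[key] += 1
    return dict(superedges)


# Hypothetical toy graph: five vertices grouped into two supernodes.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 3)]
group_of = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B"}
summary = summarize(edges, group_of)
```

The summary keeps only the number of edges inside and between groups, so the structure within a supernode is lost in exchange for a smaller graph.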
Mining subjectively interesting attributed subgraphs
Community detection in graphs, data clustering, and local pattern mining
are three mature fields of data mining and machine learning.
In recent years, attributed subgraph mining has been emerging as a new
powerful data mining task at the intersection of these areas.
Given a graph and a set of attributes for each vertex,
attributed subgraph mining aims to find cohesive subgraphs
for which (a subset of) the attributes have exceptional values in some sense.
While research on this task can borrow from the three abovementioned fields,
the principled integration of graph and attribute data poses two challenges:
the definition of a pattern language that is intuitive and lends itself to efficient search strategies,
and the formalization of the interestingness of such patterns.
We propose an integrated solution to both of these challenges.
The proposed pattern language improves upon prior work in being both highly flexible and intuitive.
We show how an effective and principled algorithm can enumerate patterns of this language.
The proposed approach for quantifying interestingness of patterns of this language
is rooted in information theory, and is able to account for prior knowledge on the data.
Prior work typically quantifies interestingness based on the cohesion of the subgraph
and on the exceptionality of its attributes separately,
combining these in a parameterized trade-off.
Instead, in our proposal this trade-off is implicitly handled in a principled, parameter-free manner.
Extensive empirical results confirm the proposed pattern syntax is intuitive,
and the interestingness measure aligns well with actual subjective interestingness
- …
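To make the contrast with prior work concrete, the sketch below shows what a parameterized trade-off between cohesion and exceptionality might look like. It is an assumption-laden illustration, not the information-theoretic measure proposed in the paper and not any specific earlier method: cohesion is taken to be edge density, exceptionality is the deviation of the subgraph's mean attribute value from the global mean in units of the global standard deviation, and the weight alpha and all function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, Iterable, Set, Tuple


def density(nodes: Set[int], edges: Iterable[Tuple[int, int]]) -> float:
    """Fraction of possible undirected edges present among the given nodes."""
    n = len(nodes)
    if n < 2:
        return 0.0
    inside = sum(1 for u, v in edges if u in nodes and v in nodes)
    return 2 * inside / (n * (n - 1))


def exceptionality(nodes: Set[int], attr: Dict[int, float]) -> float:
    """Absolute gap between subgraph mean and global mean, in global std units."""
    values = list(attr.values())
    spread = pstdev(values)
    if spread == 0:
        return 0.0
    return abs(mean(attr[v] for v in nodes) - mean(values)) / spread


def tradeoff_score(
    nodes: Set[int],
    edges: Iterable[Tuple[int, int]],
    attr: Dict[int, float],
    alpha: float = 0.5,  # hypothetical user-chosen weight
) -> float:
    """Weighted sum of cohesion and exceptionality, governed by alpha."""
    return alpha * density(nodes, edges) + (1 - alpha) * exceptionality(nodes, attr)


# Hypothetical toy data: each undirected edge is listed once.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
attr = {1: 10.0, 2: 10.0, 3: 10.0, 4: 2.0}
score = tradeoff_score({1, 2, 3}, edges, attr, alpha=0.5)
```

The ranking of candidate subgraphs under such a score depends on the chosen alpha, which is the kind of tuning a parameter-free measure avoids.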