2 research outputs found
Graph Summarization
The continuous and rapid growth of highly interconnected datasets, which are
both voluminous and complex, calls for the development of adequate processing
and analytical techniques. One method for condensing and simplifying such
datasets is graph summarization. It denotes a series of application-specific
algorithms designed to transform graphs into more compact representations while
preserving structural patterns, query answers, or specific property
distributions. As this problem is common to several areas studying graph
topologies, different approaches, such as clustering, compression, sampling, or
influence detection, have been proposed, primarily based on statistical and
optimization methods. The focus of our chapter is to pinpoint the main graph
summarization methods, but especially to focus on the most recent approaches
and novel research trends on this topic, not yet covered by previous surveys.Comment: To appear in the Encyclopedia of Big Data Technologie
Efficient Structural Clustering on Probabilistic Graphs
© 1989-2012 IEEE. Structural clustering is a fundamental graph mining operator which is not only able to find densely-connected clusters, but it can also identify hub vertices and outliers in the graph. Previous structural clustering algorithms are tailored to deterministic graphs. Many real-world graphs, however, are not deterministic, but are probabilistic in nature because the existence of the edge is often inferred using a variety of statistical approaches. In this paper, we formulate the problem of structural clustering on probabilistic graphs, with the aim of finding reliable clusters in a given probabilistic graph. Unlike the traditional structural clustering problem, our problem relies mainly on a novel concept called reliable structural similarity which measures the probability of the similarity between two vertices in the probabilistic graph. We develop a dynamic programming algorithm with several powerful pruning strategies to efficiently compute the reliable structural similarities. With the reliable structural similarities, we adapt an existing solution framework to calculate the structural clustering on probabilistic graphs. Comprehensive experiments on five real-life datasets demonstrate the effectiveness and efficiency of the proposed approaches