Graph Summarization
The continuous and rapid growth of highly interconnected datasets, which are
both voluminous and complex, calls for the development of adequate processing
and analytical techniques. One method for condensing and simplifying such
datasets is graph summarization. It denotes a series of application-specific
algorithms designed to transform graphs into more compact representations while
preserving structural patterns, query answers, or specific property
distributions. As this problem is common to several areas studying graph
topologies, different approaches, such as clustering, compression, sampling, or
influence detection, have been proposed, primarily based on statistical and
optimization methods. This chapter surveys the main graph summarization
methods, with particular emphasis on the most recent approaches and novel
research trends not yet covered by previous surveys.
Comment: To appear in the Encyclopedia of Big Data Technologie
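As a rough illustration of the grouping-style summarization mentioned in the abstract above, the sketch below collapses nodes into supernodes and counts the original edges between each pair of groups. It is not taken from the chapter: the function name `quotient_summary` and the `group_of` mapping are hypothetical, and the grouping is assumed to be given in advance (for example, by any clustering step).

```python
from collections import defaultdict


def quotient_summary(edges, group_of):
    """Collapse an undirected graph into supernodes.

    edges    -- iterable of (u, v) pairs of the original graph
    group_of -- dict mapping every node to its group label (assumed given)

    Returns (sizes, weights): the number of nodes per supernode, and the
    number of original edges between each unordered pair of supernodes.
    """
    sizes = defaultdict(int)
    for group in group_of.values():
        sizes[group] += 1

    weights = defaultdict(int)
    for u, v in edges:
        gu, gv = group_of[u], group_of[v]
        # Store each unordered pair once, with the smaller label first.
        key = (gu, gv) if gu <= gv else (gv, gu)
        weights[key] += 1

    return dict(sizes), dict(weights)


edges = [(1, 2), (2, 3), (3, 4), (1, 3)]
group_of = {1: "a", 2: "a", 3: "b", 4: "b"}
sizes, weights = quotient_summary(edges, group_of)
```

The summary keeps only group sizes and inter-group edge counts, so it preserves coarse density patterns while discarding which individual nodes were connected.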
A Neighborhood-preserving Graph Summarization
In this paper, we introduce a new summarization method for large graphs. Our
summarization approach retains only a user-specified proportion of the
neighbors of each node in the graph. Our main aim is to simplify large graphs
so that they can be analyzed and processed effectively while preserving as many
of the node neighborhood properties as possible. Since many graph algorithms
are based on the neighborhood information available for each node, the idea is
to produce a smaller graph which can be used to allow these algorithms to
handle large graphs and run faster while providing good approximations.
Moreover, our compression allows users to control the size of the compressed
graph by adjusting the amount of information loss that can be tolerated. The
experiments conducted on various real and synthetic graphs show that our
compression considerably reduces the size of the graphs. We also conducted
several experiments on the obtained summaries using various graph algorithms
and applications, such as node embedding, graph classification and shortest
path approximations. The results show interesting trade-offs between
the algorithms' runtime speed-up and the precision loss.
Comment: 17 pages, 10 figure
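The idea of keeping only a user-specified proportion of each node's neighbors can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract does not say how the retained neighbors are chosen, so uniform random selection is used here as an assumption. The function name `sample_neighborhoods` and the `keep_ratio` and `seed` parameters are hypothetical.

```python
import math
import random


def sample_neighborhoods(adj, keep_ratio, seed=0):
    """Keep about `keep_ratio` of each node's neighbors in an undirected graph.

    adj        -- dict mapping each node to the set of its neighbors
    keep_ratio -- fraction of neighbors to retain per node, in (0, 1]
    seed       -- seed for the random selection (assumption: uniform sampling)
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    rng = random.Random(seed)
    summary = {node: set() for node in adj}

    for node, neighbors in adj.items():
        if not neighbors:
            continue
        # Retain at least one neighbor so that no node becomes isolated.
        k = min(len(neighbors), max(1, math.ceil(keep_ratio * len(neighbors))))
        for nb in rng.sample(sorted(neighbors), k):
            summary[node].add(nb)
            # Mirror the edge so the summary stays undirected.
            summary.setdefault(nb, set()).add(node)

    return summary


adj = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}}
summary = sample_neighborhoods(adj, keep_ratio=0.5, seed=1)
```

Because each kept edge is mirrored, a node can end up with more than its own quota of neighbors. Lowering `keep_ratio` shrinks the summary at the cost of more lost neighborhood information, which is the size-versus-loss control the abstract describes.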