8 research outputs found

Tight and simple Web graph compression

Author: Bieniecki Wojciech
Grabowski Szymon
Publication date: 01/01/2010

Analysing Web graphs has applications in determining page ranks, fighting Web spam, detecting communities and mirror sites, and more. This study is, however, hampered by the necessity of storing a major part of huge graphs in external memory, which prevents efficient random access to edge (hyperlink) lists. A number of algorithms involving compression techniques have thus been presented to represent Web graphs succinctly while still providing random access. Those techniques are usually based on differential encodings of the adjacency lists, finding repeating nodes or node regions in successive lists, more general grammar-based transformations, or 2-dimensional representations of the binary matrix of the graph. In this paper we present two Web graph compression algorithms. The first can be seen as an engineering of the Boldi and Vigna (2004) method: we extend the notion of similarity between link lists and use a more compact encoding of residuals. The algorithm works on blocks of varying size (in the number of input lines) and sacrifices access time for a better compression ratio, achieving a more succinct graph representation than other algorithms reported in the literature. The second algorithm works on blocks of the same size, in the number of input lines, and its key mechanism is merging the block into a single ordered list. This method achieves much more attractive space-time tradeoffs. Comment: 15 pages
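
The "differential encodings of the adjacency lists" mentioned in this abstract can be illustrated with a minimal sketch. It is not the algorithm of the paper, nor the exact coding used by Boldi and Vigna: it assumes a sorted adjacency list of integer node ids and uses a plain variable-byte code, and the function names (`encode_gaps`, `decode_gaps`, `varint_bytes`) are hypothetical, chosen only for this illustration.

```python
def encode_gaps(neighbors):
    """Replace a sorted adjacency list by its first id and successive differences."""
    gaps = []
    prev = 0
    for v in neighbors:
        gaps.append(v - prev)
        prev = v
    return gaps


def decode_gaps(gaps):
    """Invert encode_gaps by taking running sums."""
    out = []
    total = 0
    for g in gaps:
        total += g
        out.append(total)
    return out


def varint_bytes(values):
    """Variable-byte code: 7 payload bits per byte, high bit set on all but the last byte."""
    buf = bytearray()
    for v in values:
        while v >= 0x80:
            buf.append((v & 0x7F) | 0x80)
            v >>= 7
        buf.append(v)
    return bytes(buf)


# Illustrative adjacency list (made-up ids): links tend to point to nearby ids.
neighbors = [1000, 1002, 1003, 1010]
gaps = encode_gaps(neighbors)            # [1000, 2, 1, 7]
raw_size = len(varint_bytes(neighbors))  # 8 bytes
gap_size = len(varint_bytes(gaps))       # 5 bytes
```

Small gaps fit in a single byte, which is why orderings that place linked pages close together tend to compress well under this kind of scheme.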

arXiv.org e-Print Archive

CiteSeerX

Layered Label Propagation: A MultiResolution Coordinate-Free Ordering for Compressing Social Networks

Author: Boldi Paolo
Rosa Marco
Santini Massimo
Vigna Sebastiano
Publication date: 01/01/2011

We continue the line of research on graph compression started with WebGraph, but we move our focus to the compression of social networks in a proper sense (e.g., LiveJournal): the approaches that have been used for a long time to compress web graphs rely on a specific ordering of the nodes (lexicographical URL ordering) whose extension to general social networks is not trivial. In this paper, we propose a solution that mixes clusterings and orders, and devise a new algorithm, called Layered Label Propagation, that builds on previous work on scalable clustering and can be used to reorder very large graphs (billions of nodes). Our implementation uses overdecomposition to perform aggressively on multi-core architectures, making it possible to reorder graphs of more than 600 million nodes in a few hours. Experiments performed on a wide array of web graphs and social networks show that combining the order produced by the proposed algorithm with the WebGraph compression framework provides a major increase in compression with respect to all currently known techniques, both on web graphs and on social networks. These improvements make it possible to analyse in main memory significantly larger graphs.
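
As a rough illustration of how a clustering can induce a node ordering, the sketch below runs plain label propagation and then sorts nodes by cluster label. This is deliberately not Layered Label Propagation as described in the paper (there is no layering and no parallelism); the deterministic tie-breaking, the toy graph, and the names `label_propagation` and `cluster_order` are assumptions made for this example only.

```python
from collections import Counter


def label_propagation(adj, rounds=10):
    """Plain label propagation: each node adopts the most frequent label among
    its neighbours. Ties are broken by the smallest label so the run is
    deterministic (an assumption of this sketch)."""
    labels = {v: v for v in adj}
    for _ in range(rounds):
        changed = False
        for v in sorted(adj):
            if not adj[v]:
                continue
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            new_label = min(l for l, c in counts.items() if c == best)
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:
            break
    return labels


def cluster_order(adj, labels):
    """Order nodes so that members of the same cluster become contiguous."""
    return sorted(adj, key=lambda v: (labels[v], v))


# Toy graph: triangles {0, 2, 4} and {1, 3, 5}, joined by the edge 4-1.
adj = {
    0: [2, 4],
    1: [3, 5, 4],
    2: [0, 4],
    3: [1, 5],
    4: [0, 2, 1],
    5: [1, 3],
}
order = cluster_order(adj, label_propagation(adj))  # [0, 2, 4, 1, 3, 5]
```

Renumbering nodes by their position in `order` places each triangle on consecutive ids, which is the kind of locality that gap-based adjacency-list compression benefits from.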

arXiv.org e-Print Archive

CiteSeerX

AIR Universita degli studi di Milano

Hierarchical Graph Generation with $K^2$-trees

Author: Ahn Sungsoo
Jang Yunhui
Kim Dongwoo
Publication date: 30/05/2023

Generating graphs from a target distribution is a significant challenge across many domains, including drug discovery and social network analysis. In this work, we introduce a novel graph generation method leveraging $K^2$-tree representation which was originally designed for lossless graph compression. Our motivation stems from the ability of the $K^2$-trees to enable compact generation while concurrently capturing the inherent hierarchical structure of a graph. In addition, we make further contributions by (1) presenting a sequential $K^2$-tree representation that incorporates pruning, flattening, and tokenization processes and (2) introducing a Transformer-based architecture designed to generate the sequence by incorporating a specialized tree positional encoding scheme. Finally, we extensively evaluate our algorithm on four general and two molecular graph datasets to confirm its superiority for graph generation. Comment: 22 pages (10 appendices
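
For readers unfamiliar with the underlying structure, the following is a minimal sketch of the standard $K^2$-tree construction for an adjacency matrix, with $K = 2$ by default. It does not reproduce the pruning, flattening, or tokenization steps of the paper; it assumes a square 0/1 matrix whose side length is a power of `k`, and the function name `k2_tree_bits` is hypothetical.

```python
def k2_tree_bits(matrix, k=2):
    """Return the level-by-level bit lists of a k^2-tree.

    Each block is split into k*k sub-blocks; a sub-block contributes 1 if it
    contains any edge and 0 otherwise, and only non-empty sub-blocks are
    subdivided further. Assumes len(matrix) is a power of k.
    """
    n = len(matrix)
    levels = []
    frontier = [(0, 0, n)]  # (top row, left column, side length)
    while frontier and frontier[0][2] > 1:
        bits = []
        next_frontier = []
        for r, c, size in frontier:
            step = size // k
            for i in range(k):
                for j in range(k):
                    r0, c0 = r + i * step, c + j * step
                    nonzero = any(
                        matrix[x][y]
                        for x in range(r0, r0 + step)
                        for y in range(c0, c0 + step)
                    )
                    bits.append(1 if nonzero else 0)
                    if nonzero:
                        next_frontier.append((r0, c0, step))
        levels.append(bits)
        frontier = next_frontier
    return levels


# Illustrative 4x4 adjacency matrix (made up for this example).
matrix = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
levels = k2_tree_bits(matrix)
# levels == [[1, 0, 0, 1], [0, 1, 0, 0, 1, 1, 0, 1]]
```

The two empty quadrants cost one bit each and are never expanded, which is where the compactness on sparse matrices comes from; the per-level bit lists also expose the hierarchy that the abstract refers to.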

arXiv.org e-Print Archive

Compressed Indexes for String Searching in Labeled Graphs

Author: Amir A.
Bender M. A.
Brisaboa N.
Fano R. M.
Hsu B. P.
Manber U.
Muthukrishnan S.
Navarro G.
Overmars M. H.
Ugander J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Storing and searching large labeled graphs is becoming a key issue in the design of space/time-efficient online platforms that index modern social networks or knowledge graphs. But, as far as we know, existing results are limited to compressed graph indexes that support only basic access operations on the link structure of the input graph, such as: given a node u, return the adjacency list of u. This paper takes inspiration from Facebook's Unicorn platform and proposes compressed-indexing schemes for large graphs whose nodes are labeled with variable-length strings (i.e., node attributes such as a user's (nick-)name) that support sophisticated search operations involving both the link structure of the graph and the string content of its nodes. An extensive experimental evaluation over real social networks shows the time and space efficiency of the proposed indexing schemes and their query processing algorithms.

CiteSeerX

Crossref

Archivio della Ricerca - Università di Pisa

Permuting Web and Social Graphs

Author: Elias Peter
Fano Robert M.
Knuth Donald E.
Publication venue: 'Informa UK Limited'

Crossref