    Compression of Weighted Graphs

    We propose to compress weighted graphs (networks), motivated by the observation that large networks of social, biological, or other relations can be complex to handle and visualize. In the process also known as graph simplification, nodes and (unweighted) edges are grouped into supernodes and superedges, respectively, to obtain a smaller graph. We propose models and algorithms for weighted graphs. The interpretation (i.e., decompression) of a compressed, weighted graph is that a pair of original nodes is connected by an edge if their supernodes are connected by one, and that the weight of an edge is approximated to be the weight of the superedge. The compression problem now consists of choosing supernodes, superedges, and superedge weights so that the approximation error is minimized while the amount of compression is maximized. In this paper, we formulate this task as the 'simple weighted graph compression problem'. We then propose a much wider class of tasks under the name of 'generalized weighted graph compression problem'. The generalized task extends the optimization to preserve longer-range connectivities between nodes, not just individual edge weights. We study the properties of these problems and propose a range of algorithms to solve them, with different balances between complexity and quality of the result. We evaluate the problems and algorithms experimentally on real networks. The results indicate that weighted graphs can be compressed efficiently with relatively little compression error. Peer reviewed.
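
The decompression rule in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the supernode partition is given by hand, each superedge weight is assumed to be the mean weight over all node pairs it covers (missing edges count as weight 0), and the error is assumed to be the Euclidean distance between original and decompressed weights. The function names `compress`, `decompressed_weight`, and `compression_error` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def compress(weights, partition):
    """Group nodes into supernodes and average edge weights per superedge.

    weights:   dict mapping frozenset({u, v}) to an edge weight
    partition: list of sets of nodes; each set becomes one supernode
    Assumption: the superedge weight is the mean over all covered node
    pairs, with absent edges counted as weight 0.
    """
    super_of = {v: i for i, block in enumerate(partition) for v in block}
    sums, counts = {}, {}
    for u, v in combinations(sorted(super_of), 2):
        key = tuple(sorted((super_of[u], super_of[v])))
        sums[key] = sums.get(key, 0.0) + weights.get(frozenset((u, v)), 0.0)
        counts[key] = counts.get(key, 0) + 1
    super_w = {key: sums[key] / counts[key] for key in sums}
    return super_of, super_w


def decompressed_weight(u, v, super_of, super_w):
    """Weight of (u, v) after decompression: the weight of its superedge."""
    key = tuple(sorted((super_of[u], super_of[v])))
    return super_w.get(key, 0.0)


def compression_error(weights, super_of, super_w):
    """Euclidean distance between original and decompressed pair weights."""
    total = 0.0
    for u, v in combinations(sorted(super_of), 2):
        original = weights.get(frozenset((u, v)), 0.0)
        approx = decompressed_weight(u, v, super_of, super_w)
        total += (original - approx) ** 2
    return sqrt(total)
```

For example, merging nodes `a` and `b` in a graph with edges `a-c` (weight 1) and `b-c` (weight 3) gives one superedge of weight 2 to `c`, so both original edges are approximated by 2.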

    Lossy Compression of Adjacency Matrices by Graph Filter Banks

    This paper proposes a compression framework for adjacency matrices of weighted graphs based on graph filter banks. Adjacency matrices are widely used mathematical representations of graphs and appear in various applications in signal processing, machine learning, and data mining. In many problems of interest, these adjacency matrices can be large, so efficient compression methods are crucial. In this paper, we propose a lossy compression of weighted adjacency matrices, where the binary adjacency information is encoded losslessly (so the topological information of the graph is preserved) while the edge weights are compressed lossily. For the edge weight compression, the target graph is converted into a line graph, whose nodes correspond to the edges of the original graph, and where the original edge weights are regarded as a graph signal on the line graph. We then transform the edge weights on the line graph with a graph filter bank for sparse representation. Experiments on synthetic data validate the effectiveness of the proposed method by comparing it with existing lossy matrix compression methods.
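
The line-graph conversion described in this abstract can be sketched as follows. This covers only the conversion step, under the standard definition that two line-graph nodes are adjacent when the corresponding original edges share an endpoint; the filter bank itself is not shown. The function name `line_graph_signal` and the input layout are hypothetical.

```python
from itertools import combinations


def line_graph_signal(weighted_edges):
    """Build the line graph of an undirected graph and its edge-weight signal.

    weighted_edges: dict mapping an edge (u, v) to its weight
    Returns (edges, adjacency, signal), where edges[i] is the original edge
    behind line-graph node i, adjacency is a 0/1 matrix, and signal[i] is
    the original weight of edges[i].
    """
    edges = sorted(weighted_edges)  # fix an ordering of line-graph nodes
    signal = [weighted_edges[e] for e in edges]
    n = len(edges)
    adjacency = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        # Two original edges are neighbours if they share an endpoint.
        if set(edges[i]) & set(edges[j]):
            adjacency[i][j] = adjacency[j][i] = 1
    return edges, adjacency, signal
```

For a path `a-b-c-d`, the line graph is itself a path on three nodes, and the signal holds the three edge weights in the same order.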

    Weighted Graph Compression using Genetic Algorithms

    Networks are a great way to present information. It is easy to see how different objects interact with one another, and the nature of their interaction. However, living in the technological era has led to a massive surge in data. Consequently, it is very common for networks/graphs to be large. When graphs get too large, processing them becomes expensive and inefficient in both computational power and time. This is common in areas such as bioinformatics, epidemic contact tracing, social networks, and many others. Graph compression is the process of merging nodes that are highly connected into one super-node, thus shrinking the graph. The goal of graph compression is to merge nodes while limiting the amount of information lost during the compression process. Most work in this area studies unweighted graphs. In this thesis, however, we extend the approaches to compress weighted graphs via genetic algorithms and analyse the compression from an epidemic point of view. It is seen that edge weights provide vital information for graph compression. Moreover, having meaningful edge weights is important, as different weights can lead to different results. Both the original edge weights and adjusted edge weights produce different results when compared to a widely used community detection algorithm, the Louvain algorithm. However, the different results may be helpful to public health officials. Lastly, the NSGA-II algorithm was implemented. It was found that NSGA-II is more suitable as a pre-processing tool for finding a target compression that introduces a comfortable level of distortion, after which the single-objective genetic algorithm can be used to achieve an improved solution for that target.

    Multiscale approach for the network compression-friendly ordering

    We present a fast multiscale approach for the network minimum logarithmic arrangement problem. This type of arrangement plays an important role in network compression and fast node/link access operations. The algorithm is of linear complexity and exhibits good scalability, which makes it practical and attractive for use on large-scale instances. Its effectiveness is demonstrated on a large set of real-life networks. These networks, with corresponding best-known minimization results, are suggested as an open benchmark for the research community to evaluate new methods for this problem.

    TopCom: Index for Shortest Distance Query in Directed Graph

    Finding the shortest distance between two vertices in a graph is an important problem due to its numerous applications in diverse domains, including geo-spatial databases, social network analysis, and information retrieval. Classical algorithms (such as Dijkstra's) solve this problem in polynomial time, but these algorithms cannot provide real-time responses for a large number of bursty queries on a large graph. So, indexing-based solutions that pre-process the graph to efficiently answer (exactly or approximately) a large number of distance queries in real time are becoming increasingly popular. Existing solutions have varying performance in terms of index size, index building time, query time, and accuracy. In this work, we propose TopCom, a novel indexing-based solution for exactly answering distance queries. Our experiments with two existing state-of-the-art methods (IS-Label and TreeMap) show the superiority of TopCom over these two methods in terms of scalability and query time. Besides, the indexing of TopCom exploits the DAG (directed acyclic graph) structure in the graph, which makes it significantly faster than the existing methods if the SCCs (strongly connected components) of the input graph are relatively small.
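
For context, the classical per-query baseline that this abstract contrasts with indexing can be sketched as below. This is a textbook Dijkstra implementation on a directed graph with non-negative weights, not TopCom itself; the adjacency-list layout and the function name `dijkstra` are assumptions made for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source in a directed, non-negatively weighted graph.

    graph: dict mapping a node to a list of (neighbour, weight) pairs
    Returns a dict of distances for every node reachable from source.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Each call explores the graph from scratch, which is why answering many queries this way is slow on large graphs and why a precomputed index is attractive.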

    Encoding dynamics for multiscale community detection: Markov time sweeping for the Map equation

    The detection of community structure in networks is intimately related to finding a concise description of the network in terms of its modules. This notion has been recently exploited by the Map equation formalism (M. Rosvall and C.T. Bergstrom, PNAS, 105(4), pp. 1118-1123, 2008) through an information-theoretic description of the process of coding inter- and intra-community transitions of a random walker in the network at stationarity. However, a thorough study of the relationship between the full Markov dynamics and the coding mechanism is still lacking. We show here that the original Map coding scheme, which is both block-averaged and one-step, neglects the internal structure of the communities and introduces an upper scale, the 'field-of-view' limit, in the communities it can detect. As a consequence, Map is well tuned to detect clique-like communities but can lead to undesirable overpartitioning when communities are far from clique-like. We show that a signature of this behavior is a large compression gap: the Map description length is far from its ideal limit. To address this issue, we propose a simple dynamic approach that introduces time explicitly into the Map coding through the analysis of the weighted adjacency matrix of the time-dependent multistep transition matrix of the Markov process. The resulting Markov time sweeping induces a dynamical zooming across scales that can reveal (potentially multiscale) community structure above the field-of-view limit, with the relevant partitions indicated by a small compression gap. Comment: 10 pages, 6 figures.

    A Multiscale Pyramid Transform for Graph Signals

    Multiscale transforms designed to process analog and discrete-time signals and images cannot be directly applied to analyze high-dimensional data residing on the vertices of a weighted graph, as they do not capture the intrinsic geometric structure of the underlying graph data domain. In this paper, we adapt the Laplacian pyramid transform for signals on Euclidean domains so that it can be used to analyze high-dimensional data residing on the vertices of a weighted graph. Our approach is to study existing methods and develop new methods for the four fundamental operations of graph downsampling, graph reduction, and filtering and interpolation of signals on graphs. Equipped with appropriate notions of these operations, we leverage the basic multiscale constructs and intuitions from classical signal processing to generate a transform that yields both a multiresolution of graphs and an associated multiresolution of a graph signal on the underlying sequence of graphs. Comment: 16 pages, 13 figures.