Compression of Weighted Graphs
We propose to compress weighted graphs (networks), motivated by the observation that large networks of social, biological, or other relations can be complex to handle and visualize. In the process, also known as graph simplification, nodes and (unweighted) edges are grouped into supernodes and superedges, respectively, to obtain a smaller graph. We propose models and algorithms for weighted graphs. The interpretation (i.e., decompression) of a compressed, weighted graph is that a pair of original nodes is connected by an edge if their supernodes are connected by one, and that the weight of an edge is approximated by the weight of the superedge. The compression problem then consists of choosing supernodes, superedges, and superedge weights so that the approximation error is minimized while the amount of compression is maximized. In this paper, we formulate this task as the 'simple weighted graph compression problem'. We then propose a much wider class of tasks under the name of the 'generalized weighted graph compression problem'. The generalized task extends the optimization to preserve longer-range connectivities between nodes, not just individual edge weights. We study the properties of these problems and propose a range of algorithms to solve them, with different balances between complexity and quality of the result. We evaluate the problems and algorithms experimentally on real networks. The results indicate that weighted graphs can be compressed efficiently with relatively little compression error.
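The decompression rule above (every original edge takes its superedge's weight) can be sketched in a few lines. This is an illustrative toy, not the paper's algorithm: the grouping is given by hand, and the superedge weight is taken as the mean of the original weights it covers, with the approximation error measured as a sum of squared differences.

```python
# Toy sketch of weighted graph compression: merge nodes into supernodes,
# set each superedge weight to the mean of the original weights it covers,
# and measure the resulting approximation error. Names are illustrative.

def compress(weights, groups):
    """weights: dict {(u, v): w}; groups: dict node -> supernode id.
    Returns superedge weights as the mean of the covered original weights."""
    totals, counts = {}, {}
    for (u, v), w in weights.items():
        key = tuple(sorted((groups[u], groups[v])))
        totals[key] = totals.get(key, 0.0) + w
        counts[key] = counts.get(key, 0) + 1
    return {k: totals[k] / counts[k] for k in totals}

def approximation_error(weights, groups, superweights):
    """Sum of squared differences between original and decompressed weights."""
    err = 0.0
    for (u, v), w in weights.items():
        key = tuple(sorted((groups[u], groups[v])))
        err += (w - superweights[key]) ** 2
    return err

edges = {(0, 1): 1.0, (0, 2): 3.0, (1, 2): 2.0, (2, 3): 4.0}
grouping = {0: "A", 1: "A", 2: "B", 3: "B"}   # two supernodes, chosen by hand
superweights = compress(edges, grouping)
error = approximation_error(edges, grouping, superweights)
```

In the actual compression problem the grouping itself is the object being optimized, trading this error against the size of the compressed graph.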
Lossy Compression of Adjacency Matrices by Graph Filter Banks
This paper proposes a compression framework for adjacency matrices of
weighted graphs based on graph filter banks. Adjacency matrices are widely used
mathematical representations of graphs and are used in various applications in
signal processing, machine learning, and data mining. In many problems of
interest, these adjacency matrices can be large, so efficient compression
methods are crucial. In this paper, we propose a lossy compression of weighted
adjacency matrices, where the binary adjacency information is encoded
losslessly (so the topological information of the graph is preserved) while the
edge weights are compressed lossily. For the edge weight compression, the
target graph is converted into a line graph, whose nodes correspond to the
edges of the original graph, and where the original edge weights are regarded
as a graph signal on the line graph. We then transform the edge weights on the
line graph with a graph filter bank for sparse representation. Experiments on
synthetic data validate the effectiveness of the proposed method by comparing
it with existing lossy matrix compression methods.
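The line-graph step described above is easy to make concrete. The following sketch (illustrative only, not the paper's code) builds the line graph of a small weighted graph: each node of the line graph is an edge of the original graph, two such nodes are adjacent when the original edges share an endpoint, and the original edge weights become a signal indexed by the line-graph nodes.

```python
# Sketch: convert a weighted graph into its line graph, treating the
# original edge weights as a graph signal on the line-graph nodes.

def line_graph(edges):
    """edges: dict {(u, v): w}. Returns (nodes, adjacency, signal):
    line-graph nodes are the original edges; two are adjacent iff
    the corresponding edges share an endpoint."""
    nodes = sorted(edges)                     # each original edge -> a node
    signal = [edges[e] for e in nodes]        # edge weights as a graph signal
    adj = {e: set() for e in nodes}
    for i, e in enumerate(nodes):
        for f in nodes[i + 1:]:
            if set(e) & set(f):               # shared endpoint -> adjacent
                adj[e].add(f)
                adj[f].add(e)
    return nodes, adj, signal

nodes, adj, signal = line_graph({(0, 1): 1.0, (1, 2): 2.0, (2, 3): 0.5})
```

A filter bank would then operate on `signal` using this adjacency; the binary topology itself is kept losslessly on the side.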
Weighted Graph Compression using Genetic Algorithms
Networks are a great way to present information. It is easy to see how different objects interact with one another, and the nature of their interaction. However, living in the technological era has led to a massive surge in data. Consequently, it is very common for networks/graphs to be large. When graphs get too large, processing them becomes computationally expensive and inefficient. This is common in areas such as bioinformatics, epidemic contact tracing, social networks, and many others. Graph compression is the process of merging nodes that are highly connected into one super-node, thus shrinking the graph. The goal of graph compression is to merge nodes while mitigating the amount of information lost during the compression process. Unweighted graphs are largely studied in this area. In this thesis, however, we extend these approaches to compress weighted graphs via genetic algorithms and analyse the compression from an epidemic point of view. Edge weights are seen to provide vital information for graph compression; having meaningful edge weights is also important, as different weights can lead to different results. Moreover, both the original edge weights and adjusted edge weights produce different results when compared to a widely used community detection algorithm, the Louvain algorithm. These differing results may nevertheless be helpful to public health officials. Lastly, the NSGA-II algorithm was implemented. It was found that NSGA-II is more suitable as a pre-processing tool, used to find a target compression that introduces a comfortable level of distortion, after which the single-objective genetic algorithm is used to achieve an improved solution for that target.
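A single-objective genetic algorithm over supernode assignments, as described above, can be sketched as follows. Everything here is an assumption for illustration (the chromosome encoding, truncation selection, single-gene mutation, and the fitness weighting are not the thesis's actual operators): a chromosome assigns each node to a supernode, and fitness combines weight distortion with a small penalty per supernode to reward compression.

```python
# Minimal single-objective GA sketch for weighted graph compression.
# All operators and weightings are illustrative assumptions.
import random

EDGES = {(0, 1): 1.0, (0, 2): 3.0, (1, 2): 2.0, (2, 3): 4.0}
N_NODES, N_GROUPS = 4, 2

def fitness(chrom):
    """Lower is better: squared weight distortion after merging, plus a
    small penalty per distinct supernode (pressure toward compression)."""
    totals, counts = {}, {}
    for (u, v), w in EDGES.items():
        key = tuple(sorted((chrom[u], chrom[v])))
        totals[key] = totals.get(key, 0.0) + w
        counts[key] = counts.get(key, 0) + 1
    distortion = 0.0
    for (u, v), w in EDGES.items():
        key = tuple(sorted((chrom[u], chrom[v])))
        distortion += (w - totals[key] / counts[key]) ** 2
    return distortion + 0.1 * len(set(chrom))

def evolve(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)                 # fixed seed for reproducibility
    pop = [[rng.randrange(N_GROUPS) for _ in range(N_NODES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[:pop_size // 2]       # truncation selection (elitist)
        children = []
        for parent in survivors:
            child = parent[:]
            child[rng.randrange(N_NODES)] = rng.randrange(N_GROUPS)  # mutate
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

best = evolve()
```

NSGA-II, as used in the thesis, would instead keep distortion and compression as two separate objectives and return a Pareto front rather than a single weighted-sum optimum.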
Multiscale approach for the network compression-friendly ordering
We present a fast multiscale approach for the network minimum logarithmic
arrangement problem. This type of arrangement plays an important role in a
network compression and fast node/link access operations. The algorithm is of
linear complexity and exhibits good scalability which makes it practical and
attractive for using on large-scale instances. Its effectiveness is
demonstrated on a large set of real-life networks. These networks with
corresponding best-known minimization results are suggested as an open
benchmark for the research community to evaluate new methods for this problem.
TopCom: Index for Shortest Distance Query in Directed Graph
Finding shortest distance between two vertices in a graph is an important
problem due to its numerous applications in diverse domains, including
geo-spatial databases, social network analysis, and information retrieval.
Classical algorithms (such as Dijkstra's) solve this problem in polynomial time,
but these algorithms cannot provide real-time response for a large number of
bursty queries on a large graph. So, indexing based solutions that pre-process
the graph for efficiently answering (exactly or approximately) a large number
of distance queries in real-time is becoming increasingly popular. Existing
solutions have varying performance in terms of index size, index building time,
query time, and accuracy. In this work, we propose TopCom, a novel
indexing-based solution for exactly answering distance queries. Our experiments
with two of the existing state-of-the-art methods (IS-Label and TreeMap) show
that TopCom outperforms both in scalability and query time. Moreover, TopCom's
indexing exploits the DAG (directed acyclic graph) structure in the graph,
which makes it significantly faster than the existing methods if the SCCs
(strongly connected components) of the input graph are relatively small.
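The DAG structure referred to above comes from condensing the graph: contracting each strongly connected component to a single node always yields a directed acyclic graph. The following is a textbook Kosaraju-style SCC computation, not the paper's implementation, shown only to make the condensation step concrete.

```python
# Sketch: find strongly connected components (Kosaraju's two-pass method).
# Contracting each component to one node yields the condensation DAG that
# indexing schemes like TopCom can exploit.

def sccs(graph):
    """graph: dict node -> list of successors. Returns list of SCCs (sets)."""
    order, seen = [], set()

    def dfs(u, g, out):
        stack = [u]
        while stack:
            v = stack[-1]
            seen.add(v)
            unvisited = [w for w in g.get(v, []) if w not in seen]
            if unvisited:
                stack.append(unvisited[0])    # descend into one child
            else:
                stack.pop()
                out.append(v)                 # record in finish order

    for u in graph:                           # pass 1: finish times
        if u not in seen:
            dfs(u, graph, order)

    reverse = {}                              # build the reversed graph
    for u, succs in graph.items():
        for v in succs:
            reverse.setdefault(v, []).append(u)

    seen.clear()
    comps = []
    for u in reversed(order):                 # pass 2: reverse finish order
        if u not in seen:
            comp = []
            dfs(u, reverse, comp)
            comps.append(set(comp))
    return comps

components = sccs({1: [2], 2: [1, 3], 3: [4], 4: [3]})
```

Here nodes 1 and 2 form one cycle and nodes 3 and 4 another, so the condensation is a two-node DAG; when such components are small, most of the distance structure lives in the acyclic part.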
Encoding dynamics for multiscale community detection: Markov time sweeping for the Map equation
The detection of community structure in networks is intimately related to
finding a concise description of the network in terms of its modules. This
notion has been recently exploited by the Map equation formalism (M. Rosvall
and C.T. Bergstrom, PNAS, 105(4), pp.1118--1123, 2008) through an
information-theoretic description of the process of coding inter- and
intra-community transitions of a random walker in the network at stationarity.
However, a thorough study of the relationship between the full Markov dynamics
and the coding mechanism is still lacking. We show here that the original Map
coding scheme, which is both block-averaged and one-step, neglects the internal
structure of the communities and introduces an upper scale, the 'field-of-view'
limit, in the communities it can detect. As a consequence, Map is well tuned to
detect clique-like communities but can lead to undesirable overpartitioning
when communities are far from clique-like. We show that a signature of this
behavior is a large compression gap: the Map description length is far from its
ideal limit. To address this issue, we propose a simple dynamic approach that
introduces time explicitly into the Map coding through the analysis of the
weighted adjacency matrix of the time-dependent multistep transition matrix of
the Markov process. The resulting Markov time sweeping induces a dynamical
zooming across scales that can reveal (potentially multiscale) community
structure above the field-of-view limit, with the relevant partitions indicated
by a small compression gap.
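The multistep transition matrix at the heart of the Markov time sweep is simply a power of the one-step matrix obtained by row-normalising the weighted adjacency matrix. A pure-Python sketch (illustrative only, with naive matrix multiplication):

```python
# Sketch: build the one-step transition matrix of a random walk from a
# weighted adjacency matrix, then raise it to the t-th power to obtain
# the multistep (Markov time t) transition matrix.

def transition_matrix(adj):
    """Row-normalise a weighted adjacency matrix into one-step probabilities."""
    return [[w / sum(row) for w in row] for row in adj]

def matpow(p, t):
    """t-step transition matrix via repeated multiplication."""
    n = len(p)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(t):
        result = [[sum(result[i][k] * p[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)]
    return result

P = transition_matrix([[0, 1, 1], [1, 0, 1], [1, 1, 0]])  # triangle graph
P3 = matpow(P, 3)
```

Sweeping the Markov time t then amounts to feeding the weighted adjacency structure of successive powers `matpow(P, t)` to the Map coding, so that longer times coarsen the effective description and reveal larger-scale communities.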
A Multiscale Pyramid Transform for Graph Signals
Multiscale transforms designed to process analog and discrete-time signals
and images cannot be directly applied to analyze high-dimensional data residing
on the vertices of a weighted graph, as they do not capture the intrinsic
geometric structure of the underlying graph data domain. In this paper, we
adapt the Laplacian pyramid transform for signals on Euclidean domains so that
it can be used to analyze high-dimensional data residing on the vertices of a
weighted graph. Our approach is to study existing methods and develop new
methods for the four fundamental operations of graph downsampling, graph
reduction, and filtering and interpolation of signals on graphs. Equipped with
appropriate notions of these operations, we leverage the basic multiscale
constructs and intuitions from classical signal processing to generate a
transform that yields both a multiresolution of graphs and an associated
multiresolution of a graph signal on the underlying sequence of graphs.
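For reference, the classical Euclidean construct being adapted is the Laplacian pyramid: filter and downsample to get a coarse approximation, predict the signal back from it, and keep the prediction residual as the detail. One level on a 1-D signal can be sketched as follows (the 3-tap filter and the duplicate-and-smooth upsampling are illustrative choices, not the paper's graph operators).

```python
# Sketch: one level of a classical Laplacian pyramid on a 1-D signal.
# Reconstruction is exact because the detail is the prediction residual.

def smooth(x):
    """Simple 3-tap low-pass filter [0.25, 0.5, 0.25] with edge replication."""
    padded = [x[0]] + list(x) + [x[-1]]
    return [0.25 * padded[i - 1] + 0.5 * padded[i] + 0.25 * padded[i + 1]
            for i in range(1, len(x) + 1)]

def upsample(coarse):
    """Crude expand: duplicate each coarse sample, then smooth."""
    return smooth([c for c in coarse for _ in (0, 1)])

def analyze(x):
    """Split x into a coarse approximation and a detail (residual) band."""
    coarse = smooth(x)[::2]                   # filter, then keep even samples
    detail = [xi - ui for xi, ui in zip(x, upsample(coarse))]
    return coarse, detail

def synthesize(coarse, detail):
    """Invert analyze: add the residual back to the prediction."""
    return [ui + di for ui, di in zip(upsample(coarse), detail)]

signal = [1.0, 2.0, 4.0, 8.0, 16.0, 8.0, 4.0, 2.0]
coarse, detail = analyze(signal)
recon = synthesize(coarse, detail)
```

The graph version replaces the even-sample downsampling and the fixed filter with graph downsampling, graph reduction, and graph filtering/interpolation, which is exactly where the four fundamental operations discussed in the paper come in.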