Succinct Permutation Graphs
We present a succinct, i.e., asymptotically space-optimal, data structure for permutation graphs that supports distance, adjacency, neighborhood and shortest-path queries in optimal time; a variant of our data structure also supports degree queries in time independent of the neighborhood's size at the expense of an $O(\log n/\log\log n)$-factor overhead in all running times. We show how to generalize our data structure to the class of circular permutation graphs with asymptotically no extra space, while supporting the same queries in optimal time. Furthermore, we develop a similar compact data structure for the special case of bipartite permutation graphs and conjecture that it is succinct for this class. We demonstrate how to execute algorithms directly over our succinct representations for several combinatorial problems on permutation graphs: Clique, Coloring, Independent Set, Hamiltonian Cycle, All-Pair Shortest Paths, and others. Moreover, we initiate the study of semi-local graph representations, a concept that "interpolates" between local labeling schemes and standard "centralized" data structures. We show how to turn some of our data structures into semi-local representations by storing only $O(n)$ bits of additional global information, beating the lower bound on distance labeling schemes for permutation graphs.
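To make the adjacency and neighborhood queries concrete, here is a minimal Python sketch of the standard definition of a permutation graph: vertices `i < j` are adjacent exactly when the permutation inverts them. It stores the permutation as a plain list (about n log n bits plus overhead) and scans for neighbors, so it is neither succinct nor optimal-time; it is not the data structure from the paper. The names `perm_adjacent` and `perm_neighbors` are illustrative assumptions.

```python
from typing import List


def perm_adjacent(pi: List[int], i: int, j: int) -> bool:
    """Return True if vertices i and j form an inversion of pi.

    Assumes pi is a permutation of 0..n-1 and the graph is defined by
    inversions: for i < j, {i, j} is an edge iff pi[i] > pi[j].
    """
    if i == j:
        return False
    lo, hi = min(i, j), max(i, j)
    return pi[lo] > pi[hi]


def perm_neighbors(pi: List[int], v: int) -> List[int]:
    """List all neighbors of v by a linear scan (not output-sensitive)."""
    return [u for u in range(len(pi)) if perm_adjacent(pi, u, v)]


# Small example: the inversions of this permutation form the path 1-0-3-2.
example_pi = [2, 0, 3, 1]
```

The linear scan in `perm_neighbors` takes time proportional to n regardless of the degree, which is the cost that a neighborhood query in optimal time avoids.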
Succinct Data Structures for Families of Interval Graphs
We consider the problem of designing succinct data structures for interval
graphs with $n$ vertices while supporting degree, adjacency, neighborhood and
shortest path queries in optimal time in the $\Theta(\log n)$-bit word RAM
model. The degree query reports the number of incident edges to a given vertex
in constant time, the adjacency query returns true if there is an edge between
two vertices in constant time, the neighborhood query reports the set of all
adjacent vertices in time proportional to the degree of the queried vertex, and
the shortest path query returns a shortest path in time proportional to its
length, thus the running times of these queries are optimal. Towards showing
succinctness, we first show that at least $n\log n - 2n\log\log n - O(n)$ bits
are necessary to represent any unlabeled interval graph $G$ with $n$ vertices,
answering an open problem of Yang and Pippenger [Proc. Amer. Math. Soc. 2017].
This is augmented by a data structure of size $n\log n + O(n)$ bits while
supporting not only the aforementioned queries optimally but also capable of
executing various combinatorial algorithms (like proper coloring, maximum
independent set etc.) on the input interval graph efficiently. Finally, we
extend our ideas to other variants of interval graphs, for example, proper/unit
interval graphs, k-proper and k-improper interval graphs, and circular-arc
graphs, and design succinct/compact data structures for these graph classes as
well, while supporting queries on them efficiently.
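The four queries described above can be illustrated with a naive Python sketch that stores each vertex as an explicit closed interval. This is only a reference semantics under assumed names (`interval_adjacent`, `interval_degree`, `interval_neighborhood`): it uses far more than the succinct space bound, and its degree and neighborhood queries scan all vertices instead of running in constant or degree-proportional time.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # closed interval (left, right), assumed left <= right


def interval_adjacent(iv: List[Interval], u: int, v: int) -> bool:
    """Two distinct vertices are adjacent iff their intervals intersect."""
    if u == v:
        return False
    (a, b), (c, d) = iv[u], iv[v]
    return a <= d and c <= b


def interval_neighborhood(iv: List[Interval], v: int) -> List[int]:
    """All vertices adjacent to v, found by a linear scan."""
    return [u for u in range(len(iv)) if interval_adjacent(iv, u, v)]


def interval_degree(iv: List[Interval], v: int) -> int:
    """Number of edges incident to v (linear time here, not constant)."""
    return len(interval_neighborhood(iv, v))


# Four intervals overlapping in a chain, giving the path 0-1-2-3.
example_intervals = [(1, 4), (3, 6), (5, 8), (7, 9)]
```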
Tight and simple Web graph compression
Analysing Web graphs has applications in determining page ranks, fighting Web
spam, detecting communities and mirror sites, and more. This study is however
hampered by the necessity of storing a major part of huge graphs in the
external memory, which prevents efficient random access to edge (hyperlink)
lists. A number of algorithms involving compression techniques have thus been
presented to represent Web graphs succinctly while also providing random access.
Those techniques are usually based on differential encodings of the adjacency
lists, finding repeating nodes or node regions in the successive lists, more
general grammar-based transformations or 2-dimensional representations of the
binary matrix of the graph. In this paper we present two Web graph compression
algorithms. The first can be seen as engineering of the Boldi and Vigna (2004)
method. We extend the notion of similarity between link lists, and use a more
compact encoding of residuals. The algorithm works on blocks of varying size
(in the number of input lines) and sacrifices access time for better
compression ratio, achieving more succinct graph representation than other
algorithms reported in the literature. The second algorithm works on blocks of
the same size, in the number of input lines, and its key mechanism is merging
the block into a single ordered list. This method achieves much more attractive
space-time tradeoffs.
Comment: 15 pages
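As a rough illustration of two ideas mentioned in this abstract, differential encoding of adjacency lists and encoding a list relative to a similar earlier list plus residuals, here is a simplified Python sketch. It is an assumption-laden toy: the function names are hypothetical, the output is a Python list of small integers instead of a bit stream, and it reproduces neither the Boldi and Vigna format nor either algorithm from the paper.

```python
from typing import List, Tuple


def gap_encode(sorted_ids: List[int]) -> List[int]:
    """Differential encoding: store the first id, then successive gaps."""
    gaps, prev = [], 0
    for x in sorted_ids:
        gaps.append(x - prev)
        prev = x
    return gaps


def gap_decode(gaps: List[int]) -> List[int]:
    """Invert gap_encode with a running sum."""
    ids, total = [], 0
    for g in gaps:
        total += g
        ids.append(total)
    return ids


def reference_encode(ref: List[int], cur: List[int]) -> Tuple[List[bool], List[int]]:
    """Encode cur as a copy mask over ref plus gap-coded residuals."""
    cur_set, ref_set = set(cur), set(ref)
    copy_mask = [x in cur_set for x in ref]            # which ref links to reuse
    residuals = [x for x in cur if x not in ref_set]   # links not found in ref
    return copy_mask, gap_encode(residuals)


def reference_decode(ref: List[int], copy_mask: List[bool], res_gaps: List[int]) -> List[int]:
    """Rebuild the sorted link list from the reference list and residuals."""
    copied = [x for x, keep in zip(ref, copy_mask) if keep]
    return sorted(copied + gap_decode(res_gaps))


# Two similar outgoing-link lists (sorted page ids, made up for illustration).
ref_links = [3, 7, 8, 15]
cur_links = [3, 8, 15, 20, 21]
mask, res = reference_encode(ref_links, cur_links)
```

Small gaps and short residual lists are what make a subsequent variable-length integer code effective; that final coding step is omitted here.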
Orderly Spanning Trees with Applications
We introduce and study the {\em orderly spanning trees} of plane graphs. This
algorithmic tool generalizes {\em canonical orderings}, which exist only for
triconnected plane graphs. Although not every plane graph admits an orderly
spanning tree, we provide an algorithm to compute an {\em orderly pair} for any
connected planar graph $G$, consisting of a plane graph $H$ of $G$, and an
orderly spanning tree of $H$. We also present several applications of orderly
spanning trees: (1) a new constructive proof for Schnyder's Realizer Theorem,
(2) the first area-optimal 2-visibility drawing of $G$, and (3) the best known
encodings of $G$ with O(1)-time query support. All algorithms in this paper run
in linear time.
Comment: 25 pages, 7 figures, A preliminary version appeared in Proceedings of
the 12th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2001),
Washington D.C., USA, January 7-9, 2001, pp. 506-51
Entropy-scaling search of massive biological data
Many datasets exhibit a well-defined structure that can be exploited to
design faster search tools, but it is not always clear when such acceleration
is possible. Here, we introduce a framework for similarity search based on
characterizing a dataset's entropy and fractal dimension. We prove that
searching scales in time with metric entropy (number of covering hyperspheres),
if the fractal dimension of the dataset is low, and scales in space with the
sum of metric entropy and information-theoretic entropy (randomness of the
data). Using these ideas, we present accelerated versions of standard tools,
with no loss in specificity and little loss in sensitivity, for use in three
domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics
(MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search
(esFragBag, 10x speedup of FragBag). Our framework can be used to achieve
"compressive omics," and the general theory can be readily applied to data
science problems outside of biology.
Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
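The coarse-then-fine idea behind a search that scales with the number of covering hyperspheres can be sketched in Python as follows. This is a minimal illustration under stated assumptions: points live in Euclidean space, clusters are built greedily with an assumed radius `rc`, and the names `build_clusters` and `cluster_search` are hypothetical. It is not the implementation of Ammolite, MICA, or esFragBag.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def build_clusters(points: List[Point], rc: float) -> Tuple[List[Point], List[List[Point]]]:
    """Greedily cover the data with balls of radius rc around chosen centers."""
    centers: List[Point] = []
    members: List[List[Point]] = []
    for p in points:
        for k, c in enumerate(centers):
            if math.dist(p, c) <= rc:
                members[k].append(p)
                break
        else:
            # No existing ball contains p, so p becomes a new center.
            centers.append(p)
            members.append([p])
    return centers, members


def cluster_search(query: Point, r: float, centers: List[Point],
                   members: List[List[Point]], rc: float) -> List[Point]:
    """Return all points within distance r of query.

    Coarse step: by the triangle inequality, a ball can contain a hit only
    if its center lies within r + rc of the query. Fine step: scan only
    the members of those balls.
    """
    hits: List[Point] = []
    for c, pts in zip(centers, members):
        if math.dist(query, c) <= r + rc:
            hits.extend(p for p in pts if math.dist(query, p) <= r)
    return hits


# Toy data: two tight groups and one isolated point.
data = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (10.4, 10.0), (5.0, 5.0)]
centers, members = build_clusters(data, rc=1.0)
```

With a true metric the coarse step never discards a real hit, so this sketch returns the same results as a brute-force scan; the work saved depends on how few balls are needed to cover the data.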