193 research outputs found
Efficiently Generating Geometric Inhomogeneous and Hyperbolic Random Graphs
Hyperbolic random graphs (HRG) and geometric inhomogeneous random graphs (GIRG) are two similar generative network models that were designed to resemble complex real world networks. In particular, they have a power-law degree distribution with controllable exponent beta, and high clustering that can be controlled via the temperature T.
We present the first implementation of an efficient GIRG generator running in expected linear time. Besides varying temperatures, it also supports underlying geometries of higher dimensions. It is capable of generating graphs with ten million edges in under a second on commodity hardware. The algorithm can be adapted to HRGs. Our resulting implementation is the fastest sequential HRG generator, despite the fact that we support non-zero temperatures. Though non-zero temperatures are crucial for many applications, most existing generators are restricted to T = 0. We also support parallelization, although this is not the focus of this paper. Moreover, we note that our generators draw from the correct probability distribution, i.e., they involve no approximation.
Besides the generators themselves, we also provide an efficient algorithm to determine the non-trivial dependency between the average degree of the resulting graph and the input parameters of the GIRG model. This makes it possible to specify the desired expected average degree as input.
Moreover, we investigate the differences between HRGs and GIRGs, shedding new light on the nature of the relation between the two models. Although HRGs represent, in a certain sense, a special case of the GIRG model, we find that a straight-forward inclusion does not hold in practice. However, the difference is negligible for most use cases
A Note on the Practicality of Maximal Planar Subgraph Algorithms
Given a graph , the NP-hard Maximum Planar Subgraph problem (MPS) asks for
a planar subgraph of with the maximum number of edges. There are several
heuristic, approximative, and exact algorithms to tackle the problem, but---to
the best of our knowledge---they have never been compared competitively in
practice. We report on an exploratory study on the relative merits of the
diverse approaches, focusing on practical runtime, solution quality, and
implementation complexity. Surprisingly, a seemingly only theoretically strong
approximation forms the building block of the strongest choice.Comment: Appears in the Proceedings of the 24th International Symposium on
Graph Drawing and Network Visualization (GD 2016
Document Retrieval on Repetitive Collections
Document retrieval aims at finding the most important documents where a
pattern appears in a collection of strings. Traditional pattern-matching
techniques yield brute-force document retrieval solutions, which has motivated
the research on tailored indexes that offer near-optimal performance. However,
an experimental study establishing which alternatives are actually better than
brute force, and which perform best depending on the collection
characteristics, has not been carried out. In this paper we address this
shortcoming by exploring the relationship between the nature of the underlying
collection and the performance of current methods. Via extensive experiments we
show that established solutions are often beaten in practice by brute-force
alternatives. We also design new methods that offer superior time/space
trade-offs, particularly on repetitive collections.Comment: Accepted to ESA 2014. Implementation and experiments at
http://www.cs.helsinki.fi/group/suds/rlcsa
Lightweight Lempel-Ziv Parsing
We introduce a new approach to LZ77 factorization that uses O(n/d) words of
working space and O(dn) time for any d >= 1 (for polylogarithmic alphabet
sizes). We also describe carefully engineered implementations of alternative
approaches to lightweight LZ77 factorization. Extensive experiments show that
the new algorithm is superior in most cases, particularly at the lowest memory
levels and for highly repetitive data. As a part of the algorithm, we describe
new methods for computing matching statistics which may be of independent
interest.Comment: 12 page
Suffix Tree of Alignment: An Efficient Index for Similar Data
We consider an index data structure for similar strings. The generalized
suffix tree can be a solution for this. The generalized suffix tree of two
strings and is a compacted trie representing all suffixes in and
. It has leaves and can be constructed in time.
However, if the two strings are similar, the generalized suffix tree is not
efficient because it does not exploit the similarity which is usually
represented as an alignment of and .
In this paper we propose a space/time-efficient suffix tree of alignment
which wisely exploits the similarity in an alignment. Our suffix tree for an
alignment of and has leaves where is the sum of
the lengths of all parts of different from and is the sum of the
lengths of some common parts of and . We did not compromise the pattern
search to reduce the space. Our suffix tree can be searched for a pattern
in time where is the number of occurrences of in and
. We also present an efficient algorithm to construct the suffix tree of
alignment. When the suffix tree is constructed from scratch, the algorithm
requires time where is the sum of the lengths
of other common substrings of and . When the suffix tree of is
already given, it requires time.Comment: 12 page
Computing Covers Using Prefix Tables
An \emph{indeterminate string} on an alphabet is a
sequence of nonempty subsets of ; is said to be \emph{regular} if
every subset is of size one. A proper substring of regular is said to
be a \emph{cover} of iff for every , an occurrence of in
includes . The \emph{cover array} of is
an integer array such that is the longest cover of .
Fifteen years ago a complex, though nevertheless linear-time, algorithm was
proposed to compute the cover array of regular based on prior computation
of the border array of . In this paper we first describe a linear-time
algorithm to compute the cover array of regular string based on the prefix
table of . We then extend this result to indeterminate strings.Comment: 14 pages, 1 figur
Strengthened Lazy Heaps: Surpassing the Lower Bounds for Binary Heaps
Let denote the number of elements currently in a data structure. An
in-place heap is stored in the first locations of an array, uses
extra space, and supports the operations: minimum, insert, and extract-min. We
introduce an in-place heap, for which minimum and insert take worst-case
time, and extract-min takes worst-case time and involves at most
element comparisons. The achieved bounds are optimal to within
additive constant terms for the number of element comparisons. In particular,
these bounds for both insert and extract-min -and the time bound for insert-
surpass the corresponding lower bounds known for binary heaps, though our data
structure is similar. In a binary heap, when viewed as a nearly complete binary
tree, every node other than the root obeys the heap property, i.e. the element
at a node is not smaller than that at its parent. To surpass the lower bound
for extract-min, we reinforce a stronger property at the bottom levels of the
heap that the element at any right child is not smaller than that at its left
sibling. To surpass the lower bound for insert, we buffer insertions and allow
nodes to violate heap order in relation to their parents
- …