18,943 research outputs found
Low Space External Memory Construction of the Succinct Permuted Longest Common Prefix Array
The longest common prefix (LCP) array is a versatile auxiliary data structure
in indexed string matching. It can be used to speed up searching using the
suffix array (SA) and provides an implicit representation of the topology of an
underlying suffix tree. The LCP array of a string of length can be
represented as an array of length words, or, in the presence of the SA, as
a bit vector of bits plus asymptotically negligible support data
structures. External memory construction algorithms for the LCP array have been
proposed, but those proposed so far have a space requirement of words
(i.e. bits) in external memory. This space requirement is in some
practical cases prohibitively expensive. We present an external memory
algorithm for constructing the bit version of the LCP array which uses
bits of additional space in external memory when given a
(compressed) BWT with alphabet size and a sampled inverse suffix array
at sampling rate . This is often a significant space gain in
practice where is usually much smaller than or even constant. We
also consider the case of computing succinct LCP arrays for circular strings
On the Hybrid Extension of CTL and CTL+
The paper studies the expressivity, relative succinctness and complexity of
satisfiability for hybrid extensions of the branching-time logics CTL and CTL+
by variables. Previous complexity results show that only fragments with one
variable do have elementary complexity. It is shown that H1CTL+ and H1CTL, the
hybrid extensions with one variable of CTL+ and CTL, respectively, are
expressively equivalent but H1CTL+ is exponentially more succinct than H1CTL.
On the other hand, HCTL+, the hybrid extension of CTL with arbitrarily many
variables does not capture CTL*, as it even cannot express the simple CTL*
property EGFp. The satisfiability problem for H1CTL+ is complete for triply
exponential time, this remains true for quite weak fragments and quite strong
extensions of the logic
The Case for Learned Index Structures
Indexes are models: a B-Tree-Index can be seen as a model to map a key to the
position of a record within a sorted array, a Hash-Index as a model to map a
key to a position of a record within an unsorted array, and a BitMap-Index as a
model to indicate if a data record exists or not. In this exploratory research
paper, we start from this premise and posit that all existing index structures
can be replaced with other types of models, including deep-learning models,
which we term learned indexes. The key idea is that a model can learn the sort
order or structure of lookup keys and use this signal to effectively predict
the position or existence of records. We theoretically analyze under which
conditions learned indexes outperform traditional index structures and describe
the main challenges in designing learned index structures. Our initial results
show, that by using neural nets we are able to outperform cache-optimized
B-Trees by up to 70% in speed while saving an order-of-magnitude in memory over
several real-world data sets. More importantly though, we believe that the idea
of replacing core components of a data management system through learned models
has far reaching implications for future systems designs and that this work
just provides a glimpse of what might be possible
- …