Fringe analysis for parallel MacroSplit insertion algorithms in 2--3 trees
We extend fringe analysis (used to study the expected behavior of balanced search trees under sequential insertions) to synchronous parallel insertions in 2--3 trees. Given an insertion of k keys into a tree with n nodes, the fringe evolves according to a transition matrix whose coefficients reflect the precise form of the algorithm but do not depend on k or n. The derivation of this matrix uses the binomial transform recently developed by P. Poblete, J. Munro and Th. Papadakis. Because the exact analysis is complex, we also develop two approximations: the first based on a simplified parallel model, and the second based on the sequential model.
These two approximate analyses show that the parallel insertion case does not differ significantly from the sequential case, namely
only in terms of order O(1/n^2).
Postprint (published version)
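As a point of reference for the sequential model mentioned in the abstract above, the following is a minimal sketch of classical sequential fringe analysis for 2--3 trees. It is not the paper's parallel MacroSplit transition matrix. It assumes that each new key falls into a uniformly random gap between existing keys and tracks only the bottom-level nodes, which hold one or two keys. The function name `expected_fringe` is hypothetical.

```python
from fractions import Fraction


def expected_fringe(n_keys):
    """Expected number of gaps in 1-key and 2-key bottom nodes of a 2-3 tree
    after n_keys uniformly random sequential insertions (assumed model)."""
    # One key in one bottom node: 2 gaps, both in a 1-key node.
    g1, g2 = Fraction(2), Fraction(0)
    for _ in range(1, n_keys):
        total = g1 + g2              # always (number of keys) + 1
        p1, p2 = g1 / total, g2 / total
        # Hitting a 1-key node turns it into a 2-key node:
        #   2 gaps of type 1 disappear, 3 gaps of type 2 appear.
        # Hitting a 2-key node splits it into two 1-key nodes:
        #   3 gaps of type 2 disappear, 4 gaps of type 1 appear.
        g1, g2 = g1 - 2 * p1 + 4 * p2, g2 + 3 * p1 - 3 * p2
    return g1, g2


if __name__ == "__main__":
    g1, g2 = expected_fringe(100)
    print(g1 / (g1 + g2))  # share of gaps that sit in 1-key bottom nodes
```

Because the update is linear in the gap counts and the total number of gaps is deterministic, iterating on expectations is exact under this model. The recursion is driven by the matrix with rows (-2, 4) and (3, -3), which plays the role of the sequential transition matrix here.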
Finger Search in Grammar-Compressed Strings
Grammar-based compression, where one replaces a long string by a small
context-free grammar that generates the string, is a simple and powerful
paradigm that captures many popular compression schemes. Given a grammar, the
random access problem is to compactly represent the grammar while supporting
random access, that is, given a position in the original uncompressed string
report the character at that position. In this paper we study the random access
problem with the finger search property, that is, the time for a random access
query should depend on the distance between a specified index $f$, called the
\emph{finger}, and the query index $i$. We consider both a static variant,
where we first place a finger and subsequently access indices near the finger
efficiently, and a dynamic variant that also supports moving the finger in
time that depends on the distance moved.
Let $n$ be the size of the grammar, and let $N$ be the size of the string. For
the static variant we give a linear space representation that supports placing
the finger in $O(\log N)$ time and subsequently accessing in $O(\log D)$ time,
where $D$ is the distance between the finger and the accessed index. For the
dynamic variant we give a linear space representation that supports placing the
finger in $O(\log N)$ time and accessing and moving the finger in $O(\log D + \log \log N)$ time. Compared to the best linear space solution to random
access, we improve an $O(\log N)$ query bound to $O(\log D)$ for the static
variant and to $O(\log D + \log \log N)$ for the dynamic variant, while
maintaining linear space. As an application of our results we obtain an
improved solution to the longest common extension problem in grammar compressed
strings. To obtain our results, we introduce several new techniques of
independent interest, including a novel van Emde Boas style decomposition of
grammars.
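To make the random access problem from the abstract above concrete, here is a minimal sketch of plain top-down random access in a grammar whose rules are either a single terminal or a pair of nonterminals. It runs in time proportional to the height of the grammar and is not the finger-search structure of the paper. The rule encoding and the names `expansion_lengths` and `access` are assumptions made for illustration.

```python
def expansion_lengths(rules, start):
    """Length of the string generated by every symbol reachable from start.

    rules maps a symbol either to a one-character terminal string or to a
    (left, right) pair of symbols (an assumed encoding).
    """
    lengths = {}

    def length(sym):
        if sym not in lengths:
            rhs = rules[sym]
            if isinstance(rhs, str):
                lengths[sym] = 1
            else:
                lengths[sym] = length(rhs[0]) + length(rhs[1])
        return lengths[sym]

    length(start)
    return lengths


def access(rules, lengths, start, i):
    """Return the character at 0-based position i of the string generated by start."""
    if not 0 <= i < lengths[start]:
        raise IndexError(i)
    sym = start
    while not isinstance(rules[sym], str):
        left, right = rules[sym]
        if i < lengths[left]:
            sym = left               # position lies in the left part
        else:
            i -= lengths[left]       # skip the left part
            sym = right
    return rules[sym]


if __name__ == "__main__":
    # A small Fibonacci-style grammar; X6 generates "abaababa".
    rules = {
        "X1": "b", "X2": "a",
        "X3": ("X2", "X1"), "X4": ("X3", "X2"),
        "X5": ("X4", "X3"), "X6": ("X5", "X4"),
    }
    lengths = expansion_lengths(rules, "X6")
    print("".join(access(rules, lengths, "X6", i) for i in range(lengths["X6"])))
```

Each query restarts from the start symbol, so its cost does not depend on how close the queried position is to a previously accessed one. That dependence on distance is what the finger search property adds.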
The fluctuations of the giant cluster for percolation on random split trees
A split tree of cardinality $n$ is constructed by distributing $n$ "balls" in
a subset of vertices of an infinite tree which encompasses many types of random
trees such as $m$-ary search trees, quad trees, median-of-$(2k+1)$ trees,
fringe-balanced trees, digital search trees and random simplex trees. In this
work, we study Bernoulli bond percolation on arbitrary split trees of large but
finite cardinality $n$. We show for appropriate percolation regimes that depend
on the cardinality $n$ of the split tree that there exists a unique giant
cluster, and that the fluctuations of the size of the giant cluster as $n \to \infty$ are described by an infinitely divisible distribution that belongs to
the class of stable Cauchy laws. This work generalizes the results for the
random $m$-ary recursive trees in Berzunza (2015). Our approach is based on a
remarkable decomposition of the size of the giant percolation cluster as a sum
of essentially independent random variables which may be useful for studying
percolation on other trees with logarithmic height; for instance in this work
we also study the case of regular trees.
Comment: 43 pages
Optimal prefix codes for pairs of geometrically-distributed random variables
Optimal prefix codes are studied for pairs of independent, integer-valued
symbols emitted by a source with a geometric probability distribution of
parameter $q$, $0<q<1$. By encoding pairs of symbols, it is possible to
reduce the redundancy penalty of symbol-by-symbol encoding, while preserving
the simplicity of the encoding and decoding procedures typical of Golomb codes
and their variants. It is shown that optimal codes for these so-called
two-dimensional geometric distributions are \emph{singular}, in the sense that
a prefix code that is optimal for one value of the parameter $q$ cannot be
optimal for any other value of $q$. This is in sharp contrast to the
one-dimensional case, where codes are optimal for positive-length intervals of
the parameter $q$. Thus, in the two-dimensional case, it is infeasible to give
a compact characterization of optimal codes for all values of the parameter
$q$, as was done in the one-dimensional case. Instead, optimal codes are
characterized for a discrete sequence of values of $q$ that provide good
coverage of the unit interval. Specifically, optimal prefix codes are described
for $q=2^{-1/k}$ ($k\ge 1$), covering the range $q\ge 1/2$, and $q=2^{-k}$
($k>1$), covering the range $q<1/2$. The described codes produce the expected
reduction in redundancy with respect to the one-dimensional case, while
maintaining low-complexity coding operations.
Comment: To appear in IEEE Transactions on Information Theory
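For context on the symbol-by-symbol baseline that the abstract above compares against, the following is a minimal sketch of a one-dimensional Golomb encoder for a single nonnegative integer. It does not construct the paper's optimal codes for pairs. The function name `golomb_encode`, the unary convention (ones terminated by a zero), and the bit-string output are assumptions made for illustration.

```python
def golomb_encode(n, m):
    """Golomb codeword (as a string of '0'/'1') for integer n >= 0 with parameter m >= 1."""
    if n < 0 or m < 1:
        raise ValueError("need n >= 0 and m >= 1")
    quotient, remainder = divmod(n, m)

    # Unary part: `quotient` ones followed by a terminating zero (assumed convention).
    unary = "1" * quotient + "0"

    # Truncated binary part for the remainder in [0, m).
    b = (m - 1).bit_length()         # ceil(log2(m)); 0 when m == 1
    cutoff = (1 << b) - m            # number of remainders that get the short form
    if remainder < cutoff:
        value, width = remainder, b - 1
    else:
        value, width = remainder + cutoff, b

    binary = "".join(str((value >> k) & 1) for k in reversed(range(width)))
    return unary + binary


if __name__ == "__main__":
    for n in range(6):
        print(n, golomb_encode(n, 3))
```

When m is a power of two the remainder is written with a fixed number of bits, so encoding and decoding reduce to shifts and masks, which is the kind of simplicity the abstract refers to.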
B-urns
The fringe of a B-tree with parameter $m$ is considered as a particular
P\'olya urn with $m$ colors. More precisely, the asymptotic behaviour of this
fringe, when the number of stored keys tends to infinity, is studied through
the composition vector of the fringe nodes. We establish its typical behaviour
together with the fluctuations around it. The well known phase transition in
P\'olya urns has the following effect on B-trees: for $m \leq 59$, the
fluctuations are asymptotically Gaussian, whereas for $m \geq 60$, the
composition vector is oscillating; after scaling, the fluctuations of such an
urn strongly converge to a random variable $W$. This limit is $\mathbb{C}$-valued and it does not seem to follow any classical law. Several properties
of $W$ are shown: existence of exponential moments, characterization of its
distribution as the solution of a smoothing equation, existence of a density
relative to the Lebesgue measure on $\mathbb{C}$, support of $W$. Moreover, a
few representations of the composition vector for various values of $m$
illustrate the different kinds of convergence.