Note on islands in path-length sequences of binary trees
An earlier characterization of topologically ordered (lexicographic)
path-length sequences of binary trees is reformulated in terms of an
integrality condition on a scaled Kraft sum of certain subsequences (full
segments, or islands). The scaled Kraft sum is seen to count the set of
ancestors at a certain level of a set of topologically consecutive leaves in a
binary tree. Comment: 4 pages
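As a small illustration of the quantities named in this abstract, the sketch below computes the Kraft sum of the leaf depths of a full binary tree and a Kraft sum scaled by a power of two. The tree encoding (`None` for a leaf, a pair for an internal node) and the function names are assumptions made for this example, and the paper's precise definition of an island is not reproduced here.

```python
from fractions import Fraction


def leaf_depths(tree, depth=0):
    """Return leaf depths in left-to-right order.

    Assumed encoding: None is a leaf, (left, right) is an internal node.
    """
    if tree is None:
        return [depth]
    left, right = tree
    return leaf_depths(left, depth + 1) + leaf_depths(right, depth + 1)


def kraft_sum(depths):
    """Sum of 2^(-d) over the given depths, as an exact fraction."""
    return sum((Fraction(1, 2 ** d) for d in depths), Fraction(0))


def scaled_kraft_sum(depths, level):
    """Kraft sum multiplied by 2^level."""
    return 2 ** level * kraft_sum(depths)


# Hypothetical example tree with leaf depths 2, 2, 2, 3, 3.
example_tree = ((None, None), (None, (None, None)))
example_depths = leaf_depths(example_tree)
```

For this tree the full Kraft sum is 1, and the first two leaves (depths 2, 2) scaled to level 1 give 1, matching the single level-1 ancestor they share.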
Enumeration of Binary Trees and Universal Types
Binary unlabeled ordered trees (further called binary trees) have been studied at least since Euler, who enumerated them. The number of such trees with n nodes is now known as the Catalan number. Over the years various interesting questions about the statistics of such trees have been investigated (e.g., height and path length distributions for a randomly selected tree). Binary trees find an abundance of applications in computer science. Recently, however, Seroussi posed a new and interesting problem motivated by information theory considerations: how many binary trees of a given path length (sum of depths) are there? This question arose in the study of universal types of sequences. Two sequences of length p have the same universal type if they generate the same set of phrases in the incremental parsing of the Lempel-Ziv'78 scheme, since one proves that such sequences converge to the same empirical distribution. It turns out that the number of distinct types of sequences of length p corresponds to the number of binary (unlabeled and ordered) trees, T_p, of given path length p (and also the number of distinct Lempel-Ziv'78 parsings of length p sequences). We first show that the number of binary trees with given path length p is asymptotically equal to T_p ~ 2^{(2p/log_2 p)(1+O(log^{-2/3} p))}. Then we establish various limiting distributions for the number of nodes (number of phrases in the Lempel-Ziv'78 scheme) when a tree is selected randomly among all trees of given path length p. Throughout, we use methods of analytic algorithmics such as generating functions and complex asymptotics, as well as methods of applied mathematics such as the WKB method and matched asymptotics.
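The counting question in this abstract can be explored for small p by brute-force dynamic programming. The sketch below is not the paper's analytic method: it uses the standard decomposition of a binary tree into a root plus left and right subtrees, where attaching subtrees under a root of an n-node tree adds n-1 to the path length. The function names, and the choice to count only non-empty trees in T_p, are assumptions for illustration.

```python
def path_length_polynomials(max_nodes, max_path_length):
    """polys[n][p] = number of binary trees with n nodes and path length p.

    Path lengths above max_path_length are discarded.
    """
    polys = [[1] + [0] * max_path_length]  # the empty tree: 0 nodes, path length 0
    for n in range(1, max_nodes + 1):
        poly = [0] * (max_path_length + 1)
        shift = n - 1  # every non-root node sits one level deeper
        for k in range(n):  # k nodes on the left, n-1-k on the right
            left, right = polys[k], polys[n - 1 - k]
            for a, count_left in enumerate(left):
                if count_left == 0:
                    continue
                for b, count_right in enumerate(right):
                    if a + b + shift > max_path_length:
                        break
                    poly[a + b + shift] += count_left * count_right
        polys.append(poly)
    return polys


def count_trees_by_path_length(p):
    """Number of non-empty binary trees with path length exactly p.

    A tree with path length p has at most p+1 nodes, so n is capped there.
    """
    polys = path_length_polynomials(p + 1, p)
    return sum(polys[n][p] for n in range(1, p + 2))
```

Summing `polys[n]` over all path lengths recovers the Catalan number for n nodes, which gives a quick sanity check on the recurrence.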
Enumeration of General t-ary Trees and Universal Types
We consider t-ary trees characterized by their numbers of nodes and their total path length. In such trees a parent node may have up to t child nodes; when t=2 these are called binary trees. We give asymptotic expansions for the total number of trees with n nodes and path length p, when n and p are large. We consider several different ranges of n and p. For n→∞ and p=O(n^{3/2}) we recover the Airy distribution for the path length in trees with many nodes, and also obtain higher order asymptotic results. For p→∞ and an appropriate range of n we obtain a limiting Gaussian distribution for the number of nodes in trees with large path lengths. The mean and variance are expressed in terms of the maximal root of the Airy function. Singular perturbation methods, such as asymptotic matching and WKB type expansions, are used throughout, and they are combined with more standard methods of analytic combinatorics, such as generating functions, singularity analysis, saddle point method, etc. The results are applicable to problems in information theory that involve data compression schemes which parse long sequences into shorter phrases. Numerical studies show the accuracy of the various asymptotic approximations. Key Words: Trees; Universal Types; Asymptotics; Path Length; Singular Perturbation
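To make the link between phrase parsing and path length concrete, here is a minimal sketch of incremental parsing in the style of Lempel-Ziv'78: each new phrase is the shortest prefix of the remaining input not seen before. Each phrase corresponds to a node of a digital tree at depth equal to the phrase length, so when the input ends on a phrase boundary the phrase lengths sum to the sequence length. The function name and the returned (phrases, tail) pair are assumptions for this example.

```python
def lz78_phrases(sequence):
    """Split a string into incremental-parsing phrases.

    Returns (phrases, tail), where tail is the unfinished final piece
    (empty when the input ends exactly on a phrase boundary).
    """
    seen = {""}
    phrases = []
    current = ""
    for symbol in sequence:
        current += symbol
        if current not in seen:
            # Shortest new prefix found: record it and start over.
            seen.add(current)
            phrases.append(current)
            current = ""
    return phrases, current


example_phrases, example_tail = lz78_phrases("1011010100010")
```

For the binary string above, the parse is 1, 0, 11, 01, 010, 00, 10 with an empty tail, and the phrase lengths add up to 13, the length of the input.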
The total path length of split trees
We consider the model of random trees introduced by Devroye [SIAM J. Comput.
28 (1999) 409-432]. The model encompasses many important randomized algorithms
and data structures. The pieces of data (items) are stored in a randomized
fashion in the nodes of a tree. The total path length (sum of depths of the
items) is a natural measure of the efficiency of the algorithm/data structure.
Using renewal theory, we prove convergence in distribution of the total path
length toward a distribution characterized uniquely by a fixed point equation.
Our result covers, using a unified approach, many data structures such as
binary search trees, m-ary search trees, quad trees, median-of-(2k+1) trees,
and simplex trees. Comment: Published at http://dx.doi.org/10.1214/11-AAP812 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
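For one member of this class, the binary search tree, the total path length is easy to compute directly. The sketch below does not implement the paper's renewal-theory argument or the general split-tree model. It only builds an unbalanced binary search tree from a key order, sums the node depths, and compares the exact average over all insertion orders with the classical closed form 2(n+1)H_n - 4n, which is a standard fact not stated in the abstract. All names are assumptions for illustration, and keys are assumed distinct.

```python
from fractions import Fraction
from itertools import permutations


def bst_total_path_length(keys):
    """Insert distinct keys in order into an unbalanced BST; return the sum of depths."""
    root = None  # each node is a list: [key, left_child, right_child]
    total = 0
    for key in keys:
        if root is None:
            root = [key, None, None]
            continue
        node, depth = root, 0
        while True:
            side = 1 if key < node[0] else 2
            depth += 1
            if node[side] is None:
                node[side] = [key, None, None]
                total += depth
                break
            node = node[side]
    return total


def mean_path_length(n):
    """Exact mean total path length over all n! insertion orders (small n only)."""
    orders = list(permutations(range(n)))
    return Fraction(sum(bst_total_path_length(order) for order in orders), len(orders))


def expected_path_length(n):
    """Classical closed form 2(n+1)H_n - 4n for a random binary search tree."""
    harmonic = sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))
    return 2 * (n + 1) * harmonic - 4 * n
```

Enumerating all permutations is only feasible for small n; for larger n one would sample random insertion orders instead.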
Fringe trees, Crump-Mode-Jagers branching processes and m-ary search trees
This survey studies asymptotics of random fringe trees and extended fringe
trees in random trees that can be constructed as family trees of a
Crump-Mode-Jagers branching process, stopped at a suitable time. This includes
random recursive trees, preferential attachment trees, fragmentation trees,
binary search trees and (more generally) m-ary search trees, as well as some
other classes of random trees.
We begin with general results, mainly due to Aldous (1991) and Jagers and
Nerman (1984). The general results are applied to fringe trees and extended
fringe trees for several particular types of random trees, where the theory is
developed in detail. In particular, we consider fringe trees of m-ary search
trees in detail; this seems to be new.
Various applications are given, including degree distribution, protected
nodes and maximal clades for various types of random trees. Again, we emphasise
results for m-ary search trees, and give for example new results on protected
nodes in m-ary search trees.
A separate section surveys results on height, saturation level, typical depth
and total path length, due to Devroye (1986), Biggins (1995, 1997) and others.
This survey contains well-known basic results together with some additional
general results as well as many new examples and applications for various
classes of random trees.