The Distribution of heights of binary trees and other simple trees
The number of binary trees of fixed size and given height is estimated asymptotically near the peak of the distribution. There, a local limit theorem with convergence to a theta law is established. Large deviation bounds corresponding to large heights and small heights are also derived. The methods, based on the analysis of singular iterations, apply to any simple family of trees.
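For small sizes, the quantity this abstract estimates can be computed exactly. The sketch below is illustrative only and is not the paper's method: it assumes that size means the number of nodes, that the empty tree has height 0 and a single node has height 1, and the function name `height_distribution` is hypothetical. It iterates the recurrence for trees of height at most h, which is the kind of iteration the abstract analyses asymptotically.

```python
def height_distribution(n):
    """Count binary trees with n nodes by exact height.

    Assumed conventions (not taken from the paper): the empty tree has
    height 0 and a single node has height 1.
    Returns a dict {height: number of trees}.
    """
    if n == 0:
        return {0: 1}
    # at_most[m] = number of trees with m nodes and height <= h (starting at h = 0)
    at_most = [1] + [0] * n
    dist = {}
    prev = 0  # number of n-node trees with height <= h - 1
    for h in range(1, n + 1):
        nxt = [1] + [0] * n
        for m in range(1, n + 1):
            # A root plus a left subtree (k nodes) and a right subtree
            # (m - 1 - k nodes), each of height at most h - 1.
            nxt[m] = sum(at_most[k] * at_most[m - 1 - k] for k in range(m))
        at_most = nxt
        exact = at_most[n] - prev
        if exact:
            dist[h] = exact
        prev = at_most[n]
    return dist


if __name__ == "__main__":
    print(height_distribution(3))
```

Summing the counts over all heights gives the Catalan number for n, which serves as a quick sanity check.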
Optimal Hierarchical Layouts for Cache-Oblivious Search Trees
This paper proposes a general framework for generating cache-oblivious
layouts for binary search trees. A cache-oblivious layout attempts to minimize
cache misses on any hierarchical memory, independent of the number of memory
levels and attributes at each level such as cache size, line size, and
replacement policy. Recursively partitioning a tree into contiguous subtrees
and prescribing an ordering amongst the subtrees, Hierarchical Layouts
generalize many commonly used layouts for trees such as in-order, pre-order and
breadth-first. They also generalize the various flavors of the van Emde Boas
layout, which have previously been used as cache-oblivious layouts.
Hierarchical Layouts thus unify all previous attempts at deriving layouts for
search trees.
The paper then derives a new locality measure (the Weighted Edge Product)
that mimics the probability of cache misses at multiple levels, and shows that
layouts that reduce this measure perform better. We analyze the various degrees
of freedom in the construction of Hierarchical Layouts, and investigate the
relative effect of each of these decisions in the construction of
cache-oblivious layouts. Optimizing the Weighted Edge Product for complete
binary search trees, we introduce the MinWEP layout, and show that it
outperforms previously used cache-oblivious layouts by almost 20%.
Comment: Extended version with proofs added to the appendix
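As a concrete instance of the recursive partitioning this abstract describes, here is a minimal sketch of one flavor of the van Emde Boas layout for a complete binary tree. It is not the paper's MinWEP layout or its framework. The assumptions are that nodes are numbered in breadth-first (heap) order starting at 1, that the top part of each split gets the floor of half the height, and that bottom subtrees are emitted left to right; the function name `veb_order` is hypothetical.

```python
def veb_order(root, height):
    """Return heap indices of a complete binary subtree in van Emde Boas order.

    root   -- heap index of the subtree root (children of i are 2i and 2i + 1)
    height -- number of levels in the subtree (must be >= 1)

    Assumed split rule: the top part gets height // 2 levels.
    """
    if height == 1:
        return [root]
    top = height // 2
    bottom = height - top
    # Lay out the top subtree first, contiguously.
    order = veb_order(root, top)
    # The bottom subtrees are rooted at the descendants of `root` that sit
    # `top` levels below it; in heap numbering these form a contiguous range.
    first = root << top
    for r in range(first, first + (1 << top)):
        order.extend(veb_order(r, bottom))
    return order


if __name__ == "__main__":
    print(veb_order(1, 4))
```

Each recursive call emits its subtree as one contiguous run, which is what keeps root-to-leaf paths within few memory blocks at every block size.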
Extreme Value Statistics and Traveling Fronts: Various Applications
An intriguing connection between extreme value statistics and traveling
fronts has been found recently in a number of diverse problems. In this brief
review we outline a few such problems and consider their various applications.
Comment: A brief review (6 pages, 2 figures) to appear in Physica A as part of
the proceedings of Statphys-Kolkata IV (2002)
The Brownian limit of separable permutations
We study random uniform permutations in an important class of
pattern-avoiding permutations: the separable permutations. We describe the
asymptotics of the number of occurrences of any fixed given pattern in such a
random permutation in terms of the Brownian excursion. In the recent
terminology of permutons, our work can be interpreted as the convergence of
uniform random separable permutations towards a "Brownian separable permuton".
Comment: 45 pages, 14 figures, incorporating referee's suggestion
Growth of the Brownian forest
Trees in Brownian excursions have been studied since the late 1980s. Forests
in excursions of Brownian motion above its past minimum are a natural extension
of this notion. In this paper we study a forest-valued Markov process which
describes the growth of the Brownian forest. The key result is a composition
rule for binary Galton--Watson forests with i.i.d. exponential branch lengths.
We give elementary proofs of this composition rule and explain how it is
intimately linked with Williams' decomposition for Brownian motion with drift.
Comment: Published at http://dx.doi.org/10.1214/009117905000000422 in the
Annals of Probability (http://www.imstat.org/aop/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Computational aspects of DNA mixture analysis
Statistical analysis of DNA mixtures is known to pose computational
challenges due to the enormous state space of possible DNA profiles. We propose
a Bayesian network representation for genotypes, allowing computations to be
performed locally involving only a few alleles at each step. In addition, we
describe a general method for computing the expectation of a product of
discrete random variables using auxiliary variables and probability propagation
in a Bayesian network, which in combination with the genotype network allows
efficient computation of the likelihood function and various other quantities
relevant to the inference. Lastly, we introduce a set of diagnostic tools for
assessing the adequacy of the model for describing a particular dataset.
Asymptotic genealogy of a critical branching process
Consider a continuous-time binary branching process conditioned to have
population size n at some time t, and with a chance p for recording each
extinct individual in the process. Within the family tree of this process, we
consider the smallest subtree containing the genealogy of the extant
individuals together with the genealogy of the recorded extinct individuals. We
introduce a novel representation of such subtrees in terms of a point-process,
and provide asymptotic results on the distribution of this point-process as the
number of extant individuals increases. We motivate the study within the scope
of a coherent analysis for an a priori model for macroevolution.
Comment: 30 pages