Distributional convergence for the number of symbol comparisons used by QuickSelect
When the search algorithm QuickSelect compares keys during its execution in
order to find a key of target rank, it must operate on the keys'
representations or internal structures, which were ignored by the previous
studies that quantified the execution cost for the algorithm in terms of the
number of required key comparisons. In this paper, we analyze running costs for
the algorithm that take into account not only the number of key comparisons but
also the cost of each key comparison. We suppose that keys are represented as
sequences of symbols generated by various probabilistic sources and that
QuickSelect operates on individual symbols in order to find the target key. We
identify limiting distributions for the costs and derive integral and series
expressions for the expectations of the limiting distributions. These
expressions are used to recapture previously obtained results on the number of
key comparisons required by the algorithm. Comment: The first paragraph in the proof of Theorem 3.1 has been corrected in
this revision, and references have been updated.
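As a rough illustration of the cost model described in this abstract, the sketch below compares two keys symbol by symbol and reports how many symbol comparisons were needed. It is only a minimal sketch under assumptions: the function name `compare_symbols` is hypothetical, keys are taken to be finite Python strings rather than sequences drawn from a probabilistic source, and the convention of counting every examined position (including the first differing one) is an assumption, not necessarily the paper's exact definition.

```python
def compare_symbols(a, b):
    """Compare two keys symbol by symbol (hypothetical helper).

    Returns (order, symbols): order is -1, 0 or 1, and symbols is the
    number of symbol comparisons performed before the order was decided.
    """
    symbols = 0
    for x, y in zip(a, b):
        symbols += 1
        if x != y:
            return (-1 if x < y else 1), symbols
    # One key is a prefix of the other (or they are equal): decide by length.
    order = (len(a) > len(b)) - (len(a) < len(b))
    return order, symbols


if __name__ == "__main__":
    # Keys sharing a long prefix cost more than keys that differ immediately.
    print(compare_symbols("0110", "0101"))
    print(compare_symbols("1000", "0111"))
```

Under this convention a single key comparison can cost anywhere from one symbol comparison up to the length of the shorter key, which is why counting key comparisons alone can understate the running cost.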
The mean, variance and limiting distribution of two statistics sensitive to phylogenetic tree balance
For two decades, the Colless index has been the most frequently used
statistic for assessing the balance of phylogenetic trees. In this article,
this statistic is studied under the Yule and uniform models of phylogenetic
trees. The main tool of analysis is a coupling argument with another well-known
index called the Sackin statistic. Asymptotics for the mean, variance and
covariance of these two statistics are obtained, as well as their limiting
joint distribution for large phylogenies. Under the Yule model, the limiting
distribution arises as a solution of a functional fixed point equation. Under
the uniform model, the limiting distribution is the Airy distribution. The
cornerstone of this study is the fact that the probabilistic models for
phylogenetic trees are strongly related to the random permutation and the
Catalan models for binary search trees. Comment: Published at http://dx.doi.org/10.1214/105051606000000547 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org)
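For readers unfamiliar with the two statistics named in this abstract, the following minimal sketch computes both on a small rooted binary tree. It assumes the standard definitions: the Colless index sums, over internal nodes, the absolute difference between the leaf counts of the two subtrees, and the Sackin statistic sums the depths of all leaves. The nested-tuple tree encoding and the function name `leaves_colless_sackin` are illustrative assumptions, not taken from the paper.

```python
def leaves_colless_sackin(tree):
    """Return (leaf count, Colless index, Sackin statistic) for a binary tree.

    Assumed encoding: a leaf is any non-tuple value; an internal node is a
    pair (left, right).
    """
    if not isinstance(tree, tuple):
        return 1, 0, 0
    left, right = tree
    n_left, colless_left, sackin_left = leaves_colless_sackin(left)
    n_right, colless_right, sackin_right = leaves_colless_sackin(right)
    n = n_left + n_right
    # Colless: add the imbalance |L - R| contributed by this node.
    colless = colless_left + colless_right + abs(n_left - n_right)
    # Sackin: every leaf below this node becomes one level deeper.
    sackin = sackin_left + sackin_right + n
    return n, colless, sackin


if __name__ == "__main__":
    caterpillar = ((("a", "b"), "c"), "d")   # maximally unbalanced on 4 leaves
    balanced = (("a", "b"), ("c", "d"))      # perfectly balanced on 4 leaves
    print(leaves_colless_sackin(caterpillar))
    print(leaves_colless_sackin(balanced))
```

Both statistics are computed in one recursive pass because each depends only on the leaf counts of the subtrees.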
QuickHeapsort: Modifications and improved analysis
We present a new analysis of QuickHeapsort, splitting it into the analysis of
the partition phases and the analysis of the heap phases. This enables us to
consider samples of non-constant size for the pivot selection and leads to
better theoretical bounds for the algorithm. Furthermore, we introduce some
modifications of QuickHeapsort, both in-place and using n extra bits. We show
that on every input the expected number of comparisons is n lg n - 0.03n + o(n)
(in-place) and n lg n - 0.997n + o(n) (using n extra bits), respectively. Both
estimates improve the previously best known results. (It is conjectured in
Wegener93 that the in-place algorithm Bottom-Up-Heapsort uses at most
n lg n + 0.4n comparisons on average, and for Weak-Heapsort, which uses n extra
bits, the average number of comparisons is at most n lg n - 0.42n in
EdelkampS02.) Moreover, our non-in-place variant can even compete with
index-based Heapsort variants (e.g. Rank-Heapsort in WangW07) and
Relaxed-Weak-Heapsort (n lg n - 0.9n + o(n) comparisons in the worst case), for
which no O(n) bound on the number of extra bits is known.
On smoothed analysis of quicksort and Hoare's find
We provide a smoothed analysis of Hoare's find algorithm, and we revisit the smoothed analysis of quicksort. Hoare's find algorithm, often called quickselect or one-sided quicksort, is an easy-to-implement algorithm for finding the k-th smallest element of a sequence. While the worst-case number of comparisons that Hoare's find needs is Theta(n^2), the average-case number is Theta(n). We analyze what happens between these two extremes by providing a smoothed analysis. In the first perturbation model, an adversary specifies a sequence of n numbers in [0,1], and then, to each number of the sequence, we add a random number drawn independently from the interval [0,d]. We prove that Hoare's find needs Theta(n/(d+1) sqrt(n/d) + n) comparisons in expectation if the adversary may also specify the target element (even after seeing the perturbed sequence) and slightly fewer comparisons for finding the median. In the second perturbation model, each element is marked with a probability of p, and then a random permutation is applied to the marked elements. We prove that the expected number of comparisons to find the median is Omega((1−p)n/p log n). Finally, we provide lower bounds for the smoothed number of comparisons of quicksort and Hoare's find for the median-of-three pivot rule, which usually yields faster algorithms than always selecting the first element: the pivot is the median of the first, middle, and last element of the sequence. We show that median-of-three does not yield a significant improvement over the classic rule.
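The setting of this abstract can be sketched in a few lines: Hoare's find with the classic first-element pivot rule, instrumented to count key comparisons, together with the first (additive) perturbation model. This is a minimal sketch under assumptions. The names `hoare_find` and `perturb` are hypothetical, the noise is assumed to be uniform on [0, d] (the abstract only says it is drawn from that interval), and the list-based partitioning is chosen for readability rather than taken from the paper.

```python
import random


def hoare_find(values, k):
    """Return (k-th smallest value, key comparisons); k is 0-based.

    Classic pivot rule: the first element of the current sequence.
    """
    items = list(values)
    comparisons = 0
    while True:
        pivot = items[0]
        smaller, larger = [], []
        for x in items[1:]:
            comparisons += 1          # one key comparison against the pivot
            if x < pivot:
                smaller.append(x)
            else:
                larger.append(x)
        if k < len(smaller):
            items = smaller
        elif k == len(smaller):
            return pivot, comparisons
        else:
            k -= len(smaller) + 1     # skip the smaller part and the pivot
            items = larger


def perturb(sequence, d, rng):
    """Additive perturbation: add independent noise from [0, d] to each number.

    Uniform noise is an assumption made for this sketch.
    """
    return [x + rng.uniform(0.0, d) for x in sequence]


if __name__ == "__main__":
    rng = random.Random(1)
    n = 200
    adversarial = [i / n for i in range(n)]   # sorted input in [0, 1]
    for d in (0.0, 0.1, 1.0):
        _, cost = hoare_find(perturb(adversarial, d, rng), n - 1)
        print(d, cost)
```

With d = 0 the sorted input forces n(n-1)/2 comparisons when searching for the maximum, which is the quadratic worst case mentioned in the abstract; larger d should typically reduce the count.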