34 research outputs found

The expected profile of digital search trees

Author: Drmota Michael
Szpankowski Wojciech
Publication venue: Elsevier Inc.

Abstract: A digital search tree (DST) is a fundamental data structure on words that finds various applications, from the popular Lempel–Ziv'78 data compression scheme to distributed hash tables. The profile of a DST measures the number of nodes at the same distance from the root; it depends on the number of stored strings and the distance from the root. Most parameters of a DST (e.g., depth, height, fillup) can be expressed in terms of the profile. We study here the asymptotics of the average profile in a DST built from sequences generated independently by a memoryless source. After representing the average profile by a recurrence, we solve it using a wide range of analytic tools. This analysis is surprisingly demanding, but once it is carried out it reveals an unusually intriguing and interesting behavior. The average profile undergoes phase transitions when moving from the root to the longest path: at first it resembles a full tree, until it abruptly starts growing polynomially and oscillating in this range. These results are derived by methods of analytic combinatorics such as generating functions, the Mellin transform, poissonization and depoissonization, the saddle point method, singularity analysis and uniform asymptotic analysis.
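The profile described in this abstract can be illustrated with a small sketch. The code below is not taken from the paper: it is a minimal Python illustration, assuming binary words given as strings of '0' and '1' that are distinct and long enough that an insertion never runs out of bits. The names `build_dst`, `profile` and `random_word` are hypothetical helpers introduced here.

```python
import random


def build_dst(words):
    """Insert binary words one by one into a digital search tree.

    Each node stores one word. At depth d the d-th bit of the new word
    decides whether to go to the '0' child or the '1' child.
    Assumption: words are distinct and long enough for the tree built.
    """
    root = None
    for w in words:
        node = {"key": w, "0": None, "1": None}
        if root is None:
            root = node
            continue
        cur, depth = root, 0
        while True:
            bit = w[depth]
            if cur[bit] is None:
                cur[bit] = node  # first free slot on the path: store here
                break
            cur = cur[bit]
            depth += 1
    return root


def profile(root):
    """Return a list whose k-th entry is the number of nodes at depth k."""
    counts = []
    level = [root] if root is not None else []
    while level:
        counts.append(len(level))
        level = [c for n in level for c in (n["0"], n["1"]) if c is not None]
    return counts


def random_word(p, length, rng):
    """One word from a memoryless source: each bit is '1' with probability p."""
    return "".join("1" if rng.random() < p else "0" for _ in range(length))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so that runs are repeatable
    words = [random_word(0.5, 64, rng) for _ in range(1000)]
    print(profile(build_dst(words)))
```

Averaging `profile` over many independent runs gives an empirical estimate of the average profile; the first levels are typically full (1, 2, 4, ...) before the counts fall off.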

Elsevier - Publisher Connector

Phase Transition in the Aldous-Shields Model of Growing Trees

Author: B. Chauvin
B. Pittel
D. Aldous
D. S. Dean
D. Wilkinson
David S. Dean
H. Mahmoud
H.-H. Chern
J. A. Fill
J. Vannimenus
J. Ziv
J.-P. Bouchaud
P. Flajolet
R. Albert
R. M. Bradley
R. Sedgewick
S. N. Majumdar
Satya N. Majumdar
T. A. Witten
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/10/2005

We study analytically the late time statistics of the number of particles in a growing tree model introduced by Aldous and Shields. In this model, a cluster grows in continuous time on a binary Cayley tree, starting from the root, by absorbing new particles at the empty perimeter sites at a rate proportional to c^{-l}, where c is a positive parameter and l is the distance of the perimeter site from the root. For c=1, this model corresponds to random binary search trees and for c=2 it corresponds to digital search trees in computer science. By introducing a backward Fokker-Planck approach, we calculate the mean and the variance of the number of particles at large times and show that the variance undergoes a 'phase transition' at a critical value c=\sqrt{2}. While for c>\sqrt{2} the variance is proportional to the mean and the distribution is normal, for c<\sqrt{2} the variance is anomalously large and the distribution is non-Gaussian due to the appearance of extreme fluctuations. The model is generalized to one where growth occurs on a tree with m branches and, in this more general case, we show that the critical point occurs at c=\sqrt{m}.
Comment: LaTeX, 17 pages, 6 figures

arXiv.org e-Print Archive

Crossref

HAL-INSA Toulouse


Parallel data compression

Author: Hirschberg Daniel S.
Stauffer Lynn M.
Publication venue: eScholarship, University of California
Publication date: 01/05/1991

Data compression schemes remove data redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported and areas for future research are suggested.

eScholarship - University of California

Compressed and Practical Data Structures for Strings

Author: Christiansen Anders Roy
Publication venue: DTU Compute
Publication date: 01/01/2018

Online Research Database In Technology

Asymptotic variance of random symmetric digital search trees

Author: Hsien-Kuei Hwang
Michael Fuchs
Vytas Zacharovas
Publication venue: Discrete Mathematics & Theoretical Computer Science
Publication date: 01/01/2010

Dedicated to the 60th birthday of Philippe Flajolet

Directory of Open Access Journals

Enumeration of Binary Trees and Universal Types

Author: Charles Knessl
Wojciech Szpankowski
Publication venue: Discrete Mathematics & Theoretical Computer Science
Publication date: 01/01/2005

Binary unlabeled ordered trees (further called binary trees) have been studied at least since Euler, who enumerated them. The number of such trees with n nodes is now known as the Catalan number. Over the years various interesting questions about the statistics of such trees were investigated (e.g., height and path length distributions for a randomly selected tree). Binary trees find an abundance of applications in computer science. However, recently Seroussi posed a new and interesting problem motivated by information theory considerations: how many binary trees of a given path length (sum of depths) are there? This question arose in the study of universal types of sequences. Two sequences of length p have the same universal type if they generate the same set of phrases in the incremental parsing of the Lempel-Ziv'78 scheme, since one proves that such sequences converge to the same empirical distribution. It turns out that the number of distinct types of sequences of length p corresponds to the number of binary (unlabeled and ordered) trees, T_p, of given path length p (and also the number of distinct Lempel-Ziv'78 parsings of length p sequences). We first show that the number of binary trees with given path length p is asymptotically equal to $T_p \sim 2^{\frac{2p}{\log_2 p}\left(1+O(\log^{-2/3} p)\right)}$. Then we establish various limiting distributions for the number of nodes (number of phrases in the Lempel-Ziv'78 scheme) when a tree is selected randomly among all trees of given path length p. Throughout, we use methods of analytic algorithmics such as generating functions and complex asymptotics, as well as methods of applied mathematics such as the WKB method and matched asymptotics.
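For small p, the count of binary trees with a given path length can be checked by brute-force dynamic programming. The sketch below is not the paper's method (which is asymptotic); it is a minimal Python illustration that assumes the empty tree is excluded from the count and uses the fact that a tree with n nodes, left subtree L and right subtree R has path length pl(L) + pl(R) + (n - 1). The function names are hypothetical.

```python
from collections import defaultdict


def trees_by_nodes_and_path_length(max_p):
    """t[n][p] = number of binary trees with n nodes and path length p.

    Rows are truncated to p <= max_p. Since every non-root node has depth
    at least 1, a tree with n nodes has path length >= n - 1, so only
    n <= max_p + 1 can contribute.
    """
    t = [{0: 1}]  # n = 0: the empty tree, used only as a subtree
    for n in range(1, max_p + 2):
        row = defaultdict(int)
        for k in range(n):  # k nodes in the left subtree, n - 1 - k in the right
            for q1, c1 in t[k].items():
                for q2, c2 in t[n - 1 - k].items():
                    # every non-root node sits one level deeper than in its subtree
                    p = q1 + q2 + (n - 1)
                    if p <= max_p:
                        row[p] += c1 * c2
        t.append(dict(row))
    return t


def count_by_path_length(max_p):
    """Return [T_0, ..., T_max_p], summing over all node counts n >= 1."""
    t = trees_by_nodes_and_path_length(max_p)
    totals = [0] * (max_p + 1)
    for n in range(1, len(t)):
        for p, c in t[n].items():
            totals[p] += c
    return totals


if __name__ == "__main__":
    print(count_by_path_length(6))
```

As a sanity check, summing a complete row `t[n]` over all path lengths recovers the Catalan number for n nodes, for example 5 for n = 3 and 14 for n = 4.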

Directory of Open Access Journals

28th Annual Symposium on Combinatorial Pattern Matching : CPM 2017, July 4-6, 2017, Warsaw, Poland

Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 01/07/2017

Peer reviewed

Helsingin yliopiston digitaalinen arkisto