
    Typical Depth of a Digital Search Tree built on a general source

    The digital search tree (DST) plays a central role in compression algorithms of Lempel-Ziv type. This important structure can be viewed as a mixture of a digital structure (the trie) and a binary search tree. Its probabilistic analysis is thus involved, even when the text is produced by a simple source (a memoryless source or a Markov chain). After the seminal paper of Flajolet and Sedgewick (1986) [11], which deals with the memoryless unbiased case, many papers, due to Drmota, Jacquet, Louchard, Prodinger, Szpankowski, and Tang, published between 1990 and 2005, dealt with general memoryless sources or Markov chains and analyzed the main parameters of DSTs, namely internal path length, profile, and typical depth (see for instance [7, 15, 14]). Here, we are interested in a more realistic analysis, where the words are emitted by a general source, in which the emission of symbols may depend on the whole previous history. Previous analyses of text algorithms or digital structures have been performed for general sources, for instance for tries ([3, 2]) and for basic sorting and searching algorithms ([22, 4]). However, the case of digital search trees has not yet been considered, and this is the main subject of the paper. The idea of this study is due to Philippe Flajolet, and the first steps of the work were performed with him at the end of 2010.
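To make the structure concrete, here is a minimal Python sketch (not taken from the paper) of how a binary DST mixes trie and BST behavior: as in a BST, each node stores a whole word, but as in a trie, the descent at depth i is steered by the i-th symbol of the inserted word rather than by comparisons. The `depth_of` helper illustrates the "typical depth" parameter analyzed in the paper; all names here are illustrative.

```python
class Node:
    """A DST node stores a complete word, unlike a trie node."""
    def __init__(self, word):
        self.word = word
        self.left = None   # followed when the current bit is '0'
        self.right = None  # followed when the current bit is '1'

def insert(root, word):
    """Insert a binary string: bit i of `word` steers the step at depth i,
    and the word settles into the first empty slot on its path."""
    if root is None:
        return Node(word)
    node, depth = root, 0
    while True:
        side = 'left' if word[depth] == '0' else 'right'
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(word))
            return root
        node, depth = child, depth + 1

def depth_of(root, word):
    """Depth at which `word` was stored -- the parameter whose
    distribution ('typical depth') the paper studies."""
    node, d = root, 0
    while node.word != word:
        node = node.left if word[d] == '0' else node.right
        d += 1
    return d
```

For example, inserting "0000", "0100", "1000", "0010" in that order places the first word at the root (depth 0), the next two at depth 1, and "0010" at depth 2 below "0100".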

    Process convergence for the complexity of Radix Selection on Markov sources

    A fundamental algorithm for selecting ranks from a finite subset of an ordered set is Radix Selection. This algorithm requires the data to be given as strings of symbols over an ordered alphabet, e.g., binary expansions of real numbers. Its complexity is measured by the number of symbols that have to be read. In this paper, the model of independent data identically generated from a Markov chain is considered. The complexity is studied as a stochastic process indexed by the set of infinite strings over the given alphabet. The orders of mean and variance of the complexity and, after normalization, a limit theorem with a centered Gaussian process as limit are derived. This implies an analysis for two standard models for the ranks: uniformly chosen ranks, also called grand averages, and the worst-case rank complexities, which are of interest in computer science. For uniform data and the asymmetric Bernoulli model (i.e., memoryless sources), we also find weak convergence for the normalized process of complexities when indexed by the ranks, while for more general Markov sources these processes are not tight under the standard normalizations.
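As a rough illustration of the algorithm and its cost measure (a sketch under simplifying assumptions, not the paper's formulation), the following Python function selects the string of a given rank from distinct binary strings by bucketing on one symbol per level and recursing into the bucket containing the rank; the returned counter tallies the symbols read, which is exactly the complexity studied above.

```python
def radix_select(strings, rank, depth=0, stats=None):
    """Return (string of given 0-based rank, number of symbols read).

    Assumes distinct binary strings long enough to be separated;
    one symbol per string is read at each recursion level."""
    if stats is None:
        stats = {'reads': 0}
    if len(strings) == 1:
        return strings[0], stats['reads']
    buckets = {'0': [], '1': []}
    for s in strings:
        stats['reads'] += 1        # cost: one symbol read per string
        buckets[s[depth]].append(s)
    if rank < len(buckets['0']):
        return radix_select(buckets['0'], rank, depth + 1, stats)
    return radix_select(buckets['1'], rank - len(buckets['0']),
                        depth + 1, stats)
```

For instance, selecting rank 0 among ["0110", "0001", "1010", "0100"] returns "0001" after 7 symbol reads: 4 at the first level, then 3 inside the '0' bucket.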