    Optimal Alphabetic Ternary Trees

    We give a new algorithm to construct optimal alphabetic ternary trees, where every internal node has at most three children. This algorithm generalizes the classic Hu-Tucker algorithm, though the overall computational complexity has yet to be determined.
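    For context, an alphabetic ternary tree keeps the leaves in the given symbol order while letting each internal node have up to three children, and the cost being minimized is the weighted path length. Below is a minimal Python sketch of that cost function only; the tree and weights are made-up illustrations, not the paper's construction algorithm.

    ```python
    def weighted_path_length(tree, weights, depth=0):
        """Cost of an alphabetic ternary tree: the sum over leaves of
        weight * depth.  A leaf is an index into `weights`; an internal
        node is a list of at most three subtrees, and reading the leaves
        left to right preserves the original (alphabetic) order."""
        if isinstance(tree, int):                        # leaf
            return weights[tree] * depth
        assert 1 <= len(tree) <= 3, "at most three children per internal node"
        return sum(weighted_path_length(t, weights, depth + 1) for t in tree)

    # Made-up example with five symbols, listed in alphabetic order.
    weights = [3, 1, 4, 1, 5]                   # one weight per symbol
    tree = [[0, 1], 2, [3, 4]]                  # leaves 0..4 appear left to right
    print(weighted_path_length(tree, weights))  # 3*2 + 1*2 + 4*1 + 1*2 + 5*2 = 24
    ```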

    Sorting a Low-Entropy Sequence

    We give the first sorting algorithm with bounds in terms of higher-order entropies: let $S$ be a sequence of length $m$ containing $n$ distinct elements and let $H_\ell(S)$ be the $\ell$th-order empirical entropy of $S$, with $n^{\ell + 1} \log n \in O(m)$; our algorithm sorts $S$ using $(H_\ell(S) + O(1))\,m$ comparisons.
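    The $\ell$th-order empirical entropy in this bound can be computed straight from its definition: the average zeroth-order entropy of the symbols that follow each length-$\ell$ context, weighted by how often each context occurs. A minimal Python sketch of just that computation (not the sorting algorithm itself):

    ```python
    import math
    from collections import Counter, defaultdict

    def h0(seq):
        """Zeroth-order empirical entropy of a sequence, in bits per symbol."""
        if not seq:
            return 0.0
        m, counts = len(seq), Counter(seq)
        return sum((c / m) * math.log2(m / c) for c in counts.values())

    def h_ell(seq, ell):
        """ell-th order empirical entropy: average H0 of the symbols that
        follow each length-ell context, weighted by context frequency."""
        if ell == 0:
            return h0(seq)
        followers = defaultdict(list)
        for i in range(len(seq) - ell):
            followers[tuple(seq[i:i + ell])].append(seq[i + ell])
        return sum(len(f) * h0(f) for f in followers.values()) / len(seq)

    # A low-entropy example: each symbol is almost determined by the previous
    # one, so the first-order entropy is far below the zeroth-order entropy.
    S = list("abababababababababcb")
    print(h0(S), h_ell(S, 1))
    ```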

    Codes: unequal probabilities, unequal letter costs

    The construction of alphabetic prefix codes with unequal letter costs and unequal probabilities is considered. A variant of the noiseless coding theorem is proved, giving closely matching lower and upper bounds for the cost of the optimal code. Furthermore, an algorithm is described which constructs a nearly optimal code in linear time.
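    The quantity being optimized is the expected cost of the code: each codeword costs the sum of its letters' costs, weighted by the symbol's probability. A minimal Python sketch of that objective follows; the letter costs, probabilities, and code below are made-up examples, not taken from the paper.

    ```python
    def codeword_cost(word, letter_costs):
        """Cost of one codeword: the sum of the costs of its letters."""
        return sum(letter_costs[ch] for ch in word)

    def expected_cost(code, probabilities, letter_costs):
        """Expected cost of a code: probability-weighted codeword cost,
        the quantity an optimal code minimizes."""
        return sum(p * codeword_cost(w, letter_costs)
                   for w, p in zip(code, probabilities))

    # Made-up example: letter 'a' is cheaper than letter 'b', and the code is
    # prefix-free and alphabetic (codewords in lexicographic order).
    letter_costs = {"a": 1.0, "b": 2.0}
    probabilities = [0.5, 0.3, 0.2]
    code = ["a", "ba", "bb"]
    print(expected_cost(code, probabilities, letter_costs))  # 0.5*1 + 0.3*3 + 0.2*4 = 2.2
    ```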

    New Algorithms and Lower Bounds for Sequential-Access Data Compression

    This thesis concerns sequential-access data compression, i.e., compression by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by character, outputting each character's self-delimiting codeword before reading the next one. We show how to encode and decode each character in constant worst-case time while producing an encoding whose length is worst-case optimal. In another chapter we consider one-pass compression with memory bounded in terms of the alphabet size and context length, and prove a nearly tight tradeoff between the amount of memory we can use and the quality of the compression we can achieve. In a third chapter we consider compression in the read/write streams model, which allows a number of passes and an amount of memory that are both polylogarithmic in the size of the input. We first show how to achieve universal compression using only one pass over one stream. We then show that one stream is not sufficient for achieving good grammar-based compression. Finally, we show that two streams are necessary and sufficient for achieving entropy-only bounds. Comment: draft of PhD thesis.
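    As one concrete illustration of the adaptive-prefix-coding setting from the first chapter, where each character's self-delimiting codeword must be emitted before the next character is read, here is a minimal one-pass encoder built from move-to-front ranks and Elias gamma codes. This only illustrates the setting's constraints; it is not the thesis's constant-worst-case-time algorithm, and for simplicity the alphabet is assumed to be known in advance.

    ```python
    def elias_gamma(n):
        """Elias gamma code of a positive integer: a self-delimiting,
        prefix-free bit string (unary length prefix + binary body)."""
        b = bin(n)[2:]
        return "0" * (len(b) - 1) + b

    def mtf_encode(text):
        """One-pass adaptive coding: emit each character's codeword before
        reading the next one.  Ranks come from a move-to-front list, so
        recently seen characters get short codewords; a decoder can mirror
        the same table updates to invert the process."""
        table = sorted(set(text))      # assumption: alphabet known up front
        bits = []
        for ch in text:
            rank = table.index(ch)     # 0-based rank in the current table
            bits.append(elias_gamma(rank + 1))
            table.pop(rank)            # move the character to the front
            table.insert(0, ch)
        return "".join(bits)

    print(mtf_encode("abracadabra"))
    ```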

    GPML: an XML-based standard for the interchange of genetic programming trees

    We propose a Genetic Programming Markup Language (GPML), an XML-based standard for the interchange of genetic programming trees, and outline the benefits such a format would bring in allowing the deployment of trained genetic programming (GP) models in applications, as well as the subsidiary benefit of allowing GP researchers to share trained trees directly. We present a formal definition of this standard and describe details of an implementation. In addition, we present a case study where GPML is used to implement a model predictive controller for the control of a building heating plant.
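    As a rough illustration of what an XML interchange format for GP trees involves, the sketch below serializes a small expression tree with Python's standard library. The element and attribute names are hypothetical placeholders, not the GPML schema defined in the paper.

    ```python
    import xml.etree.ElementTree as ET

    def tree_to_xml(node):
        """Serialize a GP expression tree to XML.  Tags and attributes here
        ("node", "function", "terminal", "value") are made-up placeholders,
        not the actual GPML schema."""
        if isinstance(node, tuple):                 # (function, child, ...)
            elem = ET.Element("node", {"function": node[0]})
            for child in node[1:]:
                elem.append(tree_to_xml(child))
        else:                                       # terminal: variable or constant
            elem = ET.Element("terminal", {"value": str(node)})
        return elem

    # Hypothetical GP tree for the expression (x + 3) * y.
    gp_tree = ("mul", ("add", "x", 3), "y")
    print(ET.tostring(tree_to_xml(gp_tree), encoding="unicode"))
    ```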
