2 research outputs found

    Fast convergence with a greedy tag-phrase dictionary

    The best general-purpose compression schemes make their gains by estimating a probability distribution over all possible next symbols given the context established by some number of previous symbols. Such context models typically obtain good compression results for plain text by taking advantage of regularities in character sequences. Frequent word

    Fast Convergence with a Greedy Tag-Phrase Dictionary (Extended Abstract)

    Tony C. Smith and Ross Peeters
    Computer Science, University of Waikato, Hamilton, New Zealand.

    1 Introduction

    The best general-purpose compression schemes make their gains by estimating a probability distribution over all possible next symbols given the context established by some number of previous symbols. Such context models typically obtain good compression results for plain text by taking advantage of regularities in character sequences. Frequent words and syllables can be incorporated into the model quickly and thereafter used for reasonably accurate prediction. However, the precise context in which frequent patterns emerge is often extremely varied, and each new word or phrase immediately introduces new contexts which can adversely affect the compression rate. A great deal of the structural regularity in a natural language is given rather more by properties of its grammar than by the orthographic transcription of its phonology. This implies that access ..
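
The context-modelling idea described in the introduction can be illustrated with a minimal sketch. This is not the authors' tag-phrase method: it is a plain order-k character model, and the names `ContextModel` and `estimated_bits`, the 256-symbol alphabet, and the add-one smoothing are all assumptions made for illustration. Schemes such as PPM use escape mechanisms instead of add-one smoothing.

```python
import math
from collections import Counter, defaultdict


class ContextModel:
    """Hypothetical order-k model: counts which symbol follows each context."""

    def __init__(self, order, alphabet_size=256):
        self.order = order
        self.alphabet_size = alphabet_size  # assumption: byte-sized alphabet
        self.counts = defaultdict(Counter)

    def probability(self, context, symbol):
        # Add-one smoothing (an assumption) keeps unseen symbols codable.
        seen = self.counts[context]
        return (seen[symbol] + 1) / (sum(seen.values()) + self.alphabet_size)

    def update(self, context, symbol):
        self.counts[context][symbol] += 1


def estimated_bits(text, order=2):
    """Ideal adaptive code length: the sum of -log2 p(symbol | context)."""
    model = ContextModel(order)
    total = 0.0
    for i, symbol in enumerate(text):
        # The context is the (up to) `order` symbols preceding position i.
        context = text[max(0, i - order):i]
        total += -math.log2(model.probability(context, symbol))
        model.update(context, symbol)  # learn only after coding the symbol
    return total
```

Each previously unseen context starts from a uniform estimate, so every new word or phrase costs extra bits until its contexts have been observed. This is the slow-convergence effect the introduction attributes to varied contexts.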