2 research outputs found

Fast convergence with a greedy tag-phrase dictionary

Authors: Ross Peeters, Tony C. Smith
Publication venue: IEEE Computer Society
Publication date: 01/11/1997
The best general-purpose compression schemes make their gains by estimating a probability distribution over all possible next symbols given the context established by some number of previous symbols. Such context models typically obtain good compression results for plain text by taking advantage of regularities in character sequences. Frequent word

CiteSeerX

Research Commons@Waikato

Fast Convergence with a Greedy Tag-Phrase Dictionary (Extended Abstract)

Authors: Ross Peeters, Tony C. Smith
Publication venue: IEEE Computer Society

Tony C. Smith and Ross Peeters, Computer Science, University of Waikato, Hamilton, New Zealand.

1 Introduction

The best general-purpose compression schemes make their gains by estimating a probability distribution over all possible next symbols given the context established by some number of previous symbols. Such context models typically obtain good compression results for plain text by taking advantage of regularities in character sequences. Frequent words and syllables can be incorporated into the model quickly and thereafter used for reasonably accurate prediction. However, the precise context in which frequent patterns emerge is often extremely varied, and each new word or phrase immediately introduces new contexts which can adversely affect the compression rate. A great deal of the structural regularity in a natural language is given rather more by properties of its grammar than by the orthographic transcription of its phonology. This implies that access ..
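The context-model idea in the snippet above (predict the next symbol from the few symbols before it) can be illustrated with a minimal sketch. This is not the paper's tag-phrase dictionary method, only a generic order-k character model. The function name `code_length_bits`, the default order of 2, the 256-symbol alphabet, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def code_length_bits(text: str, order: int = 2, alphabet_size: int = 256) -> float:
    """Ideal adaptive code length of `text`, in bits, under an order-`order` context model.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Maps each context string to counts of the symbols seen right after it.
    counts = defaultdict(Counter)
    total_bits = 0.0
    for i, symbol in enumerate(text):
        # The context is the previous `order` symbols (shorter at the start).
        context = text[max(0, i - order):i]
        seen = counts[context]
        # Add-one smoothing over the alphabet: an assumption of this sketch,
        # standing in for the escape mechanisms real compressors use.
        p = (seen[symbol] + 1) / (sum(seen.values()) + alphabet_size)
        total_bits += -math.log2(p)
        # Update the model only after coding, as a decoder would.
        seen[symbol] += 1
    return total_bits
```

Every new context starts with empty counts, so its first few symbols are coded at close to the uniform cost. That is one way to see the snippet's point that each new word or phrase introduces new contexts that can hurt the compression rate.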

CiteSeerX