Analysis and study on text representation to improve the accuracy of the Normalized Compression Distance
The huge amount of information stored in text form makes methods for processing
text especially valuable. This thesis focuses on analyzing texts using
compression distances. More specifically, it takes a small step towards
understanding both the nature of texts and the nature of compression distances.
Broadly speaking, it does so by exploring the effects that several distortion
techniques have on one of the most successful distances in the family of
compression distances, the Normalized Compression Distance (NCD).
Comment: PhD Thesis; 202 pages
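
For readers unfamiliar with the NCD, the following is a minimal sketch of its standard definition, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the length of the compressed input. The choice of zlib as the compressor and the helper names `compressed_size` and `ncd` are assumptions made for illustration; the abstract does not say which compressors or which distortion techniques the thesis uses.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (an assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings.

    Values near 0 suggest the inputs share most of their information;
    values near 1 suggest they share little.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation of both inputs
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    text_a = b"the quick brown fox jumps over the lazy dog " * 20
    text_b = bytes(range(256)) * 4
    print("NCD(a, a) =", ncd(text_a, text_a))
    print("NCD(a, b) =", ncd(text_a, text_b))
```

With a real compressor the distance of a text to itself is small but usually not exactly zero, because compressing the concatenation still costs a few extra bytes.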