thesis

Discovering Pattern Using Automata

Abstract

Most documents written by humans are not just words, sentences, and paragraphs combined at random. It is believed that a pattern hidden in the text reflects the author's style of writing. As in previous work, this thesis assumes that this belief holds and tries to discover and represent the pattern with automata. We used the Alergia algorithm to form an automaton from a prefix tree acceptor. Testing verified that the Alergia algorithm was correctly implemented in our software. Our tests showed that we captured only the patterns of collections of single sentences in a book, which is not the full content of a book. Therefore, establishing variable chopping units or a less forceful chopping approach would be a promising direction.
