Natural Language Processing (NLP) has been an intensively studied area of Artificial Intelligence for decades. Many powerful techniques are now available that improve a computer's ability to process human language, especially in text form. These techniques are commonly applied in machine translation, search engines, query systems, and information retrieval systems. However, in some application areas there is still great potential to advance NLP through machine learning techniques. This report addresses a topic that has emerged in recent years: extracting times and events from news articles. Following the TimeML annotation specification, I developed a Java artifact that is trained and evaluated on tagged news articles from the TimeBank corpus. Extraction is performed either by a 'Dictionary Lookup' baseline algorithm or by variants of the Naive Bayes algorithm. A comparison between these is presented in the experiment section. The results show that, by applying tokenization, part-of-speech tagging, self-prediction using TimeML tagging, and other techniques, one variant of the Naive Bayes algorithm performs significantly better in this classification task.