Automated Context Detection in Text-Based Communication Media
Abstract
Communication is an important part of everyday learning. Conversations may take place face-to-face or in virtual spaces such as e-mail exchanges, newsgroups, or live chats. Computer-mediated communication is usually recorded in log files. This thesis presents a method that detects topics in those log files and extracts the important sections of the communication. This can be used to remove private parts of a conversation and make the relevant parts available to a wider audience. A similar problem arises in the analysis of message threads, where a single document may contain multiple topics. Methods from speech recognition as well as text segmentation, clustering, and categorisation were evaluated and used to create a new method that splits such documents into parts and identifies the topic of each part using an ontology. The presented method currently targets English-language texts, but it can be extended to support other languages.