
Recognizing Text Genres with Simple Metrics Using Discriminant Analysis

Abstract

A simple method for categorizing texts into predetermined text genre categories using the standard statistical technique of discriminant analysis is demonstrated with application to the Brown corpus. Discriminant analysis makes it possible to use a large number of parameters that may be specific to a certain corpus or information stream, and to combine them into a small number of functions, with the parameters weighted on the basis of how useful they are for discriminating text genres. An application to information retrieval is discussed.

Comment: 6 pages, LaTeX, In proceedings of COLING 9
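As a rough illustration of how discriminant analysis combines several text parameters into one weighted function, the sketch below fits a two-class Fisher linear discriminant in plain Python. It is not the paper's implementation. The two features (average word length, first-person pronouns per 1000 words), the genre labels, the toy values, and the function names `fit_fisher` and `classify` are all assumptions made for the example; none of the numbers come from the Brown corpus.

```python
# Two-class Fisher linear discriminant, pure Python, two features per text.
# All values below are invented toy numbers, NOT Brown corpus measurements.

def mean(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def scatter(rows, mu):
    """2x2 within-class scatter matrix around the class mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_fisher(class_a, class_b):
    """Return (weights, threshold) for the direction w = Sw^-1 (mu_a - mu_b)."""
    mu_a, mu_b = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    # Pooled within-class scatter.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    # The weights express how useful each parameter is for separating the classes.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Decision threshold: projection of the midpoint between the class means.
    mid = [(mu_a[k] + mu_b[k]) / 2 for k in range(2)]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold

def classify(x, w, threshold, label_a="informative", label_b="imaginative"):
    """Project x onto the discriminant function and compare with the threshold."""
    score = w[0] * x[0] + w[1] * x[1]
    return label_a if score > threshold else label_b

# Hypothetical features: [average word length, first-person pronouns per 1000 words]
informative = [[5.1, 2.0], [5.3, 1.0], [4.9, 3.0], [5.2, 2.5]]
imaginative = [[4.2, 20.0], [4.0, 25.0], [4.4, 18.0], [4.1, 22.0]]

w, threshold = fit_fisher(informative, imaginative)
print(classify([5.0, 2.0], w, threshold))
print(classify([4.1, 23.0], w, threshold))
```

With more than two genres, the same idea yields several discriminant functions rather than one, which corresponds to the "small number of functions" mentioned in the abstract.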
