An artificial immune system based on information theory for keyword extraction from text documents

Abstract

This paper presents a model for keyword extraction that extends the basic concepts commonly used in this task, providing a formal foundation for determining the importance of each keyword to the documents. The proposed model combines an artificial immune system with a mathematical foundation based on information theory. Its advantage is that it requires no domain knowledge, no stopword list, and no prior information about the content of the documents. The final result is a set of keywords for each category in the corpus used.
