Research on Text Classification Based on Automatically Extracted Keywords

Abstract

Automatic keywords extraction and classification tasks are important research directions in the domains of NLP (natural language processing), information retrieval, and text mining. As the fine granularity abstracted from text data, keywords are also the most important feature of text data, which has great practical and potential value in document classification, topic modeling, information retrieval, and other aspects. The compact representation of documents can be achieved through keywords, which contains massive significant information. Therefore, it may be quite advantageous to realize text classification with high-dimensional feature space. For this reason, this study designed a supervised keyword classification method based on TextRank keyword automatic extraction technology and optimize the model with the genetic algorithm to contribute to modeling the keywords of the topic for text classification

    Similar works