A Neural Network Approach for Text Document Classification and Semantic Text Analytics

Abstract

Vast amounts of data are produced daily, and most of it is unstructured. Organizing this data into categories is therefore necessary before meaningful knowledge can be derived from such large volumes. The proposed methodology consists of a feature selection component followed by a neural network classifier. The neural network is trained on a large variety of text documents so that it can correctly predict the type of a document presented as input. A machine learning algorithm is designed to select the terms that serve as the basis for differentiating between topic categories. The algorithm also analyses synonyms so that redundant information is kept under the same label.
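
The pipeline described above (synonym folding, term selection, then a neural network classifier) can be illustrated with a minimal sketch. The abstract does not specify the selection criterion or the network architecture, so everything here is an assumption made for illustration: the `SYNONYMS` table is hypothetical, terms are ranked by how much their document frequency differs between classes, and a single-layer softmax network stands in for the classifier. The toy documents and labels are invented examples, not data from the paper.

```python
import math
from collections import Counter

# Hypothetical synonym table (assumption): variants map to one shared label.
SYNONYMS = {"automobile": "car", "vehicle": "car", "soccer": "football"}

def tokenize(text):
    """Lowercase, split on whitespace, and fold synonyms into one label."""
    return [SYNONYMS.get(tok, tok) for tok in text.lower().split()]

def select_terms(docs, labels, k):
    """Keep the k terms whose per-class document frequency differs most.
    This criterion is an assumption, not the paper's stated method."""
    classes = sorted(set(labels))
    df = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        df[label].update(set(tokenize(doc)))
    n = Counter(labels)
    vocab = set().union(*df.values())

    def spread(term):
        rates = [df[c][term] / n[c] for c in classes]
        return max(rates) - min(rates)

    # Ties are broken alphabetically so the result is deterministic.
    return sorted(vocab, key=lambda t: (-spread(t), t))[:k]

def vectorize(doc, terms):
    """Binary presence vector over the selected terms, plus a bias input."""
    toks = set(tokenize(doc))
    return [1.0 if t in toks else 0.0 for t in terms] + [1.0]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def train(docs, labels, terms, epochs=200, lr=0.5):
    """Single-layer softmax network trained by per-example gradient descent."""
    classes = sorted(set(labels))
    w = [[0.0] * (len(terms) + 1) for _ in classes]
    for _ in range(epochs):
        for doc, label in zip(docs, labels):
            x = vectorize(doc, terms)
            probs = softmax([sum(a * b for a, b in zip(row, x)) for row in w])
            for c, row in enumerate(w):
                err = probs[c] - (1.0 if classes[c] == label else 0.0)
                for j, xj in enumerate(x):
                    row[j] -= lr * err * xj
    return classes, w

def predict(doc, terms, model):
    """Return the class with the highest score for the document."""
    classes, w = model
    x = vectorize(doc, terms)
    scores = [sum(a * b for a, b in zip(row, x)) for row in w]
    return classes[scores.index(max(scores))]

# Toy training set (invented for illustration).
docs = [
    "the car engine needs repair",
    "automobile dealers sell a new vehicle",
    "the football match ended in a draw",
    "soccer fans cheered the goal",
]
labels = ["autos", "autos", "sports", "sports"]

terms = select_terms(docs, labels, k=2)
model = train(docs, labels, terms)
print(terms)
print(predict("a vehicle broke down", terms, model))
print(predict("soccer season starts", terms, model))
```

Because "automobile" and "vehicle" are folded into "car" before selection, a single feature covers all three words, which is the effect the abstract describes as keeping redundant information under the same label. A practical system would likely draw synonyms from a lexical resource and use a deeper network; neither choice is stated in the abstract.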
