2 research outputs found

    Convolutional neural network-based model for web-based text classification

    There is an increasing amount of text data available on the web with multiple topical granularities; this necessitates proper categorization/classification of text to facilitate obtaining useful information as per the needs of users. Some traditional approaches such as bag-of-words and bag-of-ngrams models provide good results for text classification. However, texts currently available on the web contain high event-related granularity on different topics at different levels, which may adversely affect the accuracy of traditional approaches. With the advent of deep learning models, which already provide good accuracy in the fields of image processing and speech recognition, the problems inherent in traditional text classification models can be overcome. Currently, several deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks, are widely used for various text-related tasks; among them, the CNN model is popular because it is simple to use and has high accuracy for text classification. In this study, classification of random texts on the web into categories is attempted using a CNN-based model by changing the hyperparameters and the sequence of text vectors. We attempt to tune every hyperparameter that is unique to the classification task, along with the sequences of word vectors, to obtain the desired accuracy; the accuracy is found to be in the range of 85–92%. This model can be considered reliable and applied to solve real-world problems or extract useful information for various text mining applications.

    Comparing the Effectiveness of Support Vector Machines and Convolutional Neural Networks for Determining User Intent in Conversational Agents

    Over the last fifty years, conversational agent systems have evolved in their ability to understand natural language input. In recent years, Natural Language Processing (NLP) and Machine Learning (ML) have allowed computer systems to make great strides in the area of natural language understanding. However, little research has been carried out in these areas within the context of conversational systems. This paper identifies Convolutional Neural Network (CNN) and Support Vector Machine (SVM) as the two ML algorithms with the best record of performance in existing NLP literature, with CNN indicated as generating the better results of the two. A comprehensive experiment is defined where the results of SVM models utilising several kernels are compared to the results of a selection of CNN models. To contextualise the experiment to conversational agents, a dataset based on conversational interactions is used. A state-of-the-art NLP pipeline is also created to work with both algorithms in the context of the agent dataset. By conducting a detailed statistical analysis of the results, this paper proposes to provide an extensive indicator as to which algorithm offers better performance for agent-based systems. Ultimately, the experimental results indicate that CNN models do not necessarily generate better results than SVM models. In fact, the SVM model utilising a Radial Basis Function kernel generates statistically better results than all other models considered under these experimental conditions.