63,395 research outputs found

    Research On Text Classification Based On Deep Neural Network

    Text classification is one of the classic tasks in natural language processing. The goal is to identify the category to which a text belongs. Text classification is widely used in email detection, sentiment analysis, topic labeling and other fields. Good text representation is the key to improving performance on natural language processing tasks such as text classification. Traditional text representation adopts the bag-of-words model or the vector space model, which not only loses the context information of the text but also suffers from high dimensionality and high sparsity. In recent years, with the growth of data and improvements in computing performance, the use of deep learning to represent and classify text has attracted great attention. Convolutional neural networks, recurrent neural networks, and recurrent neural networks with attention mechanisms have been used to represent text for classification and other natural language processing tasks, and all of them perform better than the traditional methods. In this paper, we design two sentence-level text representation and classification models based on deep networks. The details are as follows: (1) BRCNN, a text representation and classification model based on bidirectional recurrent and convolutional neural networks. BRCNN's input is the word vector corresponding to each word in the sentence. A recurrent neural network first extracts word order information from the sentence, and a convolutional neural network then extracts higher-level sentence features. After convolution, a max-pooling operation produces the sentence vector. Finally, a softmax classifier performs the classification. The recurrent neural network captures the word order information in sentences, while the convolutional neural network extracts useful features. Experiments on eight text classification tasks show that BRCNN obtains better text feature representations, and its classification accuracy is equal to or higher than that of prior work. (2) ACNN, a text representation and classification model based on an attention mechanism and a convolutional neural network. ACNN uses a recurrent neural network with an attention mechanism to obtain the context vector; a convolutional neural network then extracts higher-level feature information. A max-pooling operation produces the sentence vector, and a softmax classifier classifies the text. Experiments on eight text classification benchmark data sets show that ACNN improves the stability of model convergence and converges to an optimal or locally optimal solution more reliably than BRCNN.
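The BRCNN pipeline described in this abstract (word vectors, bidirectional recurrent pass, convolution, max pooling, softmax) can be sketched as a forward pass in plain Python. This is only an illustration under stated assumptions: the abstract does not specify the recurrent cell, the activation functions, the filter width, or any dimensions, so the tanh cell, the ReLU, the window of 2, and the random untrained weights below are all hypothetical choices, as are the function and parameter names.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def rnn_states(inputs, W_in, W_rec):
    """Plain tanh recurrent pass (assumed cell); returns one hidden state per word."""
    h = [0.0] * len(W_rec)
    states = []
    for x in inputs:
        h = [math.tanh(a + b) for a, b in zip(matvec(W_in, x), matvec(W_rec, h))]
        states.append(h)
    return states

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def init_params(emb_dim, hidden, filters, classes, window, seed=0):
    rng = random.Random(seed)
    return {
        "window": window,
        "W_in_f": rand_matrix(hidden, emb_dim, rng),
        "W_rec_f": rand_matrix(hidden, hidden, rng),
        "W_in_b": rand_matrix(hidden, emb_dim, rng),
        "W_rec_b": rand_matrix(hidden, hidden, rng),
        "W_conv": rand_matrix(filters, window * 2 * hidden, rng),
        "W_out": rand_matrix(classes, filters, rng),
    }

def brcnn_forward(word_vectors, p):
    # 1. Bidirectional recurrent pass: read the sentence in both directions
    #    and concatenate the two hidden states at each position.
    fwd = rnn_states(word_vectors, p["W_in_f"], p["W_rec_f"])
    bwd = rnn_states(word_vectors[::-1], p["W_in_b"], p["W_rec_b"])[::-1]
    states = [f + b for f, b in zip(fwd, bwd)]
    # 2. Convolution: apply every filter to each window of consecutive states.
    k = p["window"]
    feature_maps = []
    for t in range(len(states) - k + 1):
        patch = [v for s in states[t:t + k] for v in s]
        feature_maps.append([max(0.0, v) for v in matvec(p["W_conv"], patch)])
    # 3. Max pooling over positions gives a fixed-size sentence vector.
    sentence_vector = [max(column) for column in zip(*feature_maps)]
    # 4. Softmax classifier over the sentence vector.
    return softmax(matvec(p["W_out"], sentence_vector))

# Toy input: a 6-word sentence with 4-dimensional word vectors.
rng = random.Random(1)
sentence = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
params = init_params(emb_dim=4, hidden=3, filters=5, classes=2, window=2)
probs = brcnn_forward(sentence, params)
print(probs)
```

Because the weights are random, the printed class probabilities carry no meaning; the sketch only shows how the shapes flow from a variable-length sentence to a fixed-size vector and then to a class distribution.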

    Short text evaluation with neural network

    The aim of this paper is to present a technique that uses machine learning to process short text answers written in Hungarian. The processing is based on a convolutional neural network, a special type of neural network that can process short text answers efficiently. To achieve precise classification in training and recall, grammatically consistent answers and a conversion of the text into network input are essential. To convert the input, continuous bag-of-words and skip-gram models will be used, resulting in a model that can evaluate short text answers in Hungarian.
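The two input-conversion models named in this abstract differ in what they predict: continuous bag-of-words predicts a word from its neighbours, while skip-gram predicts the neighbours from the word. A minimal sketch of how the training examples are formed, assuming a symmetric context window; the window size, the function names, and the Hungarian sample sentence are illustrative and not taken from the paper.

```python
def skipgram_pairs(tokens, window=2):
    """Skip-gram: each (center, context) pair is one training example."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_examples(tokens, window=2):
    """CBOW: the surrounding words jointly predict the word in the middle."""
    examples = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        if context:
            examples.append((context, target))
    return examples

# Illustrative sentence: "water boils at a hundred degrees".
tokens = "a víz száz fokon forr".split()
print(skipgram_pairs(tokens, window=1))
print(cbow_examples(tokens, window=1))
```

Either set of examples would then be used to train word vectors, which in turn become the input of the convolutional network.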

    Text Classification Based on Neural Network Fusion

    The goal of text classification is to identify the category to which a text belongs. Text classification is widely used in email detection, sentiment analysis, topic labeling and other fields. Good text representation is the key to improving performance on NLP tasks. Traditional text representation adopts the bag-of-words model or the vector space model, which loses the context information of the text and suffers from high dimensionality and high sparsity. In recent years, with the growth of data and improvements in computing performance, the use of deep learning to represent and classify text has attracted great attention. Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN) and RNNs with attention mechanisms have been used to represent text for classification and other NLP tasks, and all of them perform better than the traditional methods. In this paper, we design two sentence-level models based on deep networks; the details are as follows: (1) A text representation and classification model based on a bidirectional RNN and a CNN (BRCNN). BRCNN’s input is the word vector corresponding to each word in the sentence; an RNN first extracts word order information from the sentence, and a CNN then extracts higher-level sentence features. After convolution, a max-pooling operation produces the sentence vector. Finally, a softmax classifier performs the classification. The RNN captures the word order information in sentences, while the CNN extracts useful features. Experiments on eight text classification tasks show that BRCNN obtains better text feature representations, and its classification accuracy is equal to or higher than that of prior work. (2) A model based on an attention mechanism and a CNN (ACNN). ACNN uses an RNN with an attention mechanism to obtain the context vector; a CNN then extracts higher-level feature information. A max-pooling operation produces the sentence vector, and a softmax classifier classifies the text. Experiments on eight text classification benchmark data sets show that ACNN improves the stability of model convergence and converges to an optimal or locally optimal solution more reliably than BRCNN.
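The attention step that ACNN uses to obtain a context vector can be illustrated with a small sketch. The abstract does not say which attention variant is used, so dot-product scoring against a single query vector is an assumption here, and the names `attention_context`, `hidden_states` and `query` are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_context(hidden_states, query):
    """Weight each RNN hidden state by its similarity to the query, then sum."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, hidden_states))
               for i in range(dim)]
    return weights, context

# Three toy hidden states (one per word) and a query (assumed learned in practice).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
weights, context = attention_context(hidden_states, query)
print(weights)
print(context)
```

The first and third states match the query equally well, so they receive the same weight and dominate the context vector; the second state contributes less.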

    Interpretable Architectures and Algorithms for Natural Language Processing

    Paper V is excluded from the dissertation for copyright reasons. This thesis has two parts: first, we introduce human-level interpretable models using the Tsetlin Machine (TM) for NLP tasks; second, we present an interpretable model using DNNs. The first part combines several TM-based architectures for various NLP tasks and examines their robustness. We use this model to propose logic-based text classification. We start with basic Word Sense Disambiguation (WSD), where we employ TM to design novel interpretation techniques using the frequency of words in the clause. We then tackle a new problem in NLP, aspect-based text classification, using novel feature engineering for TM. Since TM operates on Boolean features, it relies on Bag-of-Words (BOW), making it difficult to use pre-trained word embeddings such as GloVe, word2vec, and fastText. Hence, we designed a GloVe-embedded TM to significantly enhance the model’s performance. In addition, NLP models are sensitive to distribution bias because of spurious correlations. Hence, we employ TM to design text classification that is robust against spurious correlations. The second part of the thesis presents an interpretable DNN-based model, for which we design a simple solution to a complex position-dependent NLP task. Since TM’s interpretability comes at the cost of performance, we propose a DNN-based architecture that applies a masking scheme to LSTM/GRU-based models and eases human interpretation through the attention mechanism. Finally, we take advantage of both models and design an ensemble model that integrates TM’s interpretable information into the DNN for better visualization of attention weights. Our proposed model can be efficiently integrated to give a fully explainable model for NLP that supports trustworthy AI. Overall, our model shows excellent results and interpretation on several open-source NLP datasets. Thus, we believe that by combining the novel interpretation of TM, the masking technique in the neural network, and the integrated ensemble model, we can build a simple yet effective platform for explainable NLP applications wherever necessary.
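To make concrete why a Tsetlin Machine needs Boolean bag-of-words input and why its decisions are readable, the sketch below evaluates a handful of clauses, each a conjunction of "word present" and "word absent" conditions that casts a positive or negative vote. The clauses, the vocabulary, and the function names are hand-written for illustration; a real TM learns which literals each clause includes, and nothing here is taken from the thesis.

```python
def bow_features(tokens, vocabulary):
    """Boolean bag-of-words: True where the vocabulary word occurs in the text."""
    present = set(tokens)
    return [word in present for word in vocabulary]

def clause_fires(features, must_have, must_lack):
    """A clause is an AND over included literals (features and their negations)."""
    return (all(features[i] for i in must_have)
            and all(not features[i] for i in must_lack))

def vote_sum(features, clauses):
    """Sum the polarities (+1 or -1) of all clauses that fire."""
    return sum(polarity for must_have, must_lack, polarity in clauses
               if clause_fires(features, must_have, must_lack))

vocabulary = ["good", "bad", "not", "battery"]
# Hypothetical clauses: (indices that must be present, indices that must be absent, vote).
clauses = [
    ([0], [2], +1),     # "good" AND NOT "not"  -> positive
    ([1, 2], [], +1),   # "bad" AND "not"       -> positive
    ([1], [2], -1),     # "bad" AND NOT "not"   -> negative
    ([0, 2], [], -1),   # "good" AND "not"      -> negative
]

positive_review = bow_features("the battery is good".split(), vocabulary)
negative_review = bow_features("the battery is not good".split(), vocabulary)
print(vote_sum(positive_review, clauses), vote_sum(negative_review, clauses))
```

Each fired clause can be read directly as a logical rule over words, which is the kind of interpretability the abstract refers to.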

    Prompt-Based Zero- and Few-Shot Node Classification: A Multimodal Approach

    Multimodal data empowers machine learning models to better understand the world from various perspectives. In this work, we study the combination of text and graph modalities, a challenging but understudied combination which is prevalent across multiple settings including citation networks, social media, and the web. We focus on the popular task of node classification using limited labels; in particular, under the zero- and few-shot scenarios. In contrast to the standard pipeline which feeds standard precomputed (e.g., bag-of-words) text features into a graph neural network, we propose Text-And-Graph (TAG) learning, a more deeply multimodal approach that integrates the raw texts and graph topology into the model design, and can effectively learn from limited supervised signals without any meta-learning procedure. TAG is a two-stage model with (1) a prompt- and graph-based module which generates prior logits that can be directly used for zero-shot node classification, and (2) a trainable module that further calibrates these prior logits in a few-shot manner. Experiments on two node classification datasets show that TAG outperforms all the baselines by a large margin in both zero- and few-shot settings. Comment: Work in progress

    J Biomed Inform

    Syndromic surveillance detects and monitors individual and population health indicators through sources such as emergency department records. Automated classification of these records can improve outbreak detection speed and diagnosis accuracy. Current syndromic systems rely on hand-coded keyword-based methods to parse written fields and may benefit from the use of modern supervised-learning classifier models. In this paper, we implement two recurrent neural network models based on long short-term memory (LSTM) and gated recurrent unit (GRU) cells and compare them to two traditional bag-of-words classifiers: multinomial naïve Bayes (MNB) and a support vector machine (SVM). The MNB classifier is one of only two machine learning algorithms currently being used for syndromic surveillance. All four models are trained to predict diagnostic code groups as defined by Clinical Classification Software, first from discharge diagnosis fields and then from chief complaint fields. The classifiers are trained on 3.6 million de-identified emergency department records from a single United States jurisdiction. We compare the performance of these models primarily using the F1 score, and we measure absolute model performance to determine which conditions are the most amenable to surveillance based on chief complaint alone. Using discharge diagnoses, the LSTM classifier performs best, though all models exhibit an F1 score above 96.00. Using chief complaints, the GRU performs best (F1 = 47.38), and MNB with bigrams performs worst (F1 = 39.40). We also note that certain syndrome types are easier to detect than others. For example, the GRU model using chief complaints predicts alcohol-related disorders well (F1 = 78.91) but predicts influenza poorly (F1 = 14.80). In all instances, the RNN models outperformed the bag-of-words classifiers, suggesting that deep learning models could substantially improve the automatic classification of unstructured text for syndromic surveillance. CC999999/ImCDC/Intramural CDC HHS/United States
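Since the comparison in this abstract rests on the F1 score, a short sketch of how a per-class F1 is computed from precision and recall may help. The labels and predictions below are invented toy data, and the abstract does not state how scores are averaged across classes or scaled (its values appear to be on a 0 to 100 scale), so this shows only the basic single-class definition on a 0 to 1 scale.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one class: the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented toy labels: true code groups versus a classifier's predictions.
y_true = ["flu", "flu", "alcohol", "flu", "alcohol"]
y_pred = ["flu", "alcohol", "alcohol", "alcohol", "alcohol"]

print(f1_score(y_true, y_pred, "flu"))      # precision 1, recall 1/3
print(f1_score(y_true, y_pred, "alcohol"))  # precision 1/2, recall 1
```

A class that is rarely predicted correctly, as influenza is in the reported chief-complaint results, is penalised through low recall even when its precision is high.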

    Automated COVID-19 Dialogue System Using a New Deep Learning Network

    During the coronavirus disease 2019 (COVID-19) pandemic, it became necessary to apply social distancing measures and to seek alternatives to physical contact because of the spread of viral infection. Interest in task-oriented dialogue systems that use natural language in the dialogue between patients and doctors has grown remarkably in healthcare. However, the doctor’s advice is implicit and unclear in most conversations, and the patient may be nervous when describing symptoms or may have difficulty describing them. Therefore, the patient’s description of symptoms is often insufficient for a diagnosis. This study aims to provide suitable medical advice based on the patients’ symptoms during the conversation between doctors and patients by proposing a new deep learning method for automated medical dialogue systems. The model is based on an encoder and two stages of learning to make reliable decisions. The encoder extracts important words using text normalization, resulting in two vectors: symptom vectors and doctor utterance vectors. The symptom vectors are represented as a weighted bag-of-words feature. The first stage clusters the patients’ utterances by applying a Hopfield network while considering semantic similarity, whereas the second stage uses clustering to extract an implicit label that serves as a template of advice. Additionally, the external evaluation model applies a feedforward neural network classification algorithm using the labels obtained in the second stage. The CovidDialog-English dataset is used to evaluate the model. The experimental results indicate the high performance of the feedforward neural network, with an F1-score of 0.972, and present a comparison of three clusters using k-nearest neighbours and naïve Bayes-based models.