3,555 research outputs found

    Convolution-based neural attention with applications to sentiment classification

    Get PDF
    Neural attention mechanism has achieved many successes in various tasks in natural language processing. However, existing neural attention models based on a densely connected network are loosely related to the attention mechanism found in psychology and neuroscience. Motivated by the finding in neuroscience that human possesses the template-searching attention mechanism, we propose to use convolution operation to simulate attentions and give a mathematical explanation of our neural attention model. We then introduce a new network architecture, which combines a recurrent neural network with our convolution-based attention model and further stacks an attention-based neural model to build a hierarchical sentiment classification model. The experimental results show that our proposed models can capture salient parts of the text to improve the performance of sentiment classification at both the sentence level and the document level

    Classification of Radiology Reports Using Neural Attention Models

    Full text link
    The electronic health record (EHR) contains a large amount of multi-dimensional and unstructured clinical data of significant operational and research value. Distinguished from previous studies, our approach embraces a double-annotated dataset and strays away from obscure "black-box" models to comprehensive deep learning models. In this paper, we present a novel neural attention mechanism that not only classifies clinically important findings. Specifically, convolutional neural networks (CNN) with attention analysis are used to classify radiology head computed tomography reports based on five categories that radiologists would account for in assessing acute and communicable findings in daily practice. The experiments show that our CNN attention models outperform non-neural models, especially when trained on a larger dataset. Our attention analysis demonstrates the intuition behind the classifier's decision by generating a heatmap that highlights attended terms used by the CNN model; this is valuable when potential downstream medical decisions are to be performed by human experts or the classifier information is to be used in cohort construction such as for epidemiological studies

    Active Discriminative Text Representation Learning

    Full text link
    We propose a new active learning (AL) method for text classification with convolutional neural networks (CNNs). In AL, one selects the instances to be manually labeled with the aim of maximizing model performance with minimal effort. Neural models capitalize on word embeddings as representations (features), tuning these to the task at hand. We argue that AL strategies for multi-layered neural models should focus on selecting instances that most affect the embedding space (i.e., induce discriminative word representations). This is in contrast to traditional AL approaches (e.g., entropy-based uncertainty sampling), which specify higher level objectives. We propose a simple approach for sentence classification that selects instances containing words whose embeddings are likely to be updated with the greatest magnitude, thereby rapidly learning discriminative, task-specific embeddings. We extend this approach to document classification by jointly considering: (1) the expected changes to the constituent word representations; and (2) the model's current overall uncertainty regarding the instance. The relative emphasis placed on these criteria is governed by a stochastic process that favors selecting instances likely to improve representations at the outset of learning, and then shifts toward general uncertainty sampling as AL progresses. Empirical results show that our method outperforms baseline AL approaches on both sentence and document classification tasks. We also show that, as expected, the method quickly learns discriminative word embeddings. To the best of our knowledge, this is the first work on AL addressing neural models for text classification.Comment: This paper got accepted by AAAI 201

    Combination of Domain Knowledge and Deep Learning for Sentiment Analysis of Short and Informal Messages on Social Media

    Full text link
    Sentiment analysis has been emerging recently as one of the major natural language processing (NLP) tasks in many applications. Especially, as social media channels (e.g. social networks or forums) have become significant sources for brands to observe user opinions about their products, this task is thus increasingly crucial. However, when applied with real data obtained from social media, we notice that there is a high volume of short and informal messages posted by users on those channels. This kind of data makes the existing works suffer from many difficulties to handle, especially ones using deep learning approaches. In this paper, we propose an approach to handle this problem. This work is extended from our previous work, in which we proposed to combine the typical deep learning technique of Convolutional Neural Networks with domain knowledge. The combination is used for acquiring additional training data augmentation and a more reasonable loss function. In this work, we further improve our architecture by various substantial enhancements, including negation-based data augmentation, transfer learning for word embeddings, the combination of word-level embeddings and character-level embeddings, and using multitask learning technique for attaching domain knowledge rules in the learning process. Those enhancements, specifically aiming to handle short and informal messages, help us to enjoy significant improvement in performance once experimenting on real datasets.Comment: A Preprint of an article accepted for publication by Inderscience in IJCVR on September 201
    • …
    corecore