960 research outputs found

    Active Discriminative Text Representation Learning

    Full text link
    We propose a new active learning (AL) method for text classification with convolutional neural networks (CNNs). In AL, one selects the instances to be manually labeled with the aim of maximizing model performance with minimal effort. Neural models capitalize on word embeddings as representations (features), tuning these to the task at hand. We argue that AL strategies for multi-layered neural models should focus on selecting instances that most affect the embedding space (i.e., induce discriminative word representations). This is in contrast to traditional AL approaches (e.g., entropy-based uncertainty sampling), which specify higher level objectives. We propose a simple approach for sentence classification that selects instances containing words whose embeddings are likely to be updated with the greatest magnitude, thereby rapidly learning discriminative, task-specific embeddings. We extend this approach to document classification by jointly considering: (1) the expected changes to the constituent word representations; and (2) the model's current overall uncertainty regarding the instance. The relative emphasis placed on these criteria is governed by a stochastic process that favors selecting instances likely to improve representations at the outset of learning, and then shifts toward general uncertainty sampling as AL progresses. Empirical results show that our method outperforms baseline AL approaches on both sentence and document classification tasks. We also show that, as expected, the method quickly learns discriminative word embeddings. To the best of our knowledge, this is the first work on AL addressing neural models for text classification.Comment: This paper got accepted by AAAI 201

    Using Word Embeddings in Twitter Election Classification

    Get PDF
    Word embeddings and convolutional neural networks (CNN) have attracted extensive attention in various classification tasks for Twitter, e.g. sentiment classification. However, the effect of the configuration used to train and generate the word embeddings on the classification performance has not been studied in the existing literature. In this paper, using a Twitter election classification task that aims to detect election-related tweets, we investigate the impact of the background dataset used to train the embedding models, the context window size and the dimensionality of word embeddings on the classification performance. By comparing the classification results of two word embedding models, which are trained using different background corpora (e.g. Wikipedia articles and Twitter microposts), we show that the background data type should align with the Twitter classification dataset to achieve a better performance. Moreover, by evaluating the results of word embeddings models trained using various context window sizes and dimensionalities, we found that large context window and dimension sizes are preferable to improve the performance. Our experimental results also show that using word embeddings and CNN leads to statistically significant improvements over various baselines such as random, SVM with TF-IDF and SVM with word embeddings

    Adaptive Semi-supervised Learning for Cross-domain Sentiment Classification

    Full text link
    We consider the cross-domain sentiment classification problem, where a sentiment classifier is to be learned from a source domain and to be generalized to a target domain. Our approach explicitly minimizes the distance between the source and the target instances in an embedded feature space. With the difference between source and target minimized, we then exploit additional information from the target domain by consolidating the idea of semi-supervised learning, for which, we jointly employ two regularizations -- entropy minimization and self-ensemble bootstrapping -- to incorporate the unlabeled target data for classifier refinement. Our experimental results demonstrate that the proposed approach can better leverage unlabeled data from the target domain and achieve substantial improvements over baseline methods in various experimental settings.Comment: Accepted to EMNLP201

    SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment

    Full text link
    Because word semantics can substantially change across communities and contexts, capturing domain-specific word semantics is an important challenge. Here, we propose SEMAXIS, a simple yet powerful framework to characterize word semantics using many semantic axes in word- vector spaces beyond sentiment. We demonstrate that SEMAXIS can capture nuanced semantic representations in multiple online communities. We also show that, when the sentiment axis is examined, SEMAXIS outperforms the state-of-the-art approaches in building domain-specific sentiment lexicons.Comment: Accepted in ACL 2018 as a full pape
    • …
    corecore