Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces
We combine multi-task learning and semi-supervised learning by inducing a
joint embedding space between disparate label spaces and learning transfer
functions between label embeddings, enabling us to jointly leverage unlabelled
data and auxiliary, annotated datasets. We evaluate our approach on a variety
of sequence classification tasks with disparate label spaces. We outperform
strong single and multi-task baselines and achieve a new state-of-the-art for
topic-based sentiment analysis. Comment: To appear at NAACL 2018 (long paper)
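The transfer functions between label embeddings described above can be sketched roughly as follows; the shapes, the similarity-based scoring, and all names here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def transfer_label_distribution(probs_a, label_emb_a, label_emb_b):
    """Illustrative sketch: map a probability distribution over task A's
    labels into task B's label space via a shared label-embedding space.
    probs_a:     (Ka,)   distribution over A's labels
    label_emb_a: (Ka, d) embeddings of A's labels
    label_emb_b: (Kb, d) embeddings of B's labels"""
    expected = probs_a @ label_emb_a      # expected label embedding, shape (d,)
    scores = label_emb_b @ expected       # similarity to each of B's labels, (Kb,)
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()                # softmax over B's labels

# Toy example: task A has 2 labels, task B has 3, shared 2-d label space.
emb_a = np.array([[1.0, 0.0], [0.0, 1.0]])
emb_b = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
probs_b = transfer_label_distribution(np.array([1.0, 0.0]), emb_a, emb_b)
```

In a trained system the label embeddings and the map between spaces would be learned jointly with the sequence encoder; here the embeddings are fixed toy values.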
Automatically extracting polarity-bearing topics for cross-domain sentiment classification
The joint sentiment-topic (JST) model was previously proposed to detect sentiment and topics simultaneously from text. The only supervision required to learn a JST model is a set of domain-independent polarity word priors. In this paper, we modify the JST model to incorporate word polarity priors by modifying the topic-word Dirichlet priors. We study the polarity-bearing topics extracted by JST and show that, by augmenting the original feature space with polarity-bearing topics, in-domain supervised classifiers learned from the augmented feature representation achieve state-of-the-art accuracy of 95% on the movie review data and an average of 90% on the multi-domain sentiment dataset. Furthermore, using feature augmentation and selection according to the information gain criterion for cross-domain sentiment classification, our proposed approach performs better than or comparably to previous approaches, while being much simpler and requiring no difficult parameter tuning.
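The feature-augmentation step can be illustrated with a minimal sketch; the function name and toy numbers are ours, and in the actual pipeline the polarity-bearing topic proportions would come from JST inference:

```python
import numpy as np

def augment_with_topics(bow_features, topic_proportions):
    """Concatenate per-document topic proportions (e.g. polarity-bearing
    topics from a JST-style model) onto the original bag-of-words rows."""
    return np.hstack([bow_features, topic_proportions])

# Toy example: 2 documents, 4 BoW features, 3 polarity-bearing topics.
bow = np.array([[1, 0, 2, 0],
                [0, 1, 0, 3]], dtype=float)
topics = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6]])
augmented = augment_with_topics(bow, topics)
print(augmented.shape)  # (2, 7): an in-domain classifier trains on these rows
```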
A Survey of Multi-task Learning in Natural Language Processing: Regarding Task Relatedness and Training Methods
Multi-task learning (MTL) has become increasingly popular in natural language
processing (NLP) because it improves the performance of related tasks by
exploiting their commonalities and differences. Nevertheless, it is still not well understood how multi-task learning can be implemented based on the relatedness of training tasks. In this survey, we review recent advances in
multi-task learning methods in NLP, with the aim of summarizing them into two
general multi-task training methods based on their task relatedness: (i) joint
training and (ii) multi-step training. We present examples in various NLP
downstream applications, summarize the task relationships and discuss future
directions of this promising topic. Comment: Accepted to EACL 2023 as a regular long paper
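The two training regimes the survey distinguishes can be contrasted with a toy scheduling sketch; the names and the uniform task sampling are our assumptions, not a prescription from the survey:

```python
import random

def joint_schedule(tasks, steps, seed=0):
    """Joint training sketch: interleave tasks by sampling one per update
    step; in practice each step computes that task's loss on a shared model."""
    rng = random.Random(seed)
    return [rng.choice(tasks) for _ in range(steps)]

def multi_step_schedule(tasks, steps_per_task):
    """Multi-step training sketch: finish all of one task's updates before
    the next, e.g. pre-train on an auxiliary task, then fine-tune on the target."""
    return [t for t in tasks for _ in range(steps_per_task)]

print(multi_step_schedule(["aux", "target"], 2))
# ['aux', 'aux', 'target', 'target']
```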
Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives
Over the past few years, adversarial training has become an extremely active
research topic and has been successfully applied to various Artificial
Intelligence (AI) domains. As a potentially crucial technique for the
development of the next generation of emotional AI systems, we herein provide a
comprehensive overview of the application of adversarial training to affective
computing and sentiment analysis. Various representative adversarial training algorithms are explained and discussed in light of the diverse challenges associated with emotional AI systems. Further, we highlight a range of potential future research directions. We expect that this overview will help facilitate the development of adversarial training for affective computing and sentiment analysis in both the academic and industrial communities.
Combination of Domain Knowledge and Deep Learning for Sentiment Analysis of Short and Informal Messages on Social Media
Sentiment analysis has recently emerged as one of the major natural language processing (NLP) tasks in many applications. In particular, as social media channels (e.g. social networks and forums) have become significant sources for brands to observe user opinions about their products, the task has grown increasingly important. However, when working with real data obtained from social media, we observe a high volume of short and informal messages posted by users on those channels. Such data is difficult for existing methods to handle, especially those based on deep learning. In this paper, we propose an approach to this problem. The work extends our previous work, in which we combined the typical deep learning technique of convolutional neural networks with domain knowledge, using the combination for additional training data augmentation and a more suitable loss function. Here, we further improve the architecture with several substantial enhancements: negation-based data augmentation, transfer learning for word embeddings, the combination of word-level and character-level embeddings, and a multi-task learning technique for attaching domain knowledge rules to the learning process. These enhancements, specifically aimed at short and informal messages, yield significant performance improvements in experiments on real datasets. Comment: A preprint of an article accepted for publication by Inderscience in IJCVR on September 201
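The negation-based augmentation enhancement can be sketched crudely as below; the tiny lexicon, the first-match rule, and the binary label flip are our simplifying assumptions, not the paper's actual rule set:

```python
# Crude sketch of negation-based data augmentation for binary sentiment
# (1 = positive, 0 = negative); real systems use much richer negation rules.
NEGATABLE = {"good": "not good", "great": "not great", "love": "don't love"}

def negation_augment(sentence, label):
    """Negate the first sentiment-bearing word found and flip the label;
    return None when the sentence has no word we know how to negate."""
    for word, negated in NEGATABLE.items():
        if word in sentence.split():
            return sentence.replace(word, negated, 1), 1 - label
    return None

print(negation_augment("this phone is good", 1))
# ('this phone is not good', 0)
```

Each augmented pair is added to the training set alongside the original, effectively doubling coverage of negation patterns that short, informal messages use heavily.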
Neural Methods for Answer Passage Retrieval over Sparse Collections
Recent advances in machine learning have allowed information retrieval (IR) techniques to advance beyond handcrafted, domain-specific features. Specifically, deep neural models incorporate varying levels of features to learn whether a document answers the information need of a query. However, these neural models rely on a large number of parameters to learn a relation between a query and a relevant document. This reliance on many parameters, combined with optimization methods that make small updates, requires numerous samples for the neural model to converge on an effective relevance function. This presents a significant obstacle in IR, where relevance judgements are often sparse or noisy and accompanied by a large class imbalance. This is especially true for short-text retrieval, where there is often only one relevant passage, and the problem is exacerbated during training, as excessive negative sampling can result in poor performance. We therefore approach this task through multiple avenues and examine their effectiveness on a non-factoid question answering (QA) task.

We first propose learning local embeddings specific to the relevance information of the collection to improve the performance of an upstream neural model. In doing so, we find significantly improved results over standard pre-trained embeddings, despite developing the embeddings on a small collection that would not suffice for a full language model. Leveraging this local representation, and inspired by recent work in machine translation, we introduce a hybrid embedding-based model that incorporates pre-trained embeddings while dynamically constructing local representations from character embeddings. The hybrid approach relies on pre-trained embeddings to achieve an effective retrieval model and continually adjusts its character-level abstraction to fit a local representation.

We next develop methods to adapt neural models to multiple IR collections, reducing the collection-specific training required and alleviating the need to retrain a neural model's parameters for a new subdomain of a collection. First, we propose an adversarial retrieval model that achieves state-of-the-art performance on out-of-subdomain queries while maintaining in-domain performance. Second, we establish an informed negative sampling approach using a reinforcement learning agent. The agent is trained to directly maximize the performance of a neural IR model on a predefined IR metric by choosing the ranking function from which to sample negative documents. This policy-based sampling exposes the neural model to more of a collection and results in a more consistent neural retrieval model over multiple training instances.

Lastly, we move towards a universal retrieval function. We first introduce a probe-based inspection of neural relevance models through the lens of standard natural language processing tasks, establishing that while seemingly similar QA collections require the same basic abstract information, the final layers that determine relevance differ significantly. We then introduce Universal Retrieval Functions, a method to incorporate new collections using a library of previously trained linear relevance models and a common neural representation.
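The informed negative-sampling idea reads naturally as a bandit/policy problem; a minimal epsilon-greedy sketch follows, where the arm names, the running-mean value update, and the reward signal are illustrative stand-ins for the actual learned policy and IR-metric reward:

```python
import random

class SamplerBandit:
    """Epsilon-greedy sketch of informed negative sampling: pick which
    ranking function to draw negative documents from, and get rewarded
    by the change in the neural model's IR metric (e.g. MRR) afterwards."""
    def __init__(self, arms, epsilon=0.1):
        self.arms = list(arms)      # e.g. ["bm25", "random", "model_topk"]
        self.epsilon = epsilon
        self.counts = {a: 0 for a in self.arms}
        self.values = {a: 0.0 for a in self.arms}

    def choose(self):
        if random.random() < self.epsilon:
            return random.choice(self.arms)                  # explore
        return max(self.arms, key=lambda a: self.values[a])  # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n  # running mean
```

Each training round, `choose()` picks a negative-sampling source, the neural model trains on negatives drawn from it, and `update()` feeds back the resulting metric change, steering sampling toward the sources that most improve the model.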