A Unified Model for Opinion Target Extraction and Target Sentiment Prediction
Target-based sentiment analysis involves opinion target extraction and target
sentiment classification. However, most existing works study only one of these
two sub-tasks, which hinders their practical use. This paper
aims to solve the complete task of target-based sentiment analysis in an
end-to-end fashion, and presents a novel unified model which applies a unified
tagging scheme. Our framework involves two stacked recurrent neural networks:
the upper one predicts the unified tags to produce the final output of the
primary target-based sentiment analysis task; the lower one performs auxiliary
target boundary prediction that guides the upper network to
improve the performance of the primary task. To explore the inter-task
dependency, we propose to explicitly model the constrained transitions from
target boundaries to target sentiment polarities. We also propose to maintain
the sentiment consistency within an opinion target via a gate mechanism which
models the relation between the features for the current word and the previous
word. We conduct extensive experiments on three benchmark datasets, and our
framework achieves consistently superior results.
Comment: AAAI 201
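The gate mechanism described above can be sketched roughly as follows. This is not the authors' implementation: the dimensions are made up, the weights are random, and the exact parameterization of the gate is an illustrative assumption; the point is only that a sigmoid gate blends the current word's feature with the previous word's, encouraging consistent sentiment inside a multi-word target.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

H = 4                                   # hypothetical hidden size
W_g = rng.normal(size=(H, 2 * H))       # gate weights (random here, learned in practice)

def gated_feature(h_curr, h_prev):
    """Blend the current word's feature with the previous word's feature.

    An element-wise gate g in (0, 1) decides how much of the previous
    feature to keep at each dimension.
    """
    g = sigmoid(W_g @ np.concatenate([h_curr, h_prev]))
    return g * h_curr + (1.0 - g) * h_prev

h_prev = rng.normal(size=H)
h_curr = rng.normal(size=H)
f = gated_feature(h_curr, h_prev)
assert f.shape == (H,)
```

Because the gate is element-wise, each output dimension is a convex combination of the corresponding current and previous features.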
A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews
Despite the recent advances in opinion mining for written reviews, few works
have tackled the problem on other sources of reviews. In light of this issue,
we propose a multi-modal approach for mining fine-grained opinions from video
reviews that is able to determine the aspects of the item under review that are
being discussed and the sentiment orientation towards them. Our approach works
at the sentence level without the need for time annotations and uses features
derived from the audio, video and language transcriptions of its contents. We
evaluate our approach on two datasets and show that leveraging the video and
audio modalities consistently provides increased performance over text-only
baselines, providing evidence that these extra modalities are key to better
understanding video reviews.
Comment: Second Grand Challenge and Workshop on Multimodal Language, ACL 202
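The simplest sentence-level fusion consistent with the description above is early fusion: concatenating per-sentence feature vectors from each modality before classification. This is a generic sketch under assumed feature sizes, not the paper's actual architecture.

```python
import numpy as np

# Hypothetical per-sentence feature sizes for each modality.
TEXT_D, AUDIO_D, VIDEO_D = 8, 5, 6

def fuse(text_feat, audio_feat, video_feat):
    """Early fusion: concatenate sentence-level features from the
    transcription, audio, and video into one vector for a downstream
    aspect/sentiment classifier."""
    return np.concatenate([text_feat, audio_feat, video_feat])

rng = np.random.default_rng(1)
fused = fuse(rng.normal(size=TEXT_D),
             rng.normal(size=AUDIO_D),
             rng.normal(size=VIDEO_D))
assert fused.shape == (TEXT_D + AUDIO_D + VIDEO_D,)
```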
Interpretable Architectures and Algorithms for Natural Language Processing
Paper V is excluded from the dissertation for copyright reasons. This thesis has two parts. First, we introduce human-interpretable models for NLP tasks using the Tsetlin Machine (TM). Second, we present an interpretable model using DNNs. The first part combines several TM architectures for various NLP tasks and examines their robustness. We use this model to propose logic-based text classification. We start with basic Word Sense Disambiguation (WSD), where we employ the TM to design novel interpretation techniques based on the frequency of words in a clause. We then tackle a new problem in NLP, aspect-based text classification, using novel feature engineering for the TM. Since the TM operates on Boolean features, it relies on Bag-of-Words (BOW) representations, making it difficult to use pre-trained word embeddings such as GloVe, word2vec, and fastText. Hence, we design a GloVe-embedded TM that significantly enhances the model's performance. In addition, NLP models are sensitive to distribution bias arising from spurious correlations, so we employ the TM to design text classification that is robust against spurious correlations.
The second part of the thesis consists of an interpretable model using DNNs, where we design a simple solution for a complex position-dependent NLP task. Since the TM's interpretability comes at the cost of performance, we propose a DNN-based architecture that applies a masking scheme to LSTM/GRU-based models, easing human interpretation via the attention mechanism. Finally, we take advantage of both models and design an ensemble that integrates the TM's interpretable information into the DNN for better visualization of attention weights.
Our proposed model can be efficiently integrated into a fully explainable NLP pipeline that supports trustworthy AI. Overall, our models show excellent results and interpretability on several open-source NLP datasets. Thus, we believe that by combining the novel interpretation of the TM, the masking technique in the neural network, and the integrated ensemble model, we can build a simple yet effective platform for explainable NLP applications wherever necessary.
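A GloVe-embedded TM presupposes some way of turning real-valued embeddings into the Boolean literals a Tsetlin Machine consumes. The toy sketch below shows one common booleanization scheme (thresholding each dimension at its mean); the rule and sizes are our own illustrative assumptions, not the thesis's method.

```python
import numpy as np

# Toy "embeddings": rows are words, columns are embedding dimensions.
rng = np.random.default_rng(2)
emb = rng.normal(size=(10, 6))

def booleanize(embeddings):
    """Turn real-valued embeddings into Boolean features by thresholding
    each dimension at its mean across the vocabulary, so a Tsetlin
    Machine (which operates on Boolean literals) can consume them."""
    return (embeddings > embeddings.mean(axis=0)).astype(np.uint8)

bits = booleanize(emb)
assert bits.shape == emb.shape
```

Each word is then represented by a fixed-length bit vector instead of a sparse BOW vector.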
Multi-Zone Unit for Recurrent Neural Networks
Recurrent neural networks (RNNs) have been widely used to deal with sequence
learning problems. The input-dependent transition function, which folds new
observations into hidden states to sequentially construct fixed-length
representations of arbitrary-length sequences, plays a critical role in RNNs.
Based on single-space composition, transition functions in existing RNNs often
have difficulty in capturing complicated long-range dependencies. In this
paper, we introduce a new Multi-zone Unit (MZU) for RNNs. The key idea is to
design a transition function capable of modeling multi-space
composition. The MZU consists of three components: zone generation, zone
composition, and zone aggregation. Experimental results on multiple datasets of
the character-level language modeling task and the aspect-based sentiment
analysis task demonstrate the superiority of the MZU.
Comment: Accepted at AAAI 202
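The three MZU components can be sketched as one transition step. All weight shapes and the tanh/softmax choices here are illustrative assumptions, not the paper's exact equations; the sketch only shows the generate-compose-aggregate flow.

```python
import numpy as np

rng = np.random.default_rng(3)
H, Z = 6, 3  # hidden size and (hypothetical) number of zones

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Random stand-ins for learned parameters.
W_gen = rng.normal(size=(Z * H, H))
W_comp = rng.normal(size=(H, H))
w_agg = rng.normal(size=H)

def mzu_step(x_t, h_prev):
    """One MZU-style transition: generate several candidate zones from
    the input, compose each with the previous state, then aggregate."""
    zones = np.tanh((W_gen @ x_t).reshape(Z, H))   # zone generation
    composed = np.tanh(zones + W_comp @ h_prev)    # zone composition
    weights = softmax(composed @ w_agg)            # one score per zone
    return weights @ composed                      # zone aggregation -> h_t

h = mzu_step(rng.normal(size=H), np.zeros(H))
assert h.shape == (H,)
```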
AX-MABSA: A Framework for Extremely Weakly Supervised Multi-label Aspect Based Sentiment Analysis
Aspect Based Sentiment Analysis is a dominant research area with potential
applications in social media analytics, business, finance, and health. Prior
works in this area are primarily based on supervised methods, with a few
techniques using weak supervision limited to predicting a single aspect
category per review sentence. In this paper, we present an extremely weakly
supervised multi-label Aspect Category Sentiment Analysis framework which does
not use any labelled data. We rely only on a single word per class as
initial indicative information. We further propose an automatic word selection
technique to choose these seed categories and sentiment words. We explore
unsupervised language model post-training to improve the overall performance,
and propose a multi-label generator model to generate multiple aspect
category-sentiment pairs per review sentence. Experiments conducted on four
benchmark datasets showcase our method to outperform other weakly supervised
baselines by a significant margin.
Comment: to be published in EMNLP 202
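The "single word per class" setup can be illustrated with a minimal seed-matching labeler. The seed words and categories below are hypothetical examples, and real extremely-weak supervision would go well beyond exact matching (e.g. via language-model post-training), but the sketch shows how labels arise with no annotated data.

```python
# Hypothetical seed words: one indicative word per aspect category.
SEEDS = {"food": "pizza", "service": "waiter", "price": "cheap"}

def weak_label(sentence):
    """Assign every aspect category whose seed word occurs in the
    sentence -- multi-label, with no labelled training data."""
    tokens = sentence.lower().split()
    return sorted(cat for cat, seed in SEEDS.items() if seed in tokens)

labels = weak_label("The pizza was great but the waiter was rude")
assert labels == ["food", "service"]
```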
A Global Context Mechanism for Sequence Labeling
Sequential labeling tasks necessitate the computation of sentence
representations for each word within a given sentence. With the advent of
advanced pretrained language models, one common approach involves incorporating
a BiLSTM layer to bolster the sequence structure information at the output
level. Nevertheless, it has been empirically demonstrated (P.-H. Li et al.,
2020) that the potential of BiLSTM for generating sentence representations for
sequence labeling tasks is constrained, primarily due to the amalgamation of
fragments from past and future sentence representations to form a complete
sentence representation. In this study, we discovered that strategically
integrating the whole-sentence representation, which exists in the first and
last cells of the BiLSTM, into the representation of each cell could
markedly enhance the F1 score and accuracy. Using BERT embedded within BiLSTM
as an illustration, we conducted exhaustive experiments on nine datasets for
sequence labeling tasks, encompassing named entity recognition (NER), part of
speech (POS) tagging and End-to-End Aspect-Based sentiment analysis (E2E-ABSA).
We noted significant improvements in F1 scores and accuracy across all examined
datasets.
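The global context idea above can be sketched with plain arrays: the last forward cell and the first backward cell of a BiLSTM jointly summarize the whole sentence, and that summary is appended to every position's own representation. This is a structural sketch with random stand-in states, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(4)
T, H = 5, 3  # sentence length and per-direction hidden size

# Stand-ins for a BiLSTM's forward and backward hidden states.
fwd = rng.normal(size=(T, H))
bwd = rng.normal(size=(T, H))

def with_global_context(fwd, bwd):
    """Append the whole-sentence representation (last forward cell +
    first backward cell) to every position's own representation."""
    global_repr = np.concatenate([fwd[-1], bwd[0]])        # shape (2H,)
    per_pos = np.concatenate([fwd, bwd], axis=1)           # shape (T, 2H)
    return np.concatenate(
        [per_pos, np.tile(global_repr, (len(fwd), 1))], axis=1)

out = with_global_context(fwd, bwd)
assert out.shape == (T, 4 * H)
```

Every row now carries both its local context and an identical copy of the sentence-level summary.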
Octa: Omissions and Conflicts in Target-Aspect Sentiment Analysis
Sentiments in opinionated text are often determined by both aspects and
target words (or targets). We observe that targets and aspects interrelate in
subtle ways, often yielding conflicting sentiments. Thus, a naive aggregation
of sentiments from aspects and targets treated separately, as in existing
sentiment analysis models, impairs performance.
We propose Octa, an approach that jointly considers aspects and targets when
inferring sentiments. To capture and quantify relationships between targets and
context words, Octa uses a selective self-attention mechanism that handles
implicit or missing targets. Specifically, Octa involves two layers of
attention mechanisms for, respectively, selective attention between targets and
context words and attention over words based on aspects. On benchmark datasets,
Octa outperforms leading models by a large margin, yielding (absolute) gains in
accuracy of 1.6% to 4.3%.
Comment: Accepted by Findings of EMNLP 202
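The two attention layers described above can be sketched in sequence: first attention between a target and the context words, then attention over the reweighted words based on the aspect. The vectors are random stand-ins, and the selective thresholding that handles missing targets is omitted; this is only the two-layer flow, not Octa itself.

```python
import numpy as np

rng = np.random.default_rng(5)
T, D = 6, 4  # context length and feature size

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

context = rng.normal(size=(T, D))   # context word representations
target = rng.normal(size=D)         # target word representation
aspect = rng.normal(size=D)         # aspect embedding

# Layer 1: attention between the target and context words.
sel_weights = softmax(context @ target)
target_aware = sel_weights[:, None] * context     # reweighted context

# Layer 2: attention over words based on the aspect.
asp_weights = softmax(target_aware @ aspect)
sentence_repr = asp_weights @ target_aware        # shape (D,)
assert sentence_repr.shape == (D,)
```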