PADA: A Prompt-based Autoregressive Approach for Adaptation to Unseen Domains
Natural Language Processing algorithms have made incredible progress
recently, but they still struggle when applied to out-of-distribution examples.
In this paper, we address a very challenging and previously underexplored
version of this domain adaptation problem. In our setup an algorithm is trained
on several source domains, and then applied to examples from an unseen domain
that is unknown at training time. Particularly, no examples, labeled or
unlabeled, or any other knowledge about the target domain are available to the
algorithm at training time. We present PADA: A Prompt-based Autoregressive
Domain Adaptation algorithm, based on the T5 model. Given a test example, PADA
first generates a unique prompt and then, conditioned on this prompt, labels
the example with respect to the NLP task. The prompt is a sequence of
unrestricted length, consisting of pre-defined Domain Related Features (DRFs)
that characterize each of the source domains. Intuitively, the prompt is a
unique signature that maps the test example to the semantic space spanned by
the source domains. In experiments with two tasks: Rumour Detection and
Multi-Genre Natural Language Inference (MNLI), for a total of 10 multi-source
adaptation scenarios, PADA strongly outperforms state-of-the-art approaches and
additional strong baselines.
Comment: First two authors contributed equally to this work. Our code and data
are available at: https://github.com/eyalbd2/PAD
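
The two-step flow described in the PADA abstract (generate a prompt of DRFs, then label the example conditioned on that prompt) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the domain names, the DRF lists, and the helper names `generate_prompt` and `build_task_input` are hypothetical, and simple keyword overlap stands in for the autoregressive T5 generation the abstract describes. It shows only how data might flow between the two steps, not the authors' implementation.

```python
from typing import Dict, List

# Hypothetical Domain Related Features (DRFs) for two made-up source domains.
SOURCE_DRFS: Dict[str, List[str]] = {
    "airline": ["flight", "delay", "crew"],
    "restaurant": ["menu", "waiter", "dish"],
}


def generate_prompt(example: str, drfs: Dict[str, List[str]]) -> str:
    """Step 1 (stand-in): pick the DRFs that occur in the example.

    PADA generates the prompt autoregressively with T5; the keyword
    overlap used here is only a placeholder for that model.
    """
    tokens = set(example.lower().split())
    selected = [f for feats in drfs.values() for f in feats if f in tokens]
    return ", ".join(selected)


def build_task_input(example: str, prompt: str) -> str:
    """Step 2 input: the example prefixed with its generated prompt."""
    return f"{prompt}: {example}" if prompt else example


# An example from an unseen domain that touches both source domains.
example = "the waiter said our flight themed menu was sold out"
prompt = generate_prompt(example, SOURCE_DRFS)
task_input = build_task_input(example, prompt)
print(task_input)
```

In this sketch the prompt mixes features from both source domains, which matches the abstract's intuition that the prompt maps a test example into the semantic space spanned by the source domains. A real system would pass `task_input` to the task model to obtain the label.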
Sentiment Analysis of German Twitter
This thesis explores the ways in which people express their opinions on German
Twitter, examines current approaches to automatic mining of these feelings, and
proposes novel methods, which outperform state-of-the-art techniques. For this
purpose, I introduce a new corpus of German tweets that have been manually
annotated with sentiments, their targets and holders, as well as polar terms
and their contextual modifiers. Using these data, I explore four major areas of
sentiment research: (i) generation of sentiment lexicons, (ii) fine-grained
opinion mining, (iii) message-level polarity classification, and (iv)
discourse-aware sentiment analysis. In the first task, I compare three popular
groups of lexicon generation methods: dictionary-, corpus-, and
word-embedding-based ones, finding that dictionary-based systems generally
yield better lexicons than the last two groups. Apart from this, I propose a
linear projection algorithm, whose results surpass many existing automatic
lexicons. Afterwards, in the second task, I examine two common approaches to
automatic prediction of sentiments, sources, and targets: conditional random
fields and recurrent neural networks, obtaining higher scores with the former
model and improving these results even further by redefining the structure of
CRF graphs. When dealing with message-level polarity classification, I
juxtapose three major sentiment paradigms: lexicon-, machine-learning-, and
deep-learning-based systems, and try to unite the first and last of these
groups by introducing a bidirectional neural network with lexicon-based
attention. Finally, in order to make the new classifier aware of discourse
structure, I let it separately analyze the elementary discourse units of each
microblog and infer the overall polarity of a message from the scores of its
EDUs with the help of two new approaches: latent-marginalized CRFs and
Recursive Dirichlet Process.
Comment: Ph.D. Dissertatio