"i have a feeling trump will win..................": Forecasting Winners and Losers from User Predictions on Twitter
Social media users often make explicit predictions about upcoming events.
Such statements vary in the degree of certainty the author expresses toward the
outcome: "Leonardo DiCaprio will win Best Actor" vs. "Leonardo DiCaprio may win"
or "No way Leonardo wins!". Can popular beliefs on social media predict who
will win? To answer this question, we build a corpus of tweets annotated for
veridicality on which we train a log-linear classifier that detects positive
veridicality with high precision. We then forecast uncertain outcomes using the
wisdom of crowds, by aggregating users' explicit predictions. Our method for
forecasting winners is fully automated, relying only on a set of contenders as
input. It requires no training data of past outcomes and outperforms sentiment
and tweet volume baselines on a broad range of contest prediction tasks. We
further demonstrate how our approach can be used to measure the reliability of
individual accounts' predictions and retrospectively identify surprise
outcomes.
Comment: Accepted at EMNLP 2017 (long paper).
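The aggregation step the abstract describes (count only positive-veridicality predictions, then take the most-predicted contender) can be sketched as follows. The function name, the label set, and the example tweets are illustrative, not taken from the paper:

```python
from collections import Counter

def forecast_winner(predictions):
    """Forecast a contest winner by aggregating user predictions.

    `predictions` is a list of (contender, veridicality) pairs, where the
    veridicality label is what a classifier assigned to each tweet
    (hypothetical labels: 'positive', 'uncertain', 'negative').
    Only positive-veridicality predictions count as votes.
    """
    votes = Counter(c for c, v in predictions if v == 'positive')
    return votes.most_common(1)[0][0] if votes else None

# Illustrative input: two confident predictions for DiCaprio, one hedged
# prediction ("may win") that is ignored, and one vote for a rival.
tweets = [
    ('DiCaprio', 'positive'),
    ('DiCaprio', 'positive'),
    ('DiCaprio', 'uncertain'),   # hedged, not counted as a vote
    ('Damon', 'positive'),
]
```

As in the paper's setup, only the set of contenders is needed as input; no training data about past outcomes is involved in the aggregation itself.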
An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols
We describe an effort to annotate a corpus of natural language instructions
consisting of 622 wet lab protocols to facilitate automatic or semi-automatic
conversion of protocols into a machine-readable format and benefit biological
research. Experimental results demonstrate the utility of our corpus for
developing machine learning approaches to shallow semantic parsing of
instructional texts. We make our annotated Wet Lab Protocol Corpus available to
the research community.
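Shallow semantic parsing of an instruction means extracting its action and typed arguments. A very rough, regex-based sketch of that output format (the slot names and the pattern are illustrative, not the paper's annotation scheme):

```python
import re

def parse_instruction(text):
    """Toy sketch: pull the action verb, an amount, and a reagent out of a
    wet-lab instruction like 'Add 50 ul of buffer to the tube'.
    A real system would use a trained shallow semantic parser."""
    m = re.match(r'(\w+)\s+(\d+\s*\w+)\s+of\s+([\w\s]+?)(?:\s+to\b.*)?$', text)
    if not m:
        return None
    return {'action': m.group(1).lower(),
            'amount': m.group(2),
            'reagent': m.group(3)}
```

The point is only to show the target structure; the corpus exists precisely because instructions are too varied for patterns like this.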
Adversarial Learning for Neural Dialogue Generation
In this paper, drawing intuition from the Turing test, we propose using
adversarial training for open-domain dialogue generation: the system is trained
to produce sequences that are indistinguishable from human-generated dialogue
utterances. We cast the task as a reinforcement learning (RL) problem where we
jointly train two systems, a generative model to produce response sequences,
and a discriminator---analogous to the human evaluator in the Turing test---to
distinguish between the human-generated dialogues and the machine-generated
ones. The outputs from the discriminator are then used as rewards for the
generative model, pushing the system to generate dialogues that closely resemble
human dialogues.
In addition to adversarial training we describe a model for adversarial {\em
evaluation} that uses success in fooling an adversary as a dialogue evaluation
metric, while avoiding a number of potential pitfalls. Experimental results on
several metrics, including adversarial evaluation, demonstrate that the
adversarially-trained system generates higher-quality responses than previous
baselines.
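The core training signal described above can be illustrated with a minimal REINFORCE loop: the generator is reduced to a softmax over two canned responses, and a stand-in "discriminator" score (how human-like each response looks) is used as the reward. All names, scores, and responses here are toy assumptions, not the paper's model:

```python
import math, random

random.seed(0)

# Two canned candidate responses and hypothetical discriminator scores
# (probability the discriminator thinks the response is human-generated).
responses = ["i don't know", "that sounds great, tell me more"]
disc_score = {responses[0]: 0.1, responses[1]: 0.9}

logits = [0.0, 0.0]  # toy "generator": a softmax over the two responses

def softmax(z):
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

lr = 0.5
for _ in range(200):
    p = softmax(logits)
    i = random.choices([0, 1], weights=p)[0]  # sample a response
    reward = disc_score[responses[i]]         # discriminator output as reward
    # REINFORCE update: grad of log p(i) w.r.t. logit j is (1[j==i] - p[j])
    for j in range(2):
        logits[j] += lr * reward * ((1.0 if j == i else 0.0) - p[j])

p = softmax(logits)
```

After training, the generator puts most of its probability mass on the response the discriminator scores as more human-like, which is the direction of pressure the adversarial setup applies.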
Deep Reinforcement Learning for Dialogue Generation
Recent neural models of dialogue generation offer great promise for
generating responses for conversational agents, but tend to be shortsighted,
predicting utterances one at a time while ignoring their influence on future
outcomes. Modeling the future direction of a dialogue is crucial to generating
coherent, interesting dialogues, a need which led traditional NLP models of
dialogue to draw on reinforcement learning. In this paper, we show how to
integrate these goals, applying deep reinforcement learning to model future
reward in chatbot dialogue. The model simulates dialogues between two virtual
agents, using policy gradient methods to reward sequences that display three
useful conversational properties: informativity (non-repetitive turns),
coherence, and ease of answering (related to forward-looking function). We
evaluate our model on diversity and length, as well as with human judges, showing
that the proposed algorithm generates more interactive responses and manages to
foster a more sustained conversation in dialogue simulation. This work marks a
first step towards learning a neural conversational model based on the
long-term success of dialogues.
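A composite reward over the three conversational properties named above might look like the sketch below. The three scoring rules are deliberately crude stand-ins (the paper defines each term over model probabilities, not surface heuristics), and the dull-response list and weights are assumptions:

```python
def dialogue_reward(response, history, w=(1/3, 1/3, 1/3)):
    """Toy composite reward in the spirit of the paper:
    - informativity: penalize repeating an earlier turn verbatim
    - coherence: crude word overlap with the previous turn
    - ease of answering: penalize known dead-end replies
    """
    dull = {"i don't know", "i have no idea"}   # hypothetical dead-end list
    informativity = 0.0 if response in history else 1.0
    last = history[-1] if history else ""
    resp_words = set(response.split())
    coherence = len(resp_words & set(last.split())) / max(len(resp_words), 1)
    ease = 0.0 if response in dull else 1.0
    return w[0] * informativity + w[1] * coherence + w[2] * ease
```

Under this toy scoring, a contentful reply that echoes part of the previous turn outscores a dead-end reply, which is the ordering the policy-gradient training is meant to reinforce.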
Are Large Language Models Robust Coreference Resolvers?
Recent work on extending coreference resolution across domains and languages
relies on annotated data in both the target domain and language. At the same
time, pre-trained large language models (LMs) have been reported to exhibit
strong zero- and few-shot learning abilities across a wide range of NLP tasks.
However, prior work mostly studied this ability using artificial sentence-level
datasets such as the Winograd Schema Challenge. In this paper, we assess the
feasibility of prompt-based coreference resolution by evaluating
instruction-tuned language models on difficult, linguistically-complex
coreference benchmarks (e.g., CoNLL-2012). We show that prompting for
coreference can outperform current unsupervised coreference systems, although
this approach appears to be reliant on high-quality mention detectors. Further
investigations reveal that instruction-tuned LMs generalize surprisingly well
across domains, languages, and time periods; yet continued fine-tuning of
neural models should still be preferred if small amounts of annotated examples
are available.
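Prompt-based coreference of the kind evaluated above amounts to turning a marked mention into a question for an instruction-tuned LM. A minimal template sketch (the exact prompt format the paper uses is not reproduced here, and the marking convention is an assumption):

```python
def coref_prompt(text, mention):
    """Build a hypothetical coreference prompt: mark the target mention
    with asterisks and ask the model what it refers to."""
    marked = text.replace(mention, f"*{mention}*", 1)
    return (f"Text: {marked}\n"
            f"Question: In the text above, what does \"{mention}\" "
            f"(marked with asterisks) refer to?\n"
            f"Answer:")
```

The model's completion would then need to be matched back to a candidate mention span, which is where the reliance on high-quality mention detection noted in the abstract comes in.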
- …