A Generative Approach for Script Event Prediction via Contrastive Fine-tuning
Script event prediction aims to predict the subsequent event given the
context. This requires the capability to infer the correlations between events.
Recent works have attempted to improve event correlation reasoning by using
pretrained language models and incorporating external knowledge (e.g.,
discourse relations). Though promising results have been achieved, some
challenges still remain. First, the pretrained language models adopted by
current works ignore event-level knowledge, resulting in an inability to
capture the correlations between events well. Second, modeling correlations
between events with discourse relations is limited because it can only capture
explicit correlations between events with discourse markers, and cannot capture
many implicit correlations. To this end, we propose a novel generative approach
for this task, in which a pretrained language model is fine-tuned with an
event-centric pretraining objective and predicts the next event within a
generative paradigm. Specifically, we first introduce a novel event-level blank
infilling strategy as the learning objective to inject event-level knowledge
into the pretrained language model, and then design a likelihood-based
contrastive loss for fine-tuning the generative model. Instead of using an
additional prediction layer, we perform prediction by using sequence
likelihoods generated by the generative model. Our approach models correlations
between events in a soft way without any external knowledge. The
likelihood-based prediction eliminates the need to use additional networks to
make predictions and is somewhat interpretable since it scores each word in the
event. Experimental results on the multi-choice narrative cloze (MCNC) task
demonstrate that our approach achieves better results than other
state-of-the-art baselines. Our code will be available at
https://github.com/zhufq00/mcnc
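The likelihood-based prediction and contrastive loss described above can be sketched in plain Python: each candidate next event is scored by the total log-probability of its tokens under the generative model, and the gold event's likelihood is pushed above each distractor's by a margin. The `token_prob` table stands in for a real pretrained LM, and the margin form of the loss is an assumption for illustration, not the paper's exact objective.

```python
import math

# Hypothetical per-token conditional probabilities that a generative LM
# might assign given the script context (illustrative values only).
token_prob = {
    "pays": 0.20, "bill": 0.35,
    "flies": 0.02, "kite": 0.01,
}

def sequence_loglik(event_tokens):
    # Likelihood-based scoring: sum of token log-probabilities,
    # so no additional prediction layer is required. Because every
    # token contributes a term, the score is also somewhat
    # interpretable at the word level.
    return sum(math.log(token_prob[t]) for t in event_tokens)

def contrastive_loss(pos, neg, margin=1.0):
    # One possible likelihood-based contrastive objective (assumed
    # margin form): the gold event should out-score each distractor.
    return max(0.0, margin - (sequence_loglik(pos) - sequence_loglik(neg)))

candidates = [["pays", "bill"], ["flies", "kite"]]
best = max(candidates, key=sequence_loglik)
print(best)  # ['pays', 'bill'] -- the plausible continuation wins
```

In a multi-choice setting like MCNC, prediction is simply the argmax over candidate sequence likelihoods, which is why no extra classification head is needed.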
Session-based Recommendation with Graph Neural Networks
The problem of session-based recommendation aims to predict user actions
based on anonymous sessions. Previous methods model a session as a sequence and
estimate user representations besides item representations to make
recommendations. Though these methods achieve promising results, they are
insufficient to obtain accurate user vectors in sessions and neglect complex
transitions between items. To obtain accurate item embeddings and take complex transitions of items
into account, we propose a novel method, i.e. Session-based Recommendation with
Graph Neural Networks, SR-GNN for brevity. In the proposed method, session
sequences are modeled as graph-structured data. Based on the session graph, the
GNN can capture complex transitions between items, which are difficult to
reveal with previous conventional sequential methods. Each session is then represented as
the composition of the global preference and the current interest of that
session using an attention network. Extensive experiments conducted on two real
datasets show that SR-GNN evidently outperforms the state-of-the-art
session-based recommendation methods consistently.Comment: 9 pages, 4 figures, accepted by AAAI Conference on Artificial
Intelligence (AAAI-19
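The first step of the approach above, turning a session sequence into graph-structured data, can be sketched directly: each consecutive click becomes a directed edge, and repeated items collapse into a single node, which is exactly how a graph exposes transitions that a flat sequence hides. The item IDs below are hypothetical; the real model additionally normalizes edge weights and runs a gated GNN over this graph.

```python
from collections import defaultdict

def session_to_graph(session):
    """Convert a session (list of item IDs) into a directed transition graph.

    Each consecutive pair (v_i -> v_{i+1}) becomes an edge; repeated
    items map to one node, so revisits show up as cycles.
    """
    nodes = sorted(set(session))
    edges = defaultdict(int)
    for src, dst in zip(session, session[1:]):
        edges[(src, dst)] += 1  # count repeated transitions
    return nodes, dict(edges)

# A session revisiting item 2 yields a cycle a pure sequence model
# cannot represent as shared structure.
nodes, edges = session_to_graph([1, 2, 3, 2, 4])
print(nodes)   # [1, 2, 3, 4]
print(edges)   # {(1, 2): 1, (2, 3): 1, (3, 2): 1, (2, 4): 1}
```

The node embeddings produced by message passing over these edges are then pooled (via attention, per the abstract) into the session's global preference and current interest.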
Paragraph-level Commonsense Transformers with Recurrent Memory
Human understanding of narrative texts requires making commonsense inferences
beyond what is stated explicitly in the text. A recent model, COMET, can
generate such implicit commonsense inferences along several dimensions such as
pre- and post-conditions, motivations, and mental states of the participants.
However, COMET was trained on commonsense inferences of short phrases, and is
therefore discourse-agnostic. When presented with each sentence of a
multi-sentence narrative, it might generate inferences that are inconsistent
with the rest of the narrative.
We present the task of discourse-aware commonsense inference. Given a
sentence within a narrative, the goal is to generate commonsense inferences
along predefined dimensions, while maintaining coherence with the rest of the
narrative. Such large-scale paragraph-level annotation is costly and hard to
obtain, so we use available sentence-level annotations to efficiently and
automatically construct a distantly supervised corpus.
Using this corpus, we train PARA-COMET, a discourse-aware model that
incorporates paragraph-level information to generate coherent commonsense
inferences from narratives. PARA-COMET captures both semantic knowledge
pertaining to prior world knowledge, and episodic knowledge involving how
current events relate to prior and future events in a narrative. Our results
show that PARA-COMET outperforms the sentence-level baselines, particularly in
generating inferences that are both coherent and novel.
Comment: AAAI 202
MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation
Understanding events in texts is a core objective of natural language
understanding, which requires detecting event occurrences, extracting event
arguments, and analyzing inter-event relationships. However, due to the
annotation challenges brought by task complexity, a large-scale dataset
covering the full process of event understanding has long been absent. In this
paper, we introduce MAVEN-Arg, which augments the MAVEN dataset with event
argument annotations, making the first all-in-one dataset supporting event
detection, event argument extraction (EAE), and event relation extraction. As
an EAE benchmark, MAVEN-Arg offers three main advantages: (1) a comprehensive
schema covering 162 event types and 612 argument roles, all with expert-written
definitions and examples; (2) a large data scale, containing 98,591 events and
290,613 arguments obtained with laborious human annotation; (3) exhaustive
annotation supporting all task variants of EAE, covering both entity and
non-entity event arguments at the document level. Experiments indicate that
MAVEN-Arg is quite challenging for both fine-tuned EAE models and proprietary
large language models (LLMs). Furthermore, to demonstrate the benefits of an
all-in-one dataset, we preliminarily explore a potential application, future
event prediction, with LLMs. MAVEN-Arg and our code can be obtained from
https://github.com/THU-KEG/MAVEN-Argument.
Comment: Working in progress