563 research outputs found
Open event extraction from online text using a generative adversarial network
To extract the structured representations of open-domain events, Bayesian graphical models have made some progress. However, these approaches typically assume that all words in a document are generated from a single event. While this may be true for short text such as tweets, such an assumption does not generally hold for long text such as news articles. Moreover, Bayesian graphical models often rely on Gibbs sampling for parameter inference, which may take a long time to converge. To address these limitations, we propose an event extraction model based on Generative Adversarial Nets, called Adversarial-neural Event Model (AEM). AEM models an event with a Dirichlet prior and uses a generator network to capture the patterns underlying latent events. A discriminator is used to distinguish documents reconstructed from the latent events from the original documents. A byproduct of the discriminator is that the features generated by the learned discriminator network allow the visualization of the extracted events. Our model has been evaluated on two Twitter datasets and a news article dataset. Experimental results show that our model outperforms the baseline approaches on all the datasets, with more significant improvements observed on the news article dataset, where an increase of 15% is observed in F-measure.
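As a rough illustration of the idea of drawing a latent event distribution from a Dirichlet prior and passing it through a generator, the following toy sketch may help. It is not the AEM architecture: the single linear layer, the sizes, the random weights, and the names `sample_dirichlet` and `toy_generator` are all assumptions made for this example, and no discriminator or training loop is shown.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_generator(theta, weights):
    """Map an event distribution to a word distribution.

    Hypothetical stand-in for a generator network: one linear layer
    (weights[k][v] links event k to word v) followed by a softmax.
    """
    vocab_size = len(weights[0])
    scores = [
        sum(theta[k] * weights[k][v] for k in range(len(theta)))
        for v in range(vocab_size)
    ]
    return softmax(scores)


rng = random.Random(0)
num_events, vocab_size = 3, 5           # assumed toy sizes
alpha = [0.5] * num_events              # assumed symmetric Dirichlet prior
weights = [[rng.uniform(-1, 1) for _ in range(vocab_size)]
           for _ in range(num_events)]  # untrained, random weights

theta = sample_dirichlet(alpha, rng)    # latent event distribution
word_dist = toy_generator(theta, weights)
```

In an adversarial setup, a discriminator would then be trained to tell such generated word distributions apart from those of real documents; that part is omitted here.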
Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction
Most existing event extraction (EE) methods merely extract event arguments
within the sentence scope. However, such sentence-level EE methods struggle to
handle soaring amounts of documents from emerging applications, such as
finance, legislation, health, etc., where event arguments always scatter across
different sentences, and even multiple such event mentions frequently co-exist
in the same document. To address these challenges, we propose a novel
end-to-end model, Doc2EDAG, which can generate an entity-based directed acyclic
graph to fulfill the document-level EE (DEE) effectively. Moreover, we
reformalize a DEE task with the no-trigger-words design to ease the
document-level event labeling. To demonstrate the effectiveness of Doc2EDAG, we
build a large-scale real-world dataset consisting of Chinese financial
announcements with the challenges mentioned above. Extensive experiments with
comprehensive analyses illustrate the superiority of Doc2EDAG over
state-of-the-art methods. Data and code can be found at
https://github.com/dolphin-zs/Doc2EDAG.
Comment: Accepted by EMNLP 201
Learning from Very Few Samples: A Survey
Few sample learning (FSL) is significant and challenging in the field of
machine learning. The capability of learning and generalizing from very few
samples successfully is a noticeable demarcation separating artificial
intelligence and human intelligence since humans can readily establish their
cognition to novelty from just a single or a handful of examples whereas
machine learning algorithms typically entail hundreds or thousands of
supervised samples to guarantee generalization ability. Despite a long
history dating back to the early 2000s and widespread attention in recent
years with booming deep learning technologies, few surveys or reviews of
FSL have been available until now. In this context, we extensively review 300+ papers
of FSL spanning from the 2000s to 2019 and provide a timely and comprehensive
survey for FSL. In this survey, we review the evolution history as well as the
current progress on FSL, categorize FSL approaches into the generative model
based and discriminative model based kinds in principle, and place
particular emphasis on the meta learning based FSL approaches. We also summarize
several recently emerging extensional topics of FSL and review the latest
advances on these topics. Furthermore, we highlight the important FSL
applications covering many research hotspots in computer vision, natural
language processing, audio and speech, reinforcement learning and robotics, data
analysis, etc. Finally, we conclude the survey with a discussion on promising
trends in the hope of providing guidance and insights to follow-up research.
Comment: 30 page
MLBiNet: A Cross-Sentence Collective Event Detection Network
We consider the problem of collectively detecting multiple events,
particularly in cross-sentence settings. The key to dealing with the problem is
to encode semantic information and model event inter-dependency at a
document-level. In this paper, we reformulate it as a Seq2Seq task and propose
a Multi-Layer Bidirectional Network (MLBiNet) to capture the document-level
association of events and semantic information simultaneously. Specifically, a
bidirectional decoder is firstly devised to model event inter-dependency within
a sentence when decoding the event tag vector sequence. Secondly, an
information aggregation module is employed to aggregate sentence-level semantic
and event tag information. Finally, we stack multiple bidirectional decoders
and feed cross-sentence information, forming a multi-layer bidirectional
tagging architecture to iteratively propagate information across sentences. We
show that our approach provides significant improvement in performance compared
to the current state-of-the-art results.
Comment: Accepted by ACL 202
Reinforced Imitative Graph Learning for Mobile User Profiling
Mobile user profiling refers to the efforts of extracting users’ characteristics from mobile activities. To capture how user characteristics vary over time and generate effective user profiles, we propose an imitation-based mobile user profiling framework. Considering the objective of teaching an autonomous agent to imitate user mobility based on the user’s profile, the user profile is the most accurate when the agent can perfectly mimic the user behavior patterns. The profiling framework is formulated as a reinforcement learning task, where an agent is a next-visit planner, an action is a POI that a user will visit next, and the state of the environment is a fused representation of a user and spatial entities. An event in which a user visits a POI will construct a new state, which helps the agent predict users’ mobility more accurately. In the framework, we introduce a spatial Knowledge Graph (KG) to characterize the semantics of user visits over connected spatial entities. Additionally, we develop a mutual-updating strategy to quantify the state that evolves over time. Along these lines, we develop a reinforcement imitative graph learning framework for mobile user profiling. Finally, we conduct extensive experiments to demonstrate the superiority of our approach.
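The reinforcement learning formulation described above (the agent plans the next visit, the action is a POI, and each visit constructs a new state) might be pictured with a toy loop such as the one below. Everything here is an assumption for illustration: the `State` class uses a plain visit history in place of the fused user and spatial representation, the 0/1 reward for matching the real visit is a guess rather than the paper's reward, and `most_frequent_policy` is a trivial baseline, not the proposed agent.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Hypothetical state: a user id plus the POIs visited so far."""
    user_id: str
    visited: tuple = ()


def step(state, action, actual_next_poi):
    """Score the agent's predicted POI, then build the next state from the real visit.

    The 0/1 imitation reward is an assumption made for this sketch.
    """
    reward = 1.0 if action == actual_next_poi else 0.0
    next_state = State(state.user_id, state.visited + (actual_next_poi,))
    return next_state, reward


def most_frequent_policy(state, default_poi):
    """Toy next-visit planner: predict the POI visited most often so far."""
    if not state.visited:
        return default_poi
    # sorted() makes tie-breaking deterministic (alphabetical order).
    return max(sorted(set(state.visited)), key=state.visited.count)


# Made-up trajectory for one user.
trajectory = ["cafe", "office", "cafe", "cafe"]
state = State("user-1")
total_reward = 0.0
for actual in trajectory:
    action = most_frequent_policy(state, default_poi="cafe")
    state, reward = step(state, action, actual)
    total_reward += reward
```

The closer the agent's predicted visits track the real trajectory, the higher the accumulated reward, which mirrors the abstract's claim that the profile is most accurate when the agent can mimic the user.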
PESE: Event Structure Extraction using Pointer Network based Encoder-Decoder Architecture
The task of event extraction (EE) aims to find the events and event-related
argument information from the text and represent them in a structured format.
Most previous works try to solve the problem by separately identifying multiple
substructures and aggregating them to get the complete event structure. The
problem with these methods is that they fail to identify all the interdependencies
among the event participants (event-triggers, arguments, and roles). In this
paper, we represent each event record in a unique tuple format that contains
trigger phrase, trigger type, argument phrase, and corresponding role
information. Our proposed pointer network-based encoder-decoder model generates
an event tuple in each time step by exploiting the interactions among event
participants and presenting a truly end-to-end solution to the EE task. We
evaluate our model on the ACE2005 dataset, and experimental results demonstrate
the effectiveness of our model by achieving competitive performance compared to
the state-of-the-art methods.
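The tuple format described in the abstract (trigger phrase, trigger type, argument phrase, role) could be represented roughly as follows. This is a minimal sketch, not the PESE implementation: the class name `EventTuple`, the helper `group_by_trigger`, and the example sentence content are hypothetical, and the actual model emits such tuples through pointer networks rather than from a hand-written list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EventTuple:
    """One event record: trigger phrase, trigger type, argument phrase, role."""
    trigger: str
    trigger_type: str
    argument: str
    role: str


def group_by_trigger(tuples):
    """Collect (argument, role) pairs under each (trigger, trigger_type) key."""
    events = defaultdict(list)
    for t in tuples:
        events[(t.trigger, t.trigger_type)].append((t.argument, t.role))
    return dict(events)


# Illustrative tuples for a made-up sentence; labels are examples only.
decoded = [
    EventTuple("fired", "Personnel:End-Position", "the company", "Entity"),
    EventTuple("fired", "Personnel:End-Position", "its CEO", "Person"),
]
structure = group_by_trigger(decoded)
```

Emitting one full tuple per decoding step keeps the trigger, argument, and role together, which is the property the abstract contrasts with pipelines that identify substructures separately.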
- …