Pairwise Representation Learning for Event Coreference
Natural Language Processing tasks such as resolving the coreference of events
require understanding the relations between two text snippets. These tasks are
typically formulated as (binary) classification problems over independently
induced representations of the text snippets. In this work, we develop a
Pairwise Representation Learning (PairwiseRL) scheme for the event mention
pairs, in which we jointly encode a pair of text snippets so that the
representation of each mention in the pair is induced in the context of the
other one. Furthermore, our representation supports a finer, structured
representation of the text snippet to facilitate encoding events and their
arguments. We show that PairwiseRL, despite its simplicity, outperforms the
prior state-of-the-art event coreference systems on both cross-document and
within-document event coreference benchmarks. We also conduct an in-depth
analysis of the improvements and limitations of the pairwise representation
so as to provide insights for future work.
Comment: 8 pages. *SEM 202
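A minimal sketch of the contrast between independently induced and pairwise representations, assuming a BERT-style input format with [CLS]/[SEP] markers and hypothetical <t>...</t> trigger tags; the function names and markers are illustrative assumptions, not the paper's actual implementation:

```python
# Sketch: independent vs. pairwise (joint) input construction for event
# coreference. In the pairwise form, both snippets share one sequence, so
# each mention's representation can be induced in the context of the other.

def independent_inputs(snippet_a, snippet_b):
    """Each snippet is encoded on its own, blind to the other."""
    return [f"[CLS] {snippet_a} [SEP]", f"[CLS] {snippet_b} [SEP]"]

def pairwise_input(snippet_a, snippet_b, trigger_a, trigger_b):
    """Both snippets in one sequence; event triggers are marked with
    hypothetical <t>...</t> tags so their spans can be pooled later."""
    marked_a = snippet_a.replace(trigger_a, f"<t> {trigger_a} </t>")
    marked_b = snippet_b.replace(trigger_b, f"<t> {trigger_b} </t>")
    return f"[CLS] {marked_a} [SEP] {marked_b} [SEP]"

pair = pairwise_input("A bomb exploded in the market.",
                      "The blast killed three people.",
                      "exploded", "blast")
```

A downstream binary classifier would then score this single joint sequence instead of two separately encoded ones.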
Improved Neural Relation Detection for Knowledge Base Question Answering
Relation detection is a core component for many NLP applications including
Knowledge Base Question Answering (KBQA). In this paper, we propose a
hierarchical recurrent neural network enhanced by residual learning that
detects KB relations given an input question. Our method uses deep residual
bidirectional LSTMs to compare questions and relation names via different
hierarchies of abstraction. Additionally, we propose a simple KBQA system that
integrates entity linking with our proposed relation detector so that each
component can enhance the other. Experimental results show that our approach
not only achieves outstanding relation detection performance but, more
importantly, also helps our KBQA system achieve state-of-the-art accuracy on
both single-relation (SimpleQuestions) and multi-relation (WebQSP) QA
benchmarks.
Comment: Accepted by ACL 2017 (updated for camera-ready)
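As a toy stand-in for the residual BiLSTM matcher, the sketch below compares a question against candidate relation names at the word level (splitting names like "place_of_birth" into words, one of the granularities the paper's hierarchy covers) using bag-of-words cosine similarity; all function names and the similarity choice are illustrative assumptions, not the paper's architecture:

```python
import math
from collections import Counter

def cosine(c1, c2):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(c1[t] * c2[t] for t in c1)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def score_relation(question, relation):
    # Word-level view of the relation name: "place_of_birth" -> place of birth
    word_view = Counter(relation.replace("_", " ").split())
    q_view = Counter(question.lower().split())
    return cosine(q_view, word_view)

def detect_relation(question, candidates):
    """Return the candidate KB relation that best matches the question."""
    return max(candidates, key=lambda r: score_relation(question, r))

best = detect_relation("what is the place of birth of obama",
                       ["place_of_birth", "date_of_birth", "spouse"])
```

The real model replaces these bag-of-words views with deep residual bidirectional LSTM encodings compared at multiple levels of abstraction.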
Event Linking: Grounding Event Mentions to Wikipedia
Comprehending an article requires understanding its constituent events.
However, the context where an event is mentioned often lacks the details of
this event. A question arises: how can the reader obtain more knowledge about
this particular event in addition to what is provided by the local context in
the article?
This work defines Event Linking, a new natural language understanding task at
the event level. Event linking tries to link an event mention appearing in an
article to the most appropriate Wikipedia page. This page is expected to
provide rich knowledge about what the event mention refers to. To standardize
the research in this new direction, we make four contributions. First, to our
knowledge, this is the first work in the community to formally define the
Event Linking task.
Second, we collect a dataset for this new task. Specifically, we automatically
gather a training set from Wikipedia and then create two evaluation sets: one
from the Wikipedia domain, reporting the in-domain performance, and a second
from the real-world news domain, to evaluate out-of-domain performance. Third,
we retrain and evaluate two state-of-the-art (SOTA) entity linking models,
showing the challenges of event linking, and we propose an event-specific
linking system EVELINK to set a competitive result for the new task. Fourth, we
conduct a detailed and insightful analysis to help understand the task and
the limitations of the current models. Overall, as our analysis shows, Event
Linking is a considerably challenging and essential task that requires more
effort from the community. Data and code are available at:
https://github.com/CogComp/event-linking
Comment: 9 pages, 9 tables, 1 figure
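A deliberately simple sketch of the linking step: score each candidate Wikipedia page title by token overlap with the event mention's context and return the best match. Real systems such as EVELINK use learned representations; this overlap heuristic and the example candidates are purely illustrative:

```python
def link_event(mention_context, candidate_pages):
    """Link an event mention to the candidate Wikipedia page whose title
    shares the most tokens with the mention's surrounding context."""
    ctx = set(mention_context.lower().split())

    def overlap(title):
        # Wikipedia titles use underscores between words.
        return len(ctx & set(title.lower().replace("_", " ").split()))

    return max(candidate_pages, key=overlap)

page = link_event(
    "The 2011 earthquake off the coast of Japan triggered a tsunami",
    ["2011_Tohoku_earthquake_and_tsunami",
     "1906_San_Francisco_earthquake",
     "Japan"])
```

This also illustrates why the task is hard: surface overlap alone cannot separate an event page from a related entity page ("Japan") when the context is sparse.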
Tri-Attention: Explicit Context-Aware Attention Mechanism for Natural Language Processing
In natural language processing (NLP), the context of a word or sentence plays
an essential role. Contextual information such as the semantic representation
of a passage or historical dialogue forms an essential part of a conversation
and a precise understanding of the present phrase or sentence. However,
standard attention mechanisms typically generate weights from the query and
key alone and ignore context, forming a Bi-Attention framework; despite its
great success in modeling sequence alignment, this Bi-Attention mechanism
does not explicitly model the interactions between the contexts, queries, and
keys of target sequences, missing important contextual information and
resulting in poor attention performance. Accordingly, a novel and general
triple-attention (Tri-Attention) framework is proposed that expands the
standard Bi-Attention mechanism and explicitly models interactions among
queries, keys, and context by incorporating context as a third dimension in
calculating relevance scores. Four variants of Tri-Attention are derived by
expanding the two-dimensional vector-based additive, dot-product, scaled
dot-product, and bilinear operations in Bi-Attention to tensor operations.
Extensive experiments on three NLP tasks demonstrate that Tri-Attention
outperforms about 30 state-of-the-art non-attention, standard Bi-Attention,
and contextual Bi-Attention approaches, as well as pretrained neural language
models.
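As a rough illustration of the idea, the sketch below scores each key against the query with context entering the relevance score as a third factor, using an elementwise trilinear form (equivalent to a tensor variant with a diagonal weight tensor). The diagonal simplification and function names are assumptions for illustration, not the paper's exact formulation:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def tri_attention_scores(query, keys, context):
    """Attention weights where context is a third factor in each relevance
    score: score_i = sum_d q_d * k_{i,d} * c_d. Bi-Attention would use only
    sum_d q_d * k_{i,d}, leaving context out of the score entirely."""
    scores = [sum(q * k * c for q, k, c in zip(query, key, context))
              for key in keys]
    return softmax(scores)

weights = tri_attention_scores(query=[1.0, 0.0],
                               keys=[[1.0, 1.0], [0.0, 1.0]],
                               context=[2.0, 1.0])
```

The context vector reweights which dimensions of the query-key match count, which is the mechanism the Bi-Attention baselines lack.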
Growth of Large Domain Epitaxial Graphene on the C-Face of SiC
Growth of epitaxial graphene on the C-face of SiC has been investigated.
Using a confinement controlled sublimation (CCS) method, we have achieved well
controlled growth and been able to observe propagation of uniform monolayer
graphene. Surface patterns reveal two important aspects of the growth:
carbon diffusion and a stoichiometric requirement. Moreover, a new "stepdown"
growth mode has been discovered. Via this mode, monolayer graphene domains
can reach areas of hundreds of square micrometers while, most importantly,
step bunching is avoided and the initial, uniformly stepped SiC surface is
preserved. The stepdown growth provides a possible route towards uniform
wafer-scale epitaxial graphene without compromising the initial flat surface
morphology of SiC.
Comment: 18 pages, 8 figures
Timestamp Error Detection and Estimation for PMU Data based on Linear Correlation between Relative Phase Angle and Frequency
Time synchronization is essential to synchrophasor-based applications; however, a Timestamp Error (TE) in synchrophasor data can cause application failures. This paper proposes a method for TE detection based on the linear correlation between frequency and relative phase angle. A TE changes the short-term relative phase angle from a noise-like signal into one that varies linearly with frequency. The Pearson Correlation Coefficient (PCC) is applied to measure this linear correlation and thereby detect the timestamp error. The time error is then estimated from the variation of the frequency and the relative phase angle. Case studies with actual synchrophasor data demonstrate the effectiveness of the TE detection and the excellent accuracy of the TE estimation.
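The detection rule described above can be sketched directly: compute the Pearson correlation between frequency and relative phase angle over a short window, and flag a timestamp error when its magnitude is high. The threshold value and the synthetic data below are illustrative assumptions, not the paper's settings:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def has_timestamp_error(freq, rel_angle, threshold=0.9):
    """With correct timestamps the short-term relative phase angle looks
    noise-like; a strong linear relation with frequency indicates a TE."""
    return abs(pearson(freq, rel_angle)) > threshold

# Synthetic illustration (not real PMU data):
freq = [59.98, 59.99, 60.00, 60.01, 60.02]
bad_angle = [2.0 * (f - 60.0) for f in freq]       # linear in frequency -> TE
good_angle = [0.01, -0.02, 0.015, -0.005, 0.0]     # noise-like -> no TE
```

The subsequent estimation step would then recover the time offset from the slope relating the two series.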
Learning to Select from Multiple Options
Many NLP tasks can be regarded as a selection problem from a set of options,
such as classification tasks, multi-choice question answering, etc. Textual
entailment (TE) has been shown to be the state-of-the-art (SOTA) approach to
dealing with those selection problems. TE treats input texts as premises (P),
options as hypotheses (H), then handles the selection problem by modeling (P,
H) pairwise. This formulation has two limitations: first, the pairwise
modeling is unaware of the other options, which is less intuitive since
humans often determine the best option by comparing competing candidates;
second, the inference process of pairwise TE is time-consuming, especially
when the option space is large. To address these two issues, this work first
proposes a contextualized TE model (Context-TE) that appends the other k
options as context to the current (P, H) pair. Context-TE is able to learn
more reliable decisions for H since it considers broader context. Second, we
speed up Context-TE with Parallel-TE, which learns the decisions for multiple
options simultaneously. Parallel-TE significantly improves inference speed
while keeping performance comparable to Context-TE. Our methods are evaluated
on three tasks (ultra-fine entity typing, intent detection, and multi-choice
QA) that are typical selection problems with different option-space sizes.
Experiments show our models set new SOTA performance; in particular,
Parallel-TE is k times faster than pairwise TE at inference. Our code is
publicly available at https://github.com/jiangshdd/LearningToSelect
Comment: Accepted by AAAI 202
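The three input constructions can be sketched as plain string templates, assuming a BERT-style [SEP]-separated format; the separators and function names are illustrative assumptions, not the paper's exact encoding:

```python
def pairwise_inputs(premise, options):
    """Standard TE: one (P, H) sequence per option, blind to the others."""
    return [f"{premise} [SEP] {h}" for h in options]

def context_te_inputs(premise, options):
    """Context-TE sketch: each (P, H) pair also sees the other k options,
    appended as context after a second separator."""
    inputs = []
    for i, h in enumerate(options):
        context = " ; ".join(o for j, o in enumerate(options) if j != i)
        inputs.append(f"{premise} [SEP] {h} [SEP] {context}")
    return inputs

def parallel_te_input(premise, options):
    """Parallel-TE sketch: one sequence carrying all options, so all their
    scores can come from a single forward pass instead of k passes."""
    return f"{premise} [SEP] " + " [SEP] ".join(options)

options = ["book a flight", "play music", "set an alarm"]
ctx_inputs = context_te_inputs("turn the volume up", options)
par_input = parallel_te_input("turn the volume up", options)
```

The k-times inference speedup follows directly from the last template: one encoded sequence replaces k separate (P, H) encodings.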
Non-intrusive Load Monitoring based on Self-supervised Learning
Deep learning models for non-intrusive load monitoring (NILM) tend to require
a large amount of labeled data for training. However, it is difficult to
generalize the trained models to unseen sites due to different load
characteristics and operating patterns of appliances across data sets. To
address these problems, self-supervised learning (SSL) is proposed in this
paper, in which labeled appliance-level data from the target data set or
house is not required. Initially, only the aggregate power readings from the
target data set are required to pre-train a general network via a
self-supervised pretext task that maps aggregate power sequences to derived
representations. Then, supervised downstream tasks are carried out for each
appliance category to fine-tune the pre-trained network, transferring the
features learned in the pretext task. Utilizing labeled source data sets
enables the downstream tasks to learn how each load is disaggregated by
mapping the aggregate to labels. Finally, the fine-tuned network is applied
to load disaggregation for the target sites. For validation, multiple
experimental cases are designed based on three publicly accessible data sets:
REDD, UK-DALE, and REFIT. In addition, state-of-the-art neural networks are
employed to perform the NILM task in the experiments. Based on the NILM
results across various cases, SSL generally outperforms zero-shot learning in
improving load disaggregation performance without any sub-metering data from
the target data sets.
Comment: 12 pages, 10 figures
Multi-timescale Event Detection in Nonintrusive Load Monitoring based on MDL Principle
Load event detection is the fundamental step for the event-based
non-intrusive load monitoring (NILM). However, existing event detection methods
with fixed parameters may fail to cope with the inherent multi-timescale
characteristics of events, and their detection accuracy is easily affected by
load fluctuations. In this regard, this paper extends our previously
designed two-stage event detection framework, and proposes a novel
multi-timescale event detection method based on the principle of minimum
description length (MDL). Following the completion of step-like event detection
in the first stage, a long-transient event detection scheme with
variable-length sliding window is designed for the second stage, which is
intended to provide the observation and characterization of the same event at
different time scales. Specifically, context information in the aggregate
load data is mined via motif discovery; then, based on the MDL principle,
proper observation scales are selected for different events and the
corresponding detection results are determined. In the post-processing step,
a load fluctuation location method based on voice activity detection (VAD) is
proposed to identify and remove spurious events caused by fluctuations. Based
on newly proposed evaluation metrics, comparison tests on public and private
datasets demonstrate that our method achieves higher detection accuracy and
integrity for events of various appliances across different scenarios.
Comment: 11 pages, 16 figures
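A toy version of MDL-based scale selection: for each candidate window length, charge a model cost per segment plus a data cost for the residuals around each segment's mean, and pick the window length that minimizes the total description length. The specific cost terms and the per-segment cost constant are assumptions for illustration, not the paper's formulation:

```python
import math

def description_length(signal, window):
    """Two-part MDL sketch: model cost grows with the number of segments
    (each segment contributes one parameter), data cost is the squared
    residual around each segment's mean at this observation scale."""
    n_segments = math.ceil(len(signal) / window)
    model_cost = 2.0 * n_segments  # assumed coding cost per segment parameter
    data_cost = 0.0
    for i in range(0, len(signal), window):
        seg = signal[i:i + window]
        mean = sum(seg) / len(seg)
        data_cost += sum((x - mean) ** 2 for x in seg)
    return model_cost + data_cost

def best_scale(signal, candidate_windows):
    """Choose the observation scale with the shortest total description."""
    return min(candidate_windows, key=lambda w: description_length(signal, w))

# Toy load signal with one slow (long-transient) step change
signal = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0]
```

On this toy signal, very short windows over-pay in model cost and the whole-signal window over-pays in residuals, so an intermediate scale wins, which mirrors how MDL picks a proper observation scale per event.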