26,539 research outputs found
Crowdsourcing Semantic Label Propagation in Relation Classification
Distant supervision is a popular method for performing relation extraction
from text that is known to produce noisy labels. Most progress in relation
extraction and classification has been made with crowdsourced corrections to
distant-supervised labels, and there is evidence that indicates still more
would be better. In this paper, we explore the problem of propagating human
annotation signals gathered for open-domain relation classification through the
CrowdTruth methodology for crowdsourcing, that captures ambiguity in
annotations by measuring inter-annotator disagreement. Our approach propagates
annotations to sentences that are similar in a low dimensional embedding space,
expanding the number of labels by two orders of magnitude. Our experiments show
significant improvement in a sentence-level multi-class relation classifier.Comment: In publication at the First Workshop on Fact Extraction and
Verification (FeVer) at EMNLP 201
Query Resolution for Conversational Search with Limited Supervision
In this work we focus on multi-turn passage retrieval as a crucial component
of conversational search. One of the key challenges in multi-turn passage
retrieval comes from the fact that the current turn query is often
underspecified due to zero anaphora, topic change, or topic return. Context
from the conversational history can be used to arrive at a better expression of
the current turn query, defined as the task of query resolution. In this paper,
we model the query resolution task as a binary term classification problem: for
each term appearing in the previous turns of the conversation decide whether to
add it to the current turn query or not. We propose QuReTeC (Query Resolution
by Term Classification), a neural query resolution model based on bidirectional
transformers. We propose a distant supervision method to automatically generate
training data by using query-passage relevance labels. Such labels are often
readily available in a collection either as human annotations or inferred from
user interactions. We show that QuReTeC outperforms state-of-the-art models,
and furthermore, that our distant supervision method can be used to
substantially reduce the amount of human-curated data required to train
QuReTeC. We incorporate QuReTeC in a multi-turn, multi-stage passage retrieval
architecture and demonstrate its effectiveness on the TREC CAsT dataset.Comment: SIGIR 2020 full conference pape
Attention-Based LSTM for Psychological Stress Detection from Spoken Language Using Distant Supervision
We propose a Long Short-Term Memory (LSTM) with attention mechanism to
classify psychological stress from self-conducted interview transcriptions. We
apply distant supervision by automatically labeling tweets based on their
hashtag content, which complements and expands the size of our corpus. This
additional data is used to initialize the model parameters, and which it is
fine-tuned using the interview data. This improves the model's robustness,
especially by expanding the vocabulary size. The bidirectional LSTM model with
attention is found to be the best model in terms of accuracy (74.1%) and
f-score (74.3%). Furthermore, we show that distant supervision fine-tuning
enhances the model's performance by 1.6% accuracy and 2.1% f-score. The
attention mechanism helps the model to select informative words.Comment: Accepted in ICASSP 201
Spike detection using the continuous wavelet transform
This paper combines wavelet transforms with basic detection theory to develop a new unsupervised method for robustly detecting and localizing spikes in noisy neural recordings. The method does not require the construction of templates, or the supervised setting of thresholds. We present extensive Monte Carlo simulations, based on actual extracellular recordings, to show that this technique surpasses other commonly used methods in a wide variety of recording conditions. We further demonstrate that falsely detected spikes corresponding to our method resemble actual spikes more than the false positives of other techniques such as amplitude thresholding. Moreover, the simplicity of the method allows for nearly real-time execution
- …