Knowledge Base Population using Semantic Label Propagation
A crucial aspect of a knowledge base population system that extracts new
facts from text corpora is the generation of training data for its relation
extractors. In this paper, we present a method that maximizes the effectiveness
of newly trained relation extractors at a minimal annotation cost. Manual
labeling can be significantly reduced by Distant Supervision, which is a method
to construct training data automatically by aligning a large text corpus with
an existing knowledge base of known facts. For example, all sentences
mentioning both 'Barack Obama' and 'US' may serve as positive training
instances for the relation born_in(subject,object). However, distant
supervision typically results in a highly noisy training set: many training
sentences do not really express the intended relation. We propose to combine
distant supervision with minimal manual supervision in a technique called
feature labeling, to eliminate noise from the large and noisy initial training
set, resulting in a significant increase of precision. We further improve on
this approach by introducing the Semantic Label Propagation method, which uses
the similarity between low-dimensional representations of candidate training
instances to extend the training set in order to increase recall while
maintaining high precision. Our proposed strategy for generating training data
is studied and evaluated on an established test collection designed for
knowledge base population tasks. The experimental results show that the
Semantic Label Propagation strategy leads to substantial performance gains when
compared to existing approaches, while requiring an almost negligible manual
annotation effort.
Comment: Submitted to Knowledge Based Systems, special issue on Knowledge
Bases for Natural Language Processing
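The distant supervision step described in this abstract can be illustrated with a minimal sketch. The toy knowledge base, the sentences, and the function name `distant_supervision` are assumptions made for illustration only; they are not taken from the paper, and the matching rule is plain substring search rather than real entity linking.

```python
# Toy knowledge base of known facts: (subject, relation, object).
# All facts and sentences below are illustrative assumptions.
KNOWN_FACTS = [("Barack Obama", "born_in", "US")]

SENTENCES = [
    "Barack Obama was born in the US state of Hawaii.",
    "Barack Obama returned to the US after a state visit.",
    "Angela Merkel met Barack Obama in Berlin.",
]


def distant_supervision(sentences, facts):
    """Label a sentence as a positive instance of a relation whenever it
    mentions both the subject and the object of a known fact."""
    labeled = []
    for sentence in sentences:
        for subj, relation, obj in facts:
            # Naive substring matching stands in for entity linking.
            if subj in sentence and obj in sentence:
                labeled.append((sentence, relation))
    return labeled


training_set = distant_supervision(SENTENCES, KNOWN_FACTS)
for sentence, relation in training_set:
    print(relation, "<-", sentence)
```

The second sentence is matched even though it does not express born_in, which is the kind of noise the abstract proposes to remove with feature labeling.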
Fast and Accurate Multiclass Inference for MI-BCIs Using Large Multiscale Temporal and Spectral Features
Accurate, fast, and reliable multiclass classification of
electroencephalography (EEG) signals is a challenging task towards the
development of motor imagery brain-computer interface (MI-BCI) systems. We
propose enhancements to different feature extractors, along with a support
vector machine (SVM) classifier, to simultaneously improve classification
accuracy and execution time during training and testing. We focus on the
well-known common spatial pattern (CSP) and Riemannian covariance methods, and
significantly extend these two feature extractors to multiscale temporal and
spectral cases. The multiscale CSP features achieve 73.70±15.90% (mean ±
standard deviation across 9 subjects) classification accuracy that surpasses
the state-of-the-art method [1], 70.6±14.70%, on the 4-class BCI
competition IV-2a dataset. The Riemannian covariance features outperform the
CSP by achieving 74.27±15.5% accuracy and executing 9x faster in training
and 4x faster in testing. Using more temporal windows for Riemannian features
results in 75.47±12.8% accuracy with 1.6x faster testing than CSP.
Comment: Published as a conference paper at the IEEE European Signal
Processing Conference (EUSIPCO), 201
Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition
This paper presents a self-supervised method for visual detection of the
active speaker in a multi-person spoken interaction scenario. Active speaker
detection is a fundamental prerequisite for any artificial cognitive system
attempting to acquire language in social settings. The proposed method is
intended to complement the acoustic detection of the active speaker, thus
improving the system robustness in noisy conditions. The method can detect an
arbitrary number of possibly overlapping active speakers based exclusively on
visual information about their face. Furthermore, the method does not rely on
external annotations, thus complying with cognitive development. Instead, the
method uses information from the auditory modality to support learning in the
visual domain. This paper reports an extensive evaluation of the proposed
method using a large multi-person face-to-face interaction dataset. The results
show good performance in a speaker-dependent setting. However, in a
speaker-independent setting, the proposed method yields significantly lower
performance. We believe that the proposed method represents an essential
component of any artificial cognitive system or robotic platform engaging in
social interactions.
Comment: 10 pages, IEEE Transactions on Cognitive and Developmental Systems
On Network Science and Mutual Information for Explaining Deep Neural Networks
In this paper, we present a new approach to interpret deep learning models.
By coupling mutual information with network science, we explore how information
flows through feedforward networks. We show that efficiently approximating
mutual information allows us to create an information measure that quantifies
how much information flows between any two neurons of a deep learning model. To
that end, we propose NIF, Neural Information Flow, a technique for codifying
information flow that exposes deep learning model internals and provides
feature attributions.
Comment: ICASSP 2020 (shorter version appeared at AAAI-19 Workshop on Network
Interpretability for Deep Learning)
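To make the idea of measuring information flow between two neurons concrete, here is a minimal sketch that estimates mutual information from recorded activations. It uses a simple histogram (plug-in) estimator, which is an assumption chosen for brevity and not the approximation used in NIF; the function names and the simulated activations are hypothetical.

```python
import math
import random
from collections import Counter


def binned(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against constant activations
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(a, b, n_bins=8):
    """Plug-in estimate of I(A; B) in bits from paired activation samples."""
    xa, xb = binned(a, n_bins), binned(b, n_bins)
    n = len(xa)
    pa, pb, pab = Counter(xa), Counter(xb), Counter(zip(xa, xb))
    return sum(
        (c / n) * math.log2(c * n / (pa[i] * pb[j]))
        for (i, j), c in pab.items()
    )


# Simulated activations: neuron_b depends on neuron_a, neuron_c does not.
rng = random.Random(0)
neuron_a = [rng.random() for _ in range(2000)]
neuron_b = [math.tanh(2 * a) + 0.01 * rng.gauss(0, 1) for a in neuron_a]
neuron_c = [rng.random() for _ in range(2000)]

mi_dependent = mutual_information(neuron_a, neuron_b)
mi_independent = mutual_information(neuron_a, neuron_c)
print(f"I(a; b) = {mi_dependent:.3f} bits, I(a; c) = {mi_independent:.3f} bits")
```

Scores of this kind, computed for every pair of connected neurons, could serve as edge weights in a graph of the network, though how NIF actually builds that graph is not specified in the abstract.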
Extensions of stability selection using subsamples of observations and covariates
We introduce extensions of stability selection, a method to stabilise
variable selection methods introduced by Meinshausen and Bühlmann (J R Stat
Soc 72:417-473, 2010). We propose to apply a base selection method repeatedly
to random observation subsamples and covariate subsets under scrutiny, and to
select covariates based on their selection frequency. We analyse the effects
and benefits of these extensions. Our analysis generalizes the theoretical
results of Meinshausen and Bühlmann (J R Stat Soc 72:417-473, 2010) from the
case of half-samples to subsamples of arbitrary size. We study, in a
theoretical manner, the effect of taking random covariate subsets using a
simplified score model. Finally, we validate these extensions in numerical
experiments on both synthetic and real datasets, and compare the obtained
results in detail to the original stability selection method.
Comment: accepted for publication in Statistics and Computing
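The procedure in this abstract (apply a base selector repeatedly to random observation subsamples and covariate subsets, then keep covariates by selection frequency) can be sketched as follows. Several choices here are assumptions and not taken from the paper: the base selector (top-k absolute correlation), the subsample fractions, the threshold, and normalising each covariate's frequency by the number of rounds in which it was drawn.

```python
import random


def abs_corr(x, y):
    """Absolute Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def top_k_by_correlation(X, y, columns, k):
    """Hypothetical base selector: the k columns most correlated with y."""
    scores = {j: abs_corr([row[j] for row in X], y) for j in columns}
    return set(sorted(columns, key=lambda j: scores[j], reverse=True)[:k])


def stability_selection(X, y, n_rounds=200, obs_fraction=0.5,
                        cov_fraction=0.5, k=1, threshold=0.6, seed=0):
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    n_obs = max(2, int(obs_fraction * n))
    n_cov = max(k, int(cov_fraction * p))
    drawn, selected = [0] * p, [0] * p
    for _ in range(n_rounds):
        rows = rng.sample(range(n), n_obs)  # observation subsample
        cols = rng.sample(range(p), n_cov)  # covariate subset
        X_sub = [X[i] for i in rows]
        y_sub = [y[i] for i in rows]
        for j in cols:
            drawn[j] += 1
        for j in top_k_by_correlation(X_sub, y_sub, cols, k):
            selected[j] += 1
    # Frequency of selection among the rounds where the covariate was drawn.
    freq = [s / d if d else 0.0 for s, d in zip(selected, drawn)]
    stable = [j for j, f in enumerate(freq) if f >= threshold]
    return stable, freq


# Synthetic data: only covariate 0 influences the response.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(100)]
y = [3 * row[0] + data_rng.gauss(0, 1) for row in X]

stable, freq = stability_selection(X, y)
print("stable covariates:", stable)
print("selection frequencies:", [round(f, 2) for f in freq])
```

Setting `cov_fraction=1.0` and `obs_fraction=0.5` recovers the half-sample scheme of the original stability selection method with this toy base selector.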