81,934 research outputs found
Knowledge Base Population using Semantic Label Propagation
A crucial aspect of a knowledge base population system that extracts new
facts from text corpora, is the generation of training data for its relation
extractors. In this paper, we present a method that maximizes the effectiveness
of newly trained relation extractors at a minimal annotation cost. Manual
labeling can be significantly reduced by Distant Supervision, which is a method
to construct training data automatically by aligning a large text corpus with
an existing knowledge base of known facts. For example, all sentences
mentioning both 'Barack Obama' and 'US' may serve as positive training
instances for the relation born_in(subject,object). However, distant
supervision typically results in a highly noisy training set: many training
sentences do not really express the intended relation. We propose to combine
distant supervision with minimal manual supervision in a technique called
feature labeling, to eliminate noise from the large and noisy initial training
set, resulting in a significant increase of precision. We further improve on
this approach by introducing the Semantic Label Propagation method, which uses
the similarity between low-dimensional representations of candidate training
instances, to extend the training set in order to increase recall while
maintaining high precision. Our proposed strategy for generating training data
is studied and evaluated on an established test collection designed for
knowledge base population tasks. The experimental results show that the
Semantic Label Propagation strategy leads to substantial performance gains when
compared to existing approaches, while requiring an almost negligible manual
annotation effort.Comment: Submitted to Knowledge Based Systems, special issue on Knowledge
Bases for Natural Language Processin
ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models
Deep neural networks (DNNs) are one of the most prominent technologies of our
time, as they achieve state-of-the-art performance in many machine learning
tasks, including but not limited to image classification, text mining, and
speech processing. However, recent research on DNNs has indicated
ever-increasing concern on the robustness to adversarial examples, especially
for security-critical tasks such as traffic sign identification for autonomous
driving. Studies have unveiled the vulnerability of a well-trained DNN by
demonstrating the ability of generating barely noticeable (to both human and
machines) adversarial images that lead to misclassification. Furthermore,
researchers have shown that these adversarial images are highly transferable by
simply training and attacking a substitute model built upon the target model,
known as a black-box attack to DNNs.
Similar to the setting of training substitute models, in this paper we
propose an effective black-box attack that also only has access to the input
(images) and the output (confidence scores) of a targeted DNN. However,
different from leveraging attack transferability from substitute models, we
propose zeroth order optimization (ZOO) based attacks to directly estimate the
gradients of the targeted DNN for generating adversarial examples. We use
zeroth order stochastic coordinate descent along with dimension reduction,
hierarchical attack and importance sampling techniques to efficiently attack
black-box models. By exploiting zeroth order optimization, improved attacks to
the targeted DNN can be accomplished, sparing the need for training substitute
models and avoiding the loss in attack transferability. Experimental results on
MNIST, CIFAR10 and ImageNet show that the proposed ZOO attack is as effective
as the state-of-the-art white-box attack and significantly outperforms existing
black-box attacks via substitute models.Comment: Accepted by 10th ACM Workshop on Artificial Intelligence and Security
(AISEC) with the 24th ACM Conference on Computer and Communications Security
(CCS
- …