Is Deep Learning Safe for Robot Vision? Adversarial Examples against the iCub Humanoid
Deep neural networks have been widely adopted in recent years, exhibiting
impressive performance in several application domains. It has, however, been
shown that they can be fooled by adversarial examples, i.e., images altered by
a barely-perceivable adversarial noise, carefully crafted to mislead
classification. In this work, we aim to evaluate the extent to which
robot-vision systems embodying deep-learning algorithms are vulnerable to
adversarial examples, and propose a computationally efficient countermeasure to
mitigate this threat, based on rejecting classification of anomalous inputs. We
then provide a clearer understanding of the safety properties of deep networks
through an intuitive empirical analysis, showing that the mapping learned by
such networks essentially violates the smoothness assumption of learning
algorithms. We finally discuss the main limitations of this work, including the
creation of real-world adversarial examples, and sketch promising research
directions.
Comment: Accepted for publication at the ICCV 2017 Workshop on Vision in
Practice on Autonomous Robots (ViPAR)
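The countermeasure named in the abstract above, rejecting classification of anomalous inputs, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy nearest-centroid classifier over 2-D feature vectors, and the names `fit_centroids`, `classify_with_reject`, and `max_distance` are hypothetical. The idea shown is only that an input lying far from all training data is refused rather than assigned a label.

```python
import math


def fit_centroids(features, labels):
    """Compute one mean feature vector (centroid) per class."""
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    centroids = {}
    for y, xs in by_class.items():
        dim = len(xs[0])
        centroids[y] = [sum(x[i] for x in xs) / len(xs) for i in range(dim)]
    return centroids


def classify_with_reject(x, centroids, max_distance):
    """Return the nearest class, or None if x is too far from every class.

    max_distance is a hypothetical rejection threshold; in practice it
    would be tuned on held-out data.
    """
    label, dist = min(
        ((y, math.dist(x, c)) for y, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= max_distance else None


# Toy feature vectors standing in for deep-network features (assumption).
features = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
labels = ["cup", "cup", "bottle", "bottle"]
centroids = fit_centroids(features, labels)

near = classify_with_reject([0.2, 0.4], centroids, max_distance=1.0)
far = classify_with_reject([20.0, 20.0], centroids, max_distance=1.0)
```

Here `near` receives a class label because the input sits close to the "cup" centroid, while `far` is rejected (`None`) because it lies outside the region covered by the training data.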
Discriminatively Trained Latent Ordinal Model for Video Classification
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (e.g., onset and offset phase for "smile", running and jumping for
"highjump"). The proposed model is inspired by recent work on Multiple
Instance Learning and latent SVM/HCRF -- it extends such frameworks to
approximately model the ordinal aspect of the videos. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video-based facial analysis datasets for prediction of
expression, clinical pain and intent in dyadic conversations and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method.
Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text
overlap with arXiv:1604.0150
Reliable camera motion estimation from compressed MPEG videos using machine learning approach
As an important feature in characterizing video content, camera motion has been widely applied in various multimedia and computer vision applications. A novel method for fast and reliable estimation of camera motion from MPEG videos is proposed, using a support vector machine for estimation in a regression model trained on a synthesized sequence. Experiments conducted on real sequences show that the proposed method yields much improved results in estimating camera motion while avoiding the difficulty of selecting valid macroblocks and motion vectors.
Iterative Bounding Box Annotation for Object Detection
Manual annotation of bounding boxes for object detection in digital images is
tedious and time- and resource-consuming. In this paper, we propose a
semi-automatic method for efficient bounding box annotation. The method trains
the object detector iteratively on small batches of labeled images and learns
to propose bounding boxes for the next batch, after which the human annotator
only needs to correct possible errors. We propose an experimental setup for
simulating the human actions and use it for comparing different iteration
strategies, such as the order in which the data is presented to the annotator.
We evaluate our method on three datasets and show that it can reduce the
human annotation effort significantly, saving up to 75% of total manual
annotation work.
Comment: Accepted at ICPR 202
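The iterative loop described in the abstract above (train on a small labeled batch, propose boxes for the next batch, let the annotator correct only the errors) can be sketched as a simulation. This is an illustrative toy under stated assumptions, not the paper's detector or experimental setup: the "detector" here is a hypothetical stand-in that proposes the mean of all boxes labeled so far, and the simulated annotator corrects a proposal whenever its IoU with the true box falls below a hypothetical threshold `iou_ok`.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_box(boxes):
    """Toy 'detector': propose the coordinate-wise mean of labeled boxes."""
    return tuple(sum(b[i] for b in boxes) / len(boxes) for i in range(4))


def simulate_annotation(truth_boxes, batch_size, iou_ok=0.5):
    """Count how many boxes the simulated annotator must draw or fix."""
    labeled = []
    corrections = 0
    for start in range(0, len(truth_boxes), batch_size):
        batch = truth_boxes[start:start + batch_size]
        # "Retrain" on everything labeled so far, then propose for this batch.
        proposal = mean_box(labeled) if labeled else None
        for truth in batch:
            if proposal is None or iou(proposal, truth) < iou_ok:
                corrections += 1  # annotator draws or fixes the box
        # After review, every box in the batch is correctly labeled.
        labeled.extend(batch)
    return corrections


# Ten images whose objects all sit in the same place (deliberately easy).
truth = [(0, 0, 10, 10)] * 10
manual_work = simulate_annotation(truth, batch_size=2)
```

In this deliberately easy case only the first batch needs manual boxes, since every later proposal already matches the ground truth. Real detectors and datasets would of course behave far less favorably; the sketch only shows the shape of the loop and how annotation effort can be counted.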