Object detection via a multi-region & semantic segmentation-aware CNN model
We propose an object detection system that relies on a multi-region deep
convolutional neural network (CNN) that also encodes semantic
segmentation-aware features. The resulting CNN-based representation aims at
capturing a diverse set of discriminative appearance factors and exhibits
localization sensitivity that is essential for accurate object localization. We
exploit the above properties of our recognition module by integrating it into an
iterative localization mechanism that alternates between scoring a box proposal
and refining its location with a deep CNN regression model. Thanks to the
efficient use of our modules, we detect objects with very high localization
accuracy. On the detection challenges of PASCAL VOC2007 and PASCAL VOC2012 we
achieve mAP of 78.2% and 73.9% respectively, surpassing any other published
work by a significant margin.
Comment: Extended technical report -- short version to appear at ICCV 201
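The alternation between scoring and box refinement described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `score_fn` and `regress_fn` are hypothetical stand-ins for the recognition CNN and the CNN regression model, the score-threshold pruning and the default of two iterations are assumptions made for illustration, and `iou` is only a toy scorer for trying the loop out.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; used here only as a toy scorer."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iterative_localization(
    proposals: List[Box],
    score_fn: Callable[[Box], float],   # hypothetical: recognition model score
    regress_fn: Callable[[Box], Box],   # hypothetical: box regression model
    num_iters: int = 2,                 # assumed value, not from the abstract
    min_score: float = 0.0,             # assumed pruning threshold
) -> List[Tuple[Box, float]]:
    """Alternate between scoring boxes and refining their locations."""
    boxes = list(proposals)
    for _ in range(num_iters):
        # Scoring step: drop boxes the scorer rates at or below the threshold.
        boxes = [b for b in boxes if score_fn(b) > min_score]
        # Refinement step: let the regressor move each surviving box.
        boxes = [regress_fn(b) for b in boxes]
    # Score the refined boxes once more so each result carries a final score.
    return [(b, score_fn(b)) for b in boxes]
```

With a toy regressor that moves each box halfway toward a known target and `iou` against that target as the scorer, the returned score rises with each iteration, which mirrors the localization behaviour the abstract describes.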
POL-LWIR Vehicle Detection: Convolutional Neural Networks Meet Polarised Infrared Sensors
For vehicle autonomy, driver assistance and situational awareness, it is
necessary to operate at day and night, and in all weather conditions. In
particular, long wave infrared (LWIR) sensors that receive predominantly
emitted radiation have the capability to operate at night as well as during the
day. In this work, we employ a polarised LWIR (POL-LWIR) camera to acquire data
from a mobile vehicle, to compare and contrast four different convolutional
neural network (CNN) configurations to detect other vehicles in video
sequences. We evaluate two distinct and promising approaches, two-stage
detection (Faster-RCNN) and one-stage detection (SSD), in four different
configurations. We also employ two different image decompositions: the first
based on the polarisation ellipse and the second on the Stokes parameters
themselves. To evaluate our approach, the experimental trials were quantified
by mean average precision (mAP) and processing time, showing a clear trade-off
between the two factors. For example, the best mAP result of 80.94% was
achieved using Faster-RCNN, but at a frame rate of 6.4 fps. In contrast,
MobileNet SSD achieved only 64.51% mAP, but at 53.4 fps.
Comment: Computer Vision and Pattern Recognition Workshop 201
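The two image decompositions mentioned in this abstract can be related with the standard textbook formulas for linear polarisation. The sketch below assumes intensities measured behind polarisers at 0, 45, 90 and 135 degrees, which the abstract does not state, and takes "polarisation ellipse" to mean intensity, degree of linear polarisation and angle of polarisation. The function names are illustrative and do not come from the paper.

```python
import math
from typing import Tuple


def linear_stokes(i0: float, i45: float, i90: float, i135: float) -> Tuple[float, float, float]:
    """Linear Stokes parameters (S0, S1, S2) from four polariser intensities."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # +45 vs. -45 degree component
    return s0, s1, s2


def ellipse_params(s0: float, s1: float, s2: float) -> Tuple[float, float, float]:
    """Intensity, degree of linear polarisation and angle of polarisation (radians)."""
    dolp = math.hypot(s1, s2) / s0 if s0 > 0 else 0.0
    aop = 0.5 * math.atan2(s2, s1)
    return s0, dolp, aop
```

Applied per pixel, either triple (S0, S1, S2) or (intensity, degree, angle) could be stacked as a three-channel image for a detector, though how the authors prepare their network inputs is not specified here.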
Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category.
- …