Deep Neural Network and Data Augmentation Methodology for off-axis iris segmentation in wearable headsets
A data augmentation methodology is presented and applied to generate a large
dataset of off-axis iris regions and train a low-complexity deep neural
network. Although of low complexity, the resulting network achieves a high level of accuracy in iris region segmentation for challenging off-axis eye patches. Interestingly, this network is also shown to achieve high levels of performance for regular, frontal segmentation of iris regions, comparing favorably with state-of-the-art techniques of significantly higher complexity. Due to its lower complexity, this network is well suited for deployment in embedded applications such as augmented and mixed reality headsets.
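As a rough illustration of what one step of such a data augmentation pipeline might look like, the sketch below warps a frontal eye patch and its iris mask with a random homography to mimic an off-axis viewpoint. This is a minimal sketch under assumed conventions, not the paper's actual methodology; the function name and parameter values are hypothetical.

```python
# Hypothetical augmentation step: synthesize off-axis eye patches from frontal
# ones via random perspective warps. Illustrative only, not the paper's pipeline.
import numpy as np
import cv2

def random_off_axis_warp(image, mask, max_tilt=0.25, rng=None):
    """Warp an eye patch and its iris mask with a random homography."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    # Jitter each corner by up to max_tilt of the patch size to emulate
    # an off-axis camera pose.
    jitter = rng.uniform(-max_tilt, max_tilt, size=(4, 2)) * np.float32([w, h])
    dst = (src + jitter).astype(np.float32)
    H = cv2.getPerspectiveTransform(src, dst)
    warped_image = cv2.warpPerspective(image, H, (w, h), flags=cv2.INTER_LINEAR)
    # Nearest-neighbour interpolation keeps the segmentation mask binary.
    warped_mask = cv2.warpPerspective(mask, H, (w, h), flags=cv2.INTER_NEAREST)
    return warped_image, warped_mask
```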
The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping
Many tasks performed by autonomous vehicles such as road marking detection,
object tracking, and path planning are simpler in bird's-eye view. Hence,
Inverse Perspective Mapping (IPM) is often applied to remove the perspective
effect from a vehicle's front-facing camera and to remap its images into a 2D
domain, resulting in a top-down view. Unfortunately, this leads to unnatural blurring and stretching of objects at greater distances, owing to the camera's limited resolution, which restricts its applicability. In this paper, we present an
adversarial learning approach for generating a significantly improved IPM from
a single camera image in real time. The generated bird's-eye-view images
contain sharper features (e.g. road markings) and a more homogeneous
illumination, while (dynamic) objects are automatically removed from the scene,
thus revealing the underlying road layout in an improved fashion. We
demonstrate our framework using real-world data from the Oxford RobotCar
Dataset and show that scene understanding tasks directly benefit from our
boosted IPM approach.
Comment: equal contribution of first two authors, 8 full pages, 6 figures, accepted at IV 2019
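For context, the classical homography-based IPM that this work improves on can be sketched in a few lines; the four ground-plane correspondences below are assumed placeholder values that would in practice come from the camera's extrinsic calibration, not numbers from the paper. The stretching the abstract mentions arises because distant road is covered by only a few source pixels, which the warp spreads over many output pixels.

```python
# Minimal sketch of classical IPM (the baseline, not the adversarial
# "boosted" version). Point correspondences are hypothetical.
import numpy as np
import cv2

def inverse_perspective_map(frame, out_size=(400, 600)):
    """Remap a front-facing camera frame to a top-down ground-plane view."""
    w, h = out_size
    # Image pixels of a known rectangle on the road surface
    # (assumed values; derived from calibration in practice).
    src = np.float32([[420, 460], [860, 460], [1180, 720], [100, 720]])
    # Where that rectangle should land in the bird's-eye-view image.
    dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    H = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(frame, H, (w, h))
```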
FoveaBox: Beyond Anchor-based Object Detector
We present FoveaBox, an accurate, flexible, and completely anchor-free
framework for object detection. While almost all state-of-the-art object
detectors utilize predefined anchors to enumerate possible locations, scales
and aspect ratios in the search for objects, their performance and generalization ability are limited by the design of those anchors. Instead, FoveaBox directly learns the probability of object existence and the bounding-box coordinates without anchor references. This is achieved by (a) predicting category-sensitive semantic maps for the probability of object existence, and (b) producing a category-agnostic bounding box for each position that potentially contains an object. The scales of target boxes are naturally associated with feature pyramid representations. In FoveaBox, an instance is assigned to adjacent feature levels to make the model more accurate. We demonstrate its effectiveness on standard benchmarks and report extensive experimental analysis. Without bells and whistles, FoveaBox achieves state-of-the-art single-model performance on the standard COCO and Pascal VOC object detection benchmarks. More importantly, FoveaBox avoids all computation and hyper-parameters related to anchor boxes, to which final detection performance is often sensitive. We believe this simple and effective approach will serve as a solid baseline and help ease future research on object detection.
The code has been made publicly available at https://github.com/taokong/FoveaBox.
Comment: IEEE Transactions on Image Processing
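To make the anchor-free formulation concrete, the PyTorch sketch below shows a schematic detection head that emits per-position, category-sensitive class scores and one category-agnostic box per position, encoded as distances to the four box sides. It is a simplified assumption of how such a head can be wired, not the authors' implementation; see the linked repository for that.

```python
# Schematic anchor-free head: class score maps plus a per-position box,
# with no anchor enumeration. Simplified sketch, not the official code.
import torch.nn as nn

class AnchorFreeHead(nn.Module):
    def __init__(self, in_channels=256, num_classes=80):
        super().__init__()
        # (a) Category-sensitive semantic maps: one score channel per class.
        self.cls_head = nn.Conv2d(in_channels, num_classes, 3, padding=1)
        # (b) Category-agnostic box per position: (left, top, right, bottom)
        # distances from the location to the box sides.
        self.box_head = nn.Conv2d(in_channels, 4, 3, padding=1)

    def forward(self, feature_map):
        cls_scores = self.cls_head(feature_map)        # [N, C, H, W]
        box_deltas = self.box_head(feature_map).exp()  # [N, 4, H, W], positive
        return cls_scores, box_deltas
```

In a pyramid setting, one such head would typically be shared across feature levels, matching the abstract's note that target box scales are associated with pyramid representations.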
Perception Driven Texture Generation
This paper investigates a novel task of generating texture images from
perceptual descriptions. Previous work on texture generation focused on either
synthesis from examples or generation from procedural models. Generating
textures from perceptual attributes has not yet been well studied. Meanwhile, perceptual attributes such as directionality, regularity, and roughness are important factors for human observers when describing a texture. In this paper, we propose a joint deep network model that combines adversarial training and perceptual feature regression for texture generation, requiring only random noise and user-defined perceptual attributes as input. In this model, a pre-trained convolutional neural network is integrated with the adversarial framework, driving the generated textures to possess the given perceptual attributes. An important property of the proposed model is that changing one of the input perceptual attributes changes the appearance of the generated textures accordingly. We design several experiments to validate the effectiveness of the proposed method. The results show that it can produce high-quality texture images with the desired perceptual
properties.
Comment: 7 pages, 4 figures, ICME 2017
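As a hedged sketch of how adversarial training and perceptual feature regression might be combined into one objective, consider the generator loss below; `G`, `D`, `attr_net`, and the weight `lam` are hypothetical stand-ins rather than the paper's exact networks or settings.

```python
# Hypothetical joint objective: fool the discriminator while a frozen,
# pre-trained attribute regressor pulls the generated texture toward the
# user-defined perceptual attributes. Sketch only, not the paper's code.
import torch
import torch.nn.functional as F

def generator_loss(G, D, attr_net, noise, target_attrs, lam=10.0):
    # Condition the generator on noise plus the requested attributes.
    fake = G(torch.cat([noise, target_attrs], dim=1))
    # Adversarial term: non-saturating GAN loss on the discriminator logits.
    logits = D(fake)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    # Perceptual regression term: predicted attributes should match the input.
    perc = F.mse_loss(attr_net(fake), target_attrs)
    return adv + lam * perc
```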