Joint Learning of Intrinsic Images and Semantic Segmentation
Semantic segmentation of outdoor scenes is problematic when there are
variations in imaging conditions. It is known that albedo (reflectance) is
invariant to all kinds of illumination effects. Thus, using reflectance images
for the semantic segmentation task can be favorable. Additionally, not only may
segmentation benefit from reflectance, but segmentation may also be useful
for reflectance computation. Therefore, in this paper, the tasks of semantic
segmentation and intrinsic image decomposition are considered as a combined
process by exploring their mutual relationship in a joint fashion. To that end,
we propose a supervised end-to-end CNN architecture to jointly learn intrinsic
image decomposition and semantic segmentation. We analyze the gains of
addressing those two problems jointly. Moreover, new cascade CNN architectures
for intrinsic-for-segmentation and segmentation-for-intrinsic are proposed as
single tasks. Furthermore, a dataset of 35K synthetic images of natural
environments is created with corresponding albedo and shading (intrinsics), as
well as semantic labels (segmentation) assigned to each object/scene. The
experiments show that joint learning of intrinsic image decomposition and
semantic segmentation is beneficial for both tasks for natural scenes. Dataset
and models are available at: https://ivi.fnwi.uva.nl/cv/intrinseg
Comment: ECCV 201
Biometric presentation attack detection: beyond the visible spectrum
The increased need for unattended authentication in
multiple scenarios has motivated a wide deployment of biometric
systems in the last few years. This has in turn led to the
disclosure of security concerns specifically related to biometric
systems. Among them, presentation attacks (PAs, i.e., attempts
to log into the system with a fake biometric characteristic or
presentation attack instrument) pose a severe threat to the
security of the system: any person could eventually fabricate
or order a gummy finger or face mask to impersonate someone
else. In this context, we present a novel fingerprint presentation
attack detection (PAD) scheme based on i) a new capture device
able to acquire images within the short wave infrared (SWIR)
spectrum, and ii) an in-depth analysis of several state-of-the-art
techniques based on both handcrafted and deep learning
features. The approach is evaluated on a database comprising
over 4700 samples, stemming from 562 different subjects and
35 different presentation attack instrument (PAI) species. The
results show the soundness of the proposed approach with a
detection equal error rate (D-EER) as low as 1.35% even in a
realistic scenario where five different PAI species are considered
only for testing purposes (i.e., unknown attacks
The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots
Deep networks have brought significant advances in robot perception,
improving the capabilities of robots in several visual tasks, ranging from
object detection and recognition to pose estimation, semantic scene
segmentation and many others. Still, most approaches typically address visual
tasks in isolation, resulting in overspecialized models which achieve strong
performances in specific applications but work poorly in other (often related)
tasks. This is clearly sub-optimal for a robot which is often required to
perform multiple visual recognition tasks simultaneously in order to properly
act and interact with the environment. This problem is exacerbated by the
limited computational and memory resources typically available onboard a
robotic platform. The problem of learning flexible models which can handle
multiple tasks in a lightweight manner has recently gained attention in the
computer vision community and benchmarks supporting this research have been
proposed. In this work we study this problem in the robot vision context,
proposing a new benchmark, the RGB-D Triathlon, and evaluating state-of-the-art
algorithms in this novel challenging scenario. We also define a new evaluation
protocol, better suited to the robot vision setting. Results shed light on the
strengths and weaknesses of existing approaches and on open issues, suggesting
directions for future research.
Comment: This work has been submitted to IROS/RAL 201