Anytime Stereo Image Depth Estimation on Mobile Devices
Many applications of stereo depth estimation in robotics require the
generation of accurate disparity maps in real time under significant
computational constraints. Current state-of-the-art algorithms force a choice
between either generating accurate mappings at a slow pace, or quickly
generating inaccurate ones, and additionally these methods typically require
far too many parameters to be usable on power- or memory-constrained devices.
Motivated by these shortcomings, we propose a novel approach for disparity
prediction in the anytime setting. In contrast to prior work, our end-to-end
learned approach can trade off computation and accuracy at inference time.
Depth estimation is performed in stages, during which the model can be queried
at any time to output its current best estimate. Our final model can process
1242×375 resolution images within a range of 10-35 FPS on an NVIDIA
Jetson TX2 module with only marginal increases in error -- using two orders of
magnitude fewer parameters than the most competitive baseline. The source code
is available at https://github.com/mileyan/AnyNet. Comment: Accepted by ICRA201
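The anytime behaviour described in this abstract (estimation in stages, with the current best estimate available whenever it is requested) can be illustrated with a minimal sketch. This is not the AnyNet implementation: the function name `anytime_estimate`, the `stages` callables, and the `deadline_s` budget are all hypothetical, and each stage is assumed to take the previous estimate and return a refined one.

```python
import time
from typing import Any, Callable, Iterable, Optional

# Hypothetical stage type: takes the previous estimate (None for the first
# stage) and returns a refined estimate.
Stage = Callable[[Optional[Any]], Any]


def anytime_estimate(
    stages: Iterable[Stage],
    deadline_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[Any]:
    """Run refinement stages in order and return the latest estimate.

    The first stage always runs so that some estimate exists. Before each
    later stage, the elapsed time is compared with the budget; once the
    budget is used up, the remaining stages are skipped.
    """
    start = clock()
    estimate = None
    for stage in stages:
        if estimate is not None and clock() - start >= deadline_s:
            break  # out of time: keep the estimate we already have
        estimate = stage(estimate)
    return estimate
```

In this sketch a smaller `deadline_s` yields a coarser result sooner and a larger one lets more refinement stages run, which is the computation-versus-accuracy trade-off the abstract refers to. The `clock` parameter exists only so the behaviour can be checked deterministically.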
Real-time single image depth perception in the wild with handheld devices
Depth perception is paramount to tackle real-world problems, ranging from
autonomous driving to consumer applications. For the latter, depth estimation
from a single image represents the most versatile solution, since a standard
camera is available on almost any handheld device. Nonetheless, two main issues
limit its practical deployment: i) the low reliability when deployed
in-the-wild and ii) the demanding resource requirements to achieve real-time
performance, often not compatible with such devices. Therefore, in this paper,
we investigate these issues in depth, showing that both can be addressed by
adopting appropriate network design and training strategies -- also outlining
how to map the resulting networks on handheld devices to achieve real-time
performance. Our thorough evaluation highlights the ability of such fast
networks to generalize well to new environments, a crucial feature required to
tackle the extremely varied contexts faced in real applications. Indeed, to
further support this evidence, we report experimental results concerning
real-time depth-aware augmented reality and image blurring with smartphones
in-the-wild. Comment: 11 pages, 9 figures
Pushing the efficiency of StereoNet: exploiting spatial sparsity
Current CNN-based stereo matching methods have demonstrated superior performance compared to traditional stereo matching methods. However, mapping these algorithms onto embedded devices, which have limited compute resources, while achieving high performance is challenging because of the high computational complexity of CNN-based methods. The recently proposed StereoNet network achieves disparity estimation with reduced complexity without greatly degrading performance. To push this performance-to-complexity trade-off further, we propose an optimization of StereoNet that adapts the computation to the input data: it steers computation to the regions of the input that would benefit from the CNN-based stereo matching algorithm, while the rest of the input is processed by a traditional, less computationally demanding method. Key to the proposed methodology is a lightweight CNN that predicts how important refining a region of the input is to the quality of the final disparity map. This allows the system to trade off computational complexity against disparity error on demand, enabling the application of these methods to embedded systems with real-time requirements.
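The region-steering idea in this abstract can be sketched as a budgeted selection: given one importance score per region, only the highest-scoring regions go to the expensive path and the rest go to the cheap one. This is an illustrative sketch, not the authors' method. The names `select_regions` and `process_regions` are hypothetical, the scores are plain numbers standing in for the output of the lightweight CNN, and `cheap` and `expensive` are placeholder callables for the traditional and CNN-based matchers.

```python
from typing import Callable, List, Sequence, TypeVar

R = TypeVar("R")  # a region of the input (hypothetical representation)
D = TypeVar("D")  # a per-region disparity result


def select_regions(importance: Sequence[float], budget: int) -> List[int]:
    """Return the indices of the `budget` highest-scoring regions.

    Indices are returned in ascending order. A non-positive budget
    selects nothing; a budget larger than the input selects everything.
    """
    if budget <= 0:
        return []
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:budget])


def process_regions(
    regions: Sequence[R],
    importance: Sequence[float],
    budget: int,
    cheap: Callable[[R], D],
    expensive: Callable[[R], D],
) -> List[D]:
    """Apply `expensive` to the selected regions and `cheap` to the rest."""
    chosen = set(select_regions(importance, budget))
    return [expensive(r) if i in chosen else cheap(r)
            for i, r in enumerate(regions)]
```

Raising `budget` sends more regions through the expensive path, which corresponds to spending more computation for lower disparity error; lowering it does the opposite.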
Portable and Scalable In-vehicle Laboratory Instrumentation for the Design of i-ADAS
According to the WHO (World Health Organization), worldwide deaths from injuries are projected to rise from 5.1 million in 1990 to 8.4 million in 2020, with traffic-related incidents as the major cause of this increase. Intelligent, Advanced Driving Assistance Systems (i-ADAS) provide a number of solutions to these safety challenges. We developed a scalable in-vehicle mobile i-ADAS research platform for traffic context analysis and behavioral prediction, designed for understanding fundamental issues in intelligent vehicles. We outline our approach and describe the in-vehicle instrumentation.
2T-UNET: A Two-Tower UNet with Depth Clues for Robust Stereo Depth Estimation
Stereo correspondence matching is an essential part of the multi-step stereo
depth estimation process. This paper revisits the depth estimation problem,
avoiding the explicit stereo matching step using a simple two-tower
convolutional neural network. The proposed algorithm is called 2T-UNet. The
idea behind 2T-UNet is to replace cost volume construction with twin
convolution towers, which are allowed to have different weights. Additionally,
the inputs to the twin encoders in 2T-UNet differ from those of existing stereo
methods. Generally, a stereo network takes a right and left image pair as input
to determine the scene geometry. However, in the 2T-UNet model, the right
stereo image is taken as one input and the left stereo image, along with its
monocular depth clue information, is taken as the other input. Depth clues
provide complementary suggestions that help enhance the quality of predicted
scene geometry. The 2T-UNet surpasses state-of-the-art monocular and stereo
depth estimation methods on the challenging Scene Flow dataset, both
quantitatively and qualitatively. The architecture performs well on complex
natural scenes, highlighting its usefulness for various real-time applications.
Pretrained weights and code will be made readily available.
Multimedia delivery in the future internet
The term “Networked Media” implies that all kinds of media including text, image, 3D graphics, audio
and video are produced, distributed, shared, managed and consumed on-line through various networks,
like the Internet, Fiber, WiFi, WiMAX, GPRS, 3G and so on, in a convergent manner [1]. This white
paper is the contribution of the Media Delivery Platform (MDP) cluster and aims to cover the networking
challenges of Networked Media in the transition to the Future of the Internet.
The Internet has evolved and changed the way we work and live. End users of the Internet have been confronted
with a bewildering range of media, services and applications and of technological innovations concerning
media formats, wireless networks, terminal types and capabilities. And there is little evidence that the pace
of this innovation is slowing. Today, over one billion users access the Internet on a regular basis, more
than 100 million users have downloaded at least one (multi)media file and over 47 million of them do so
regularly, searching in more than 160 exabytes of content. In the near future these numbers are expected
to rise exponentially. Internet content is expected to increase by at least a factor of 6, rising
to more than 990 exabytes before 2012, fuelled mainly by the users themselves. Moreover, it is envisaged
that in a near- to mid-term future, the Internet will provide the means to share and distribute (new)
multimedia content and services with superior quality and striking flexibility, in a trusted and personalized
way, improving citizens’ quality of life, working conditions, edutainment and safety.
In this evolving environment, new transport protocols, new multimedia encoding schemes, cross-layer
in-the-network adaptation, machine-to-machine communication (including RFIDs), rich 3D content as well as
community networks and the use of peer-to-peer (P2P) overlays are expected to generate new models of
interaction and cooperation, and be able to support enhanced perceived quality-of-experience (PQoE) and
innovative applications “on the move”, like virtual collaboration environments, personalised services/
media, virtual sport groups, on-line gaming, edutainment. In this context, the interaction with content
combined with interactive/multimedia search capabilities across distributed repositories, opportunistic P2P
networks and the dynamic adaptation to the characteristics of diverse mobile terminals are expected to
contribute towards such a vision.
Based on work that has taken place in a number of EC co-funded projects in Framework Program 6 (FP6)
and Framework Program 7 (FP7), a group of experts and technology visionaries has voluntarily
contributed to this white paper, which aims to describe the status, the state of the art, the challenges and
the way ahead in the area of Content Aware media delivery platforms.