Disparity and Optical Flow Partitioning Using Extended Potts Priors
This paper addresses the problems of disparity and optical flow partitioning
based on the brightness invariance assumption. We investigate new variational
approaches to these problems with Potts priors and possibly box constraints.
For the optical flow partitioning, our model includes vector-valued data and an
adapted Potts regularizer. Using the notion of asymptotically level stable
functions we prove the existence of global minimizers of our functionals. We
propose a modified alternating direction method of multipliers. This iterative
algorithm requires the computation of global minimizers of classical univariate
Potts problems which can be done efficiently by dynamic programming. We prove
that the algorithm converges for both the constrained and the unconstrained
problems. Numerical examples demonstrate the strong performance of our
partitioning method.
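The univariate Potts subproblems at the heart of the ADMM iteration can indeed be solved exactly by dynamic programming. Below is a minimal sketch (our own naming; scalar data with an L2 fidelity term, without the paper's vector-valued extension or the surrounding ADMM loop):

```python
import numpy as np

def potts_1d(f, gamma):
    """Exact minimizer of the univariate Potts functional
    sum_i (u_i - f_i)^2 + gamma * #(jumps of u), via the classical
    O(n^2) dynamic program over the start of the last segment."""
    n = len(f)
    s1 = np.concatenate(([0.0], np.cumsum(f)))             # prefix sums
    s2 = np.concatenate(([0.0], np.cumsum(np.square(f))))  # prefix sums of squares

    def dev(l, r):
        # Squared deviation of f[l-1:r] from its mean, O(1) via prefix sums.
        m = r - l + 1
        return (s2[r] - s2[l - 1]) - (s1[r] - s1[l - 1]) ** 2 / m

    best = np.empty(n + 1)              # best[r]: optimal value on first r samples
    start = np.zeros(n + 1, dtype=int)  # start[r]: where the last segment begins
    best[0] = -gamma                    # cancels the first segment's jump penalty
    for r in range(1, n + 1):
        cands = [best[l - 1] + gamma + dev(l, r) for l in range(1, r + 1)]
        start[r] = int(np.argmin(cands)) + 1
        best[r] = cands[start[r] - 1]

    u = np.empty(n)                     # backtrack, filling segments with means
    r = n
    while r > 0:
        l = start[r]
        u[l - 1:r] = (s1[r] - s1[l - 1]) / (r - l + 1)
        r = l - 1
    return u
```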
Object-based 2D-to-3D video conversion for effective stereoscopic content generation in 3D-TV applications
Three-dimensional television (3D-TV) has gained increasing popularity in the broadcasting domain, as it enables enhanced viewing experiences in comparison to conventional two-dimensional (2D) TV. However, its application has been constrained due to the lack of essential content, i.e., stereoscopic videos. To alleviate this content shortage, an economical and practical solution is to reuse the huge media resources that are available in monoscopic 2D and convert them to stereoscopic 3D. Although stereoscopic video can be generated from monoscopic sequences using depth measurements extracted from cues like focus blur, motion and size, the quality of the resulting video may be poor, as such measurements are usually arbitrarily defined and appear inconsistent with the real scenes. To help solve this problem, a novel method for object-based stereoscopic video generation is proposed which features i) optical-flow based occlusion reasoning in determining depth ordinal, ii) object segmentation using improved region-growing from masks of determined depth layers, and iii) a hybrid depth estimation scheme using content-based matching (inside a small library of true stereo image pairs) and depth-ordinal based regularization. Comprehensive experiments have validated the effectiveness of our proposed 2D-to-3D conversion method in generating stereoscopic videos of consistent depth measurements for 3D-TV applications.
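To make the conversion principle concrete, the following is a minimal sketch of the depth-image-based rendering step that turns one frame plus a depth map into a stereoscopic pair via depth-proportional horizontal shifts. It illustrates the general idea only, not the authors' object-based pipeline; the function name, the depth convention, and the naive hole/occlusion handling are our assumptions.

```python
import numpy as np

def render_stereo_pair(frame, depth, max_disparity=16):
    """Synthesize a left/right view pair from a monoscopic frame (H, W, 3)
    and a per-pixel depth map in [0, 1] (1 = near) by shifting each pixel
    horizontally in proportion to its depth. Disocclusion holes stay zero
    (real systems inpaint them); when two pixels land on the same column,
    the later write wins, a deliberately naive occlusion rule."""
    h, w = depth.shape
    disp = np.rint(depth * max_disparity / 2).astype(int)
    left = np.zeros_like(frame)
    right = np.zeros_like(frame)
    cols = np.arange(w)
    for y in range(h):
        left[y, np.clip(cols + disp[y], 0, w - 1)] = frame[y]
        right[y, np.clip(cols - disp[y], 0, w - 1)] = frame[y]
    return left, right
```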
Anytime Stereo Image Depth Estimation on Mobile Devices
Many applications of stereo depth estimation in robotics require the
generation of accurate disparity maps in real time under significant
computational constraints. Current state-of-the-art algorithms force a choice
between generating accurate mappings at a slow pace or quickly generating
inaccurate ones; additionally, these methods typically require
far too many parameters to be usable on power- or memory-constrained devices.
Motivated by these shortcomings, we propose a novel approach for disparity
prediction in the anytime setting. In contrast to prior work, our end-to-end
learned approach can trade off computation and accuracy at inference time.
Depth estimation is performed in stages, during which the model can be queried
at any time to output its current best estimate. Our final model can process
1242×375 resolution images within a range of 10-35 FPS on an NVIDIA
Jetson TX2 module with only marginal increases in error -- using two orders of
magnitude fewer parameters than the most competitive baseline. The source code
is available at https://github.com/mileyan/AnyNet. Comment: Accepted by ICRA 2019.
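The anytime contract ("queried at any time to output its current best estimate") can be phrased generically as staged refinement under a deadline. A sketch under our own assumptions (the stage callables are hypothetical; this is not AnyNet's architecture, only the inference-time trade-off it exposes):

```python
import time

def anytime_estimate(stages, budget_s, x):
    """Run coarse-to-fine refinement stages until the time budget is spent,
    returning the most recent estimate. Each stage is a callable taking
    (input, previous_estimate) and returning a refined estimate."""
    deadline = time.monotonic() + budget_s
    estimate = None
    for stage in stages:
        estimate = stage(x, estimate)
        if time.monotonic() >= deadline:
            break  # later, more accurate stages are skipped under tight budgets
    return estimate
```

Ordering the stages by increasing cost and accuracy yields the kind of 10-35 FPS operating range described above: a tight budget returns the coarse estimate, a loose one the refined map.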
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
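Since each event carries exactly the payload described above (time, pixel location, and sign of the brightness change), a common first step for frame-based downstream processing is to accumulate a time slice of events into a signed histogram. A minimal sketch; the (t, x, y, polarity) array layout is our assumption:

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate a chunk of events into a signed 2D histogram. `events` is
    an (N, 4) array of (t, x, y, polarity) rows with polarity in {-1, +1};
    each event increments or decrements its pixel count."""
    frame = np.zeros((height, width), dtype=np.int32)
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3].astype(int)
    np.add.at(frame, (y, x), p)  # unbuffered add: repeated pixels accumulate
    return frame
```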
Estimation of vector fields in unconstrained and inequality constrained variational problems for segmentation and registration
Vector fields arise in many problems of computer vision, particularly in non-rigid registration. In this paper, we develop coupled partial differential equations (PDEs) to estimate vector fields that define the deformation between
objects, and the contour or surface that defines the segmentation of the objects as well. We also explore the utility of inequality constraints applied to variational problems in vision such as estimation of deformation fields in non-rigid registration and tracking. To solve inequality constrained vector
field estimation problems, we apply tools from the Kuhn-Tucker theorem in optimization theory. Our technique differs from recently popular joint segmentation and registration algorithms, particularly in its coupled set of PDEs derived from the same set of energy terms for registration and
segmentation. We present both the theory and results that demonstrate our approach.
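As a toy illustration of inequality-constrained estimation (not the paper's coupled PDE scheme), the sketch below fits a 1-D displacement field to a linearized brightness-constancy term with a smoothness penalty, enforcing a pointwise bound |v| <= c by projection; the clipped samples at convergence are exactly those whose Kuhn-Tucker multipliers are active.

```python
import numpy as np

def constrained_displacement(i0, i1, bound=1.0, alpha=0.1, tau=0.05, iters=300):
    """Projected gradient descent on
        E(v) = 0.5 * sum (i0 + v * i0' - i1)^2 + 0.5 * alpha * sum (v')^2
    subject to |v| <= bound pointwise; a 1-D toy version of constrained
    deformation-field estimation."""
    g = np.gradient(i0)                # spatial derivative of i0
    v = np.zeros_like(i0, dtype=float)
    for _ in range(iters):
        data = (i0 + v * g - i1) * g   # gradient of the brightness term
        smooth = -alpha * np.gradient(np.gradient(v))  # approx. -alpha * v''
        v -= tau * (data + smooth)
        v = np.clip(v, -bound, bound)  # projection onto the feasible set
    return v
```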
MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation
In this work, we propose a novel and efficient method for articulated human
pose estimation in videos using a convolutional network architecture, which
incorporates both color and motion features. We propose a new human body pose
dataset, FLIC-motion, that extends the FLIC dataset with additional motion
features. We apply our architecture to this dataset and report significantly
better performance than current state-of-the-art pose detection systems.
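The input construction implied above (color and motion cues feeding a single convolutional network) is, generically, channel concatenation. A minimal sketch; the flow normalization is our assumption, and the actual MoDeep architecture and FLIC-motion preprocessing are not reproduced here:

```python
import numpy as np

def build_network_input(rgb, flow):
    """Stack an RGB frame (H, W, 3) and a dense optical-flow field (H, W, 2)
    into a single (H, W, 5) tensor so one convolutional network can see both
    color and motion features."""
    flow = flow / (np.abs(flow).max() + 1e-8)    # crude scaling of flow to ~[-1, 1]
    return np.concatenate([rgb, flow], axis=-1)  # channels-last network input
```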