Search CORE

840 research outputs found

Learning sparse representations of depth

Author: Culpepper Benjamin J.
Olshausen Bruno A.
Tosic Ivana
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/04/2011
Field of study

This paper introduces a new method for learning and inferring sparse representations of depth (disparity) maps. The proposed algorithm relaxes the usual assumption of the stationary noise model in sparse coding. This enables learning from data corrupted with spatially varying noise or uncertainty, typically obtained by laser range scanners or structured light depth cameras. Sparse representations are learned from the Middlebury database disparity maps and then exploited in a two-layer graphical model for inferring depth from stereo, by including a sparsity prior on the learned features. Since they capture higher-order dependencies in the depth structure, these priors can complement smoothness priors commonly used in depth inference based on Markov Random Field (MRF) models. Inference on the proposed graph is achieved using an alternating iterative optimization technique, where the first layer is solved using an existing MRF-based stereo matching algorithm, then held fixed as the second layer is solved using the proposed non-stationary sparse coding algorithm. This leads to a general method for improving solutions of state of the art MRF-based depth estimation algorithms. Our experimental results first show that depth inference using learned representations leads to state of the art denoising of depth maps obtained from laser range scanners and a time of flight camera. Furthermore, we show that adding sparse priors improves the results of two depth estimation methods: the classical graph cut algorithm by Boykov et al. and the more recent algorithm of Woodford et al.Comment: 12 page

arXiv.org e-Print Archive

Crossref

GazeStereo3D: seamless disparity manipulations

Author: Didyk Piotr
Hefeeda Mohamed M.
Kellnhofer Petr
Matusik Wojciech
Myszkowski Karol
Seidel Hans-Peter
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

Producing a high quality stereoscopic impression on current displays is a challenging task. The content has to be carefully prepared in order to maintain visual comfort, which typically affects the quality of depth reproduction. In this work, we show that this problem can be significantly alleviated when the eye fixation regions can be roughly estimated. We propose a new method for stereoscopic depth adjustment that utilizes eye tracking or other gaze prediction information. The key idea that distinguishes our approach from the previous work is to apply gradual depth adjustments at the eye fixation stage, so that they remain unnoticeable. To this end, we measure the limits imposed on the speed of disparity changes in various depth adjustment scenarios, and formulate a new model that can guide such seamless stereoscopic content processing. Based on this model, we propose a real-time controller that applies local manipulations to stereoscopic content to find the optimum between depth reproduction and visual comfort. We show that the controller is mostly immune to the limitations of low-cost eye tracking solutions. We also demonstrate benefits of our model in off-line applications, such as stereoscopic movie production, where skillful directors can reliably guide and predict viewers' attention or where attended image regions are identified during eye tracking sessions. We validate both our model and the controller in a series of user experiments. They show significant improvements in depth perception without sacrificing the visual quality when our techniques are applied

DSpace@MIT

Crossref

MPG.PuRe

A hierarchical system for a distributed representation of the peripersonal space of a humanoid robot

Author: Antonelli Marco
Beuth Frederik
Canessa Andrea
Chessa Manuela
Chinellato Eris
Del Pobil Angel P.
Duran Angel J.
Gibaldi Agostino
Hamker Fred
Sabatini Silvio P.
Solari Fabio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014
Field of study

Reaching a target object in an unknown and unstructured environment is easily performed by human beings. However, designing a humanoid robot that executes the same task requires the implementation of complex abilities, such as identifying the target in the visual field, estimating its spatial location, and precisely driving the motors of the arm to reach it. While research usually tackles the development of such abilities singularly, in this work we integrate a number of computational models into a unified framework, and demonstrate in a humanoid torso the feasibility of an integrated working representation of its peripersonal space. To achieve this goal, we propose a cognitive architecture that connects several models inspired by neural circuits of the visual, frontal and posterior parietal cortices of the brain. The outcome of the integration process is a system that allows the robot to create its internal model and its representation of the surrounding space by interacting with the environment directly, through a mutual adaptation of perception and action. The robot is eventually capable of executing a set of tasks, such as recognizing, gazing and reaching target objects, which can work separately or cooperate for supporting more structured and effective behaviors

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

Middlesex University Research Repository

Archivio istituzionale della ricerca - Università di Genova

A hierarchical system for a distributed representation of the peripersonal space of a humanoid robot

Author: Antonelli M.
Antonelli M.
Beuth F.
Beuth F.
Canessa A.
Canessa A.
Chessa M.
Chessa M.
Chinellato E.
Chinellato E.
Del Pobil A.
Del Pobil A.
Duran A.
Duran A.
Gibaldi A.
Gibaldi A.
Hamker F.
Hamker F.
Sabatini S.
Sabatini S.
Solari F.
Solari F.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014
Field of study

Middlesex University Research Repository

General Dynamic Scene Reconstruction from Multiple View Video

Author: Guillemaut Jean-Yves
Hilton Adrian
Kim Hansung
Mustafa Armin
Publication venue
Publication date: 30/09/2015
Field of study

This paper introduces a general approach to dynamic scene reconstruction from multiple moving cameras without prior knowledge or limiting constraints on the scene structure, appearance, or illumination. Existing techniques for dynamic scene reconstruction from multiple wide-baseline camera views primarily focus on accurate reconstruction in controlled environments, where the cameras are fixed and calibrated and background is known. These approaches are not robust for general dynamic scenes captured with sparse moving cameras. Previous approaches for outdoor dynamic scene reconstruction assume prior knowledge of the static background appearance and structure. The primary contributions of this paper are twofold: an automatic method for initial coarse dynamic scene segmentation and reconstruction without prior knowledge of background appearance or structure; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes from multiple wide-baseline static or moving cameras. Evaluation is performed on a variety of indoor and outdoor scenes with cluttered backgrounds and multiple dynamic non-rigid objects such as people. Comparison with state-of-the-art approaches demonstrates improved accuracy in both multiple view segmentation and dense reconstruction. The proposed approach also eliminates the requirement for prior knowledge of scene structure and appearance

arXiv.org e-Print Archive

Crossref

Southampton (e-Prints Soton)

University of Surrey

Surrey Research Insight

Comfort-driven disparity adjustment for stereoscopic video

Author: Liang Jun-Bang
Martin Ralph Robert
Wang Miao
Zhang Song-Hai
Zhang Xi-Jin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2016
Field of study

Pixel disparity—the offset of corresponding pixels between left and right views—is a crucial parameter in stereoscopic three-dimensional (S3D) video, as it determines the depth perceived by the human visual system (HVS). Unsuitable pixel disparity distribution throughout an S3D video may lead to visual discomfort. We present a unified and extensible stereoscopic video disparity adjustment framework which improves the viewing experience for an S3D video by keeping the perceived 3D appearance as unchanged as possible while minimizing discomfort. We first analyse disparity and motion attributes of S3D video in general, then derive a wide-ranging visual discomfort metric from existing perceptual comfort models. An objective function based on this metric is used as the basis of a hierarchical optimisation method to find a disparity mapping function for each input video frame. Warping-based disparity manipulation is then applied to the input video to generate the output video, using the desired disparity mappings as constraints. Our comfort metric takes into account disparity range, motion, and stereoscopic window violation; the framework could easily be extended to use further visual comfort models. We demonstrate the power of our approach using both animated cartoons and real S3D videos

Crossref

Online Research @ Cardiff

Springer - Publisher Connector

Recommended from our members

Sensorimotor embedding : a developmental approach to learning geometry

Author: Stober Jeremy Michael
Publication venue
Publication date: 03/09/2015
Field of study

textA human infant facing the blooming, buzzing confusion of the senses grows up to be an adult with common-sense knowledge of geometry; this knowledge then allows her to describe the shapes of objects, the layouts of places, and the relative locations of things naturally and effortlessly. In robotics, such knowledge is usually built in by a human designer who needs to solve complex engineering problems of sensor calibration and inference. In contrast, this dissertation presents a model for how autonomous agents can form an understanding of geometry the same way infants do: by learning from early unstructured sensorimotor experience. Through a framework called sensorimotor embedding, an agent reconstructs knowledge of its own sensor structure, the local geometry of the world, and the pose of objects within the world. The validity of this knowledge is demonstrated directly through Procrustes analysis and indirectly by using it to solve the mountain car task with different morphologies. The dissertation demonstrates how sensorimotor embedding can serve as a robust approach for acquiring geometric knowledge.Computer Science

Texas ScholarWorks

Recommended from our members

Spatial Form as Inherently Three Dimensional

Author: Tyler C.W.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 22/03/2012
Field of study

Visual processing is mainly to achieve proper perception of objects. The visual system breaks down sensory information into the discrete elements of the object. This chapter deals with the binding problem, i.e. the process of recombining local sources of object information, and how objects are defined through surface representation and interpolation of object shape with a generic depth map. The issue of transparency and concerns on surface reconstruction are also discussed

City Research Online