Orientation Analysis in 4D Light Fields
This work concerns the analysis of 4D light fields. In the context of this work, a light field is a series of 2D digital images of a scene captured on a planar, regular grid of camera positions. It is essential that the scene is captured from several camera positions with constant spacing, which yields a sampling of the light rays emitted by each scene point as a function of camera position. In contrast to traditional images, which measure light intensity only in the spatial domain, this approach additionally captures directional information, leading to the four dimensions mentioned above.
For image processing, light fields are a relatively new research area. In computer graphics, they were used to avoid the work-intensive modeling of 3D geometry: view interpolation provides interactive 3D experiences without explicit geometry. The intention of this work is the reverse, namely using light fields to reconstruct the geometry of a captured scene. The motivation is that light fields provide much richer information than the inputs of existing approaches to 3D reconstruction. Due to the regular and dense sampling of the scene, material properties are imaged alongside the geometry. Surfaces whose visual appearance changes with the line of sight cause problems for known approaches to passive 3D reconstruction; light fields instead sample this change in appearance and thus make its analysis possible.
This thesis makes several contributions. We propose a new approach to convert raw data from a light field camera (plenoptic camera 2.0) into a 4D representation without pre-computing pixel-wise depth. This representation, also called the Lumigraph, gives access to epipolar plane images, which are 2D sub-spaces of the 4D data structure. We propose an approach that analyzes these epipolar plane images to achieve robust depth estimation on Lambertian surfaces, and, building on it, an extension that also handles reflective and transparent surfaces. As examples of the usefulness of this inherently available depth information, we show improvements to well-known techniques such as super-resolution and object segmentation when extending them to light fields. Additionally, a benchmark database was established during the research for this thesis. We test the proposed approaches on this database and hope that it helps drive future research in this field.
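The orientation analysis behind this depth estimation can be illustrated with a small sketch. For a Lambertian scene point, the epipolar plane image (EPI) intensity satisfies I(s, u) = f(u - d*s), so the line slope d (the disparity) is recoverable from image gradients. The thesis's estimator is structure-tensor based; the minimal least-squares version below, with the hypothetical helper `epi_disparity` and a synthetic EPI, only illustrates the principle:

```python
import numpy as np

def epi_disparity(epi):
    """Least-squares disparity from an epipolar plane image (EPI).

    Hypothetical helper for illustration only: for a Lambertian point,
    I(s, u) = f(u - d*s), hence dI/ds = -d * dI/du, and over a textured
    region d = -<Is*Iu> / <Iu*Iu>.
    """
    Is, Iu = np.gradient(epi.astype(float))   # derivatives along s (views) and u (pixels)
    Is, Iu = Is[1:-1, 1:-1], Iu[1:-1, 1:-1]   # drop one-sided border estimates
    return -np.sum(Is * Iu) / np.sum(Iu * Iu)

# synthetic EPI: a smooth 1-D pattern sheared across 9 views with disparity 1
pattern = np.sin(0.4 * np.arange(64))
epi = np.stack([np.roll(pattern, s) for s in range(9)])
print(round(epi_disparity(epi), 3))  # prints 1.0
```

On textured Lambertian regions this gradient ratio agrees with the dominant structure-tensor orientation; the tensor formulation additionally provides local smoothing and a confidence measure.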
Robotic Manipulation under Transparency and Translucency from Light-field Sensing
From frosted windows to plastic containers to refractive fluids, transparency and translucency are prevalent in human environments. The material properties of translucent objects challenge many of the assumptions made in robotic perception. For example, the most common RGB-D sensors rely on sensing an infrared structured pattern reflected from Lambertian surfaces; as such, transparent and translucent objects often remain invisible to robot perception. Methods that enable robots to correctly perceive and then interact with such environments would therefore be highly beneficial. Light-field (or plenoptic) cameras, which capture both the direction and the intensity of light, make it possible to perceive visual cues on transparent and translucent objects. In this dissertation, we explore the inference of transparent and translucent objects from plenoptic observations for robotic perception and manipulation. We propose a novel plenoptic descriptor, the Depth Likelihood Volume (DLV), that incorporates plenoptic observations to represent the depth of a pixel as a distribution rather than a single value. Building on the DLV, we present the Plenoptic Monte Carlo Localization algorithm, PMCL, as a generative method to infer 6-DoF poses of objects in settings with translucency. PMCL is able to localize both isolated transparent objects and opaque objects behind translucent objects using a DLV computed from a single-view plenoptic observation. The uncertainty that transparency and translucency induce in pose estimation grows greatly as scenes become more cluttered. For this scenario, we propose GlassLoc to localize feasible grasp poses directly from local DLV features. In GlassLoc, a convolutional neural network is introduced to learn DLV features for classifying grasp poses with grasping confidence. GlassLoc also suppresses reflectance across multi-view plenoptic observations, which leads to a more stable DLV representation.
We evaluate GlassLoc in the context of a pick-and-place task for transparent tableware in a cluttered tabletop environment. We further observe that transparent and translucent objects generate distinguishable features in light-field epipolar plane images. With this insight, we propose Light-field Inference of Transparency, LIT, a two-stage generative-discriminative refractive object localization approach. In the discriminative stage, LIT uses convolutional neural networks to learn reflection and distortion features from photorealistically rendered light-field images. The learned features guide generative object location inference through local depth estimation and particle optimization. We compare LIT with four state-of-the-art pose estimators to demonstrate its efficacy in the transparent object localization task. We also perform a robot demonstration, building a champagne tower using the LIT pipeline.
PhD, Robotics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/169707/1/zhezhou_1.pd
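The core idea of representing a pixel's depth as a distribution rather than a single value can be sketched in a few lines. The toy function `depth_likelihood_volume` below is a hypothetical construction, not the dissertation's DLV: it turns per-hypothesis matching costs between a reference view and shifted side views into a per-pixel likelihood over disparities via a softmax.

```python
import numpy as np

def depth_likelihood_volume(views, shifts, disparities, temperature=1.0):
    """Toy DLV-style representation (hypothetical, not the published DLV):
    keep a per-pixel likelihood over disparity hypotheses instead of a
    single depth.  The likelihood is a softmax over negative matching
    costs between the reference view and side views aligned by d * baseline.
    """
    ref = views[0]
    costs = np.zeros((len(disparities),) + ref.shape)
    for k, d in enumerate(disparities):
        for view, s in zip(views[1:], shifts[1:]):
            aligned = np.roll(view, int(round(d * s)), axis=1)  # undo hypothesized shift
            costs[k] += np.abs(ref - aligned)
    logits = -costs / temperature
    logits -= logits.max(axis=0, keepdims=True)   # numerical stability
    lik = np.exp(logits)
    return lik / lik.sum(axis=0, keepdims=True)   # per-pixel distribution over disparities

# synthetic horizontal camera row with true disparity 2
rng = np.random.default_rng(0)
ref = rng.random((8, 32))
views = [ref, np.roll(ref, -2, axis=1), np.roll(ref, -4, axis=1)]
lik = depth_likelihood_volume(views, shifts=[0, 1, 2], disparities=[0, 1, 2, 3])
print((lik.argmax(axis=0) == 2).all())  # prints True: mass peaks at the true disparity
```

Keeping the full distribution, rather than its argmax, is what lets downstream inference (PMCL, GlassLoc) reason about the multi-modal depth ambiguity that transparency creates.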
Surface camera (SCAM) Light Field Rendering
In this article we present a new variant of the light field representation that supports improved image reconstruction by accommodating sparse correspondence information. This places our representation somewhere between a pure two-plane parameterized light field and a Lumigraph representation with its continuous geometric proxy. Our approach factors the rays of a light field into one of two separate classes. All rays consistent with a given correspondence are implicitly represented using a new auxiliary data structure, which we call a surface camera, or scam. The remaining rays of the light field are represented using a standard two-plane parameterized light field. We present an efficient rendering algorithm that combines ray samples from scams with those from the light field. The resulting image reconstructions are noticeably improved over those of a pure light field.
Engineering and Applied Science
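The ray factoring described above can be illustrated under a simple two-plane geometry: a ray (s, t, u, v) passes through (s, t) on the camera plane z = 0 and (u, v) on the image plane z = 1, so at depth z it sits at (s + z*(u - s), t + z*(v - t)). Rays passing near a corresponded 3-D point would be routed to that point's scam. The helper `factor_rays` below is a hypothetical sketch of this classification, not the paper's data structure:

```python
import numpy as np

def factor_rays(rays, point, tol=1e-3):
    """Split two-plane parameterized rays (s, t, u, v) into those consistent
    with a 3-D point correspondence (scam rays) and the remainder.

    Hypothetical sketch: a ray through (s, t, 0) and (u, v, 1) is at
    (s + z*(u - s), t + z*(v - t)) at depth z.
    """
    x, y, z = point
    s, t, u, v = rays.T
    px = s + z * (u - s)          # ray position at the point's depth
    py = t + z * (v - t)
    hit = (np.abs(px - x) < tol) & (np.abs(py - y) < tol)
    return rays[hit], rays[~hit]

# three rays from cameras at s = 0, 1, 2 all aimed at the point (1, 0, 0.5),
# plus one ray that misses it
rays = np.array([[0., 0., 2., 0.],
                 [1., 0., 1., 0.],
                 [2., 0., 0., 0.],
                 [0., 0., 0., 0.]])
scam_rays, rest = factor_rays(rays, (1.0, 0.0, 0.5))
print(len(scam_rays), len(rest))  # prints "3 1"
```

In the article's terms, the first return value corresponds to rays implicitly represented by a scam, while the second stays in the standard two-plane light field.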