Seeing Tree Structure from Vibration
Humans recognize object structure from both their appearance and motion;
often, motion helps to resolve ambiguities in object structure that arise when
we observe object appearance only. There are particular scenarios, however,
where neither appearance nor spatial-temporal motion signals are informative:
occluding twigs may look connected and have almost identical movements, though
they belong to different, possibly disconnected branches. We propose to tackle
this problem through spectrum analysis of motion signals, because vibrations of
disconnected branches, though visually similar, often have distinctive natural
frequencies. We propose a novel formulation of tree structure based on a
physics-based link model, and validate its effectiveness by theoretical
analysis, numerical simulation, and empirical experiments. With this
formulation, we use nonparametric Bayesian inference to reconstruct tree
structure from both spectral vibration signals and appearance cues. Our model
performs well in recognizing hierarchical tree structure from real-world videos
of trees and vessels. Comment: ECCV 2018. The first two authors contributed equally to this work.
Project page: http://tree.csail.mit.edu
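The spectral idea in this abstract can be illustrated with a small sketch. It is not the authors' method (which uses a physics-based link model and nonparametric Bayesian inference); it only shows how two motion traces that both look like simple swaying can be told apart by their dominant frequency. The function names, the 30 fps sample rate, and the 2 Hz and 3 Hz test frequencies are assumptions made for illustration.

```python
import cmath
import math


def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the largest non-DC DFT peak.

    Uses a naive O(n^2) DFT so the sketch needs only the standard library.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    best_k, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate / n


def synthetic_motion(freq_hz, sample_rate, n_samples):
    """Hypothetical displacement trace of one tracked point on a twig."""
    return [
        math.sin(2 * math.pi * freq_hz * t / sample_rate)
        for t in range(n_samples)
    ]


# Assumed setup: 4 seconds of 30 fps video, two twigs with different
# natural frequencies (values chosen only for the example).
sample_rate = 30.0
n_samples = 120
twig_a = synthetic_motion(2.0, sample_rate, n_samples)
twig_b = synthetic_motion(3.0, sample_rate, n_samples)

freq_a = dominant_frequency(twig_a, sample_rate)
freq_b = dominant_frequency(twig_b, sample_rate)
print(freq_a, freq_b)
```

In this toy setting the two traces yield different spectral peaks, which is the kind of cue the abstract proposes to combine with appearance when deciding which twigs belong to the same branch.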
Coherent multi-dimensional segmentation of multiview images using a variational framework and applications to image based rendering
Image Based Rendering (IBR), and in particular light field rendering, has attracted a lot of
attention for interpolating new viewpoints from a set of multiview images. New images of
a scene are interpolated directly from nearby available ones, thus enabling a photorealistic
rendering. Sampling theory for light fields has shown that exact geometric information
in the scene is often unnecessary for rendering new views. Indeed, the band of the function
is approximately limited and new views can be rendered using classical interpolation
methods. However, IBR using undersampled light fields suffers from aliasing effects and
is difficult particularly when the scene has large depth variations and occlusions. In order
to deal with these cases, we study two approaches:
New sampling schemes have recently emerged that are able to perfectly reconstruct
certain classes of parametric signals that are not bandlimited but characterized by a finite
number of parameters. In this context, we derive novel sampling schemes for piecewise
sinusoidal and polynomial signals. In particular, we show that a piecewise sinusoidal signal
with arbitrarily high frequencies can be exactly recovered given certain conditions. These
results are applied to parametric multiview data that are not bandlimited.
We also focus on the problem of extracting regions (or layers) in multiview images
that can be individually rendered free of aliasing. The problem is posed in a multidimensional
variational framework using region competition. Extending previous
methods, layers are treated as multi-dimensional hypervolumes. Therefore, the segmentation
is done jointly over all the images and coherence is imposed throughout the
data. However, instead of propagating active hypersurfaces, we derive a semi-parametric
methodology that takes into account the constraints imposed by the camera setup and the
occlusion ordering. The resulting framework is a global multi-dimensional region competition that is consistent in all the images and efficiently handles occlusions. We show the
validity of the approach with captured light fields. Other special effects such as augmented
reality and disocclusion of hidden objects are also demonstrated.
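The aliasing problem mentioned in this abstract can be illustrated with a one-dimensional analogy. The sketch below is not taken from the work itself: it only shows that a sinusoid sampled below its Nyquist rate produces exactly the same samples as a lower-frequency one, which is the effect that undersampled light fields suffer from when camera spacing is too coarse. The function names and the 7 Hz and 10 Hz values are assumptions for illustration.

```python
import math


def aliased_frequency(freq_hz, sample_rate):
    """Apparent frequency (Hz) of a real sinusoid after uniform sampling."""
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)


def sample_cosine(freq_hz, sample_rate, n_samples):
    """Sample cos(2*pi*f*t) at t = k / sample_rate for k = 0..n_samples-1."""
    return [
        math.cos(2 * math.pi * freq_hz * k / sample_rate)
        for k in range(n_samples)
    ]


# Assumed example: a 7 Hz signal sampled at 10 Hz (Nyquist limit is 5 Hz).
true_freq = 7.0
sample_rate = 10.0
apparent = aliased_frequency(true_freq, sample_rate)

# The undersampled 7 Hz samples match those of the aliased frequency.
undersampled = sample_cosine(true_freq, sample_rate, 20)
alias_samples = sample_cosine(apparent, sample_rate, 20)
max_diff = max(abs(a - b) for a, b in zip(undersampled, alias_samples))
print(apparent, max_diff)
```

Once the samples are indistinguishable, no interpolation filter can recover the original signal, which is why the abstract turns to parametric sampling schemes and layer extraction instead.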
An object-based approach to image/video-based synthesis and processing for 3-D and multiview televisions
This paper proposes an object-based approach to a class of dynamic image-based representations called "plenoptic videos," where the plenoptic video sequences are segmented into image-based rendering (IBR) objects each with its image sequence, depth map, and other relevant information such as shape and alpha information. This allows desirable functionalities such as scalability of contents, error resilience, and interactivity with individual IBR objects to be supported. Moreover, the rendering quality in scenes with large depth variations can also be improved considerably. A portable capturing system consisting of two linear camera arrays was developed to verify the proposed approach. An important step in the object-based approach is to segment the objects in video streams into layers or IBR objects. To reduce the time for segmenting plenoptic videos under the semiautomatic technique, a new object tracking method based on the level-set method is proposed. Due to possible segmentation errors around object boundaries, natural matting with a Bayesian approach is also incorporated into our system. Furthermore, extensions of conventional image processing algorithms to these IBR objects are studied and illustrated with examples. Experimental results are given to illustrate the efficiency of the tracking, matting, rendering, and processing algorithms under the proposed object-based framework. © 2009 IEEE.
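The role of the alpha information carried by each IBR object can be sketched with the standard compositing equation C = alpha * F + (1 - alpha) * B. The code below is a minimal illustration under that assumption, not the paper's rendering pipeline; the function names and the back-to-front layer ordering are hypothetical.

```python
def composite_pixel(alpha, foreground, background):
    """Blend one RGB pixel: C = alpha * F + (1 - alpha) * B."""
    return tuple(
        alpha * f + (1.0 - alpha) * b
        for f, b in zip(foreground, background)
    )


def composite_layers(layers, background):
    """Composite (alpha, color) layers given in back-to-front order."""
    result = background
    for alpha, color in layers:
        result = composite_pixel(alpha, color, result)
    return result


# Assumed example: a half-transparent red boundary pixel over blue.
red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)
blended = composite_pixel(0.5, red, blue)

# A fully opaque front layer hides everything behind it.
stacked = composite_layers([(0.5, red), (1.0, (0.0, 1.0, 0.0))], blue)
print(blended, stacked)
```

Fractional alpha values near object boundaries are what matting estimates, so that a segmented object can be placed over a new background without hard, jagged edges.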
The effect of transparency on recognition of overlapping objects
Are overlapping objects easier to recognize when the objects are transparent or opaque? It is important to know whether the transparency of X-ray images of luggage contributes to the difficulty in searching those images for targets. Transparency provides extra information about objects that would normally be occluded but creates potentially ambiguous depth relations at the region of overlap. Two experiments investigated the threshold durations at which adult participants could accurately name pairs of overlapping objects that were opaque or transparent. In Experiment 1, the transparent displays included monocular cues to relative depth. Recognition of the back object was possible at shorter durations for transparent displays than for opaque displays. In Experiment 2, the transparent displays had no monocular depth cues. There was no difference in the duration at which the back object was recognized across transparent and opaque displays. The results of the two experiments suggest that transparent displays, even though less familiar than opaque displays, do not make object recognition more difficult, and possibly show a benefit. These findings call into question the importance of edge junctions in object recognition.
From Stereogram to Surface: How the Brain Sees the World in Depth
When we look at a scene, how do we consciously see surfaces infused with lightness and color at the correct depths? Random Dot Stereograms (RDS) probe how binocular disparity between the two eyes can generate such conscious surface percepts. Dense RDS do so despite the fact that they include multiple false binocular matches. Sparse stereograms do so even across large contrast-free regions with no binocular matches. Stereograms that define occluding and occluded surfaces lead to surface percepts wherein partially occluded textured surfaces are completed behind occluding textured surfaces at a spatial scale much larger than that of the texture elements themselves. Earlier models suggest how the brain detects binocular disparity, but not how RDS generate conscious percepts of 3D surfaces. A neural model predicts how the layered circuits of visual cortex generate these 3D surface percepts using interactions between visual boundary and surface representations that obey complementary computational rules. Air Force Office of Scientific Research (F49620-01-1-0397); National Science Foundation (EIA-01-30851, SBE-0354378); Office of Naval Research (N00014-01-1-0624)