8,349 research outputs found
Learning to Reconstruct Shapes from Unseen Classes
From a single image, humans are able to perceive the full 3D shape of an
object by exploiting learned shape priors from everyday life. Contemporary
single-image 3D reconstruction algorithms aim to solve this task in a similar
fashion, but often end up with priors that are highly biased by training
classes. Here we present an algorithm, Generalizable Reconstruction (GenRe),
designed to capture more generic, class-agnostic shape priors. We achieve this
with an inference network and training procedure that combine 2.5D
representations of visible surfaces (depth and silhouette), spherical shape
representations of both visible and non-visible surfaces, and 3D voxel-based
representations, in a principled manner that exploits the causal structure of
how 3D shapes give rise to 2D images. Experiments demonstrate that GenRe
performs well on single-view shape reconstruction, and generalizes to diverse
novel objects from categories not seen during training.Comment: NeurIPS 2018 (Oral). The first two authors contributed equally to
this paper. Project page: http://genre.csail.mit.edu
Motion from "X" by Compensating "Y"
This paper analyzes the geometry of the visual motion estimation problem in relation to transformations of the input (images) that stabilize particular output functions such as the motion of a point, a line and a plane in the image. By casting the problem within the popular "epipolar geometry", we provide a common framework for including constraints such as point, line of plane fixation by just considering "slices" of the parameter manifold. The models we provide can be used for estimating motion from a batch using the preferred optimization techniques, or for defining dynamic filters that estimate motion from a causal sequence. We discuss methods for performing the necessary compensation by either controlling the support of the camera or by pre-processing the images. The compensation algorithms may be used also for recursively fitting a plane in 3-D both from point-features or directly from brightness. Conversely, they may be used for estimating motion relative to the plane independent of its parameters
Reducing "Structure From Motion": a General Framework for Dynamic Vision - Part 1: Modeling
The literature on recursive estimation of structure and motion from monocular image sequences comprises a large number of different models and estimation techniques. We propose a framework that allows us to derive and compare all models by following the idea of dynamical system reduction.
The "natural" dynamic model, derived by the rigidity constraint and the perspective projection, is first reduced by explicitly decoupling structure (depth) from motion. Then implicit decoupling techniques are explored, which consist of imposing that some function of the unknown parameters is held constant. By appropriately choosing such a function, not only can we account for all models seen so far in the literature, but we can also derive novel ones
Exocentric direction judgements in computer-generated displays and actual scenes
One of the most remarkable perceptual properties of common experience is that the perceived shapes of known objects are constant despite movements about them which transform their projections on the retina. This perceptual ability is one aspect of shape constancy (Thouless, 1931; Metzger, 1953; Borresen and Lichte, 1962). It requires that the viewer be able to sense and discount his or her relative position and orientation with respect to a viewed object. This discounting of relative position may be derived directly from the ranging information provided from stereopsis, from motion parallax, from vestibularly sensed rotation and translation, or from corollary information associated with voluntary movement. It is argued that: (1) errors in exocentric judgements of the azimuth of a target generated on an electronic perspective display are not viewpoint-independent, but are influenced by the specific geometry of their perspective projection; (2) elimination of binocular conflict by replacing electronic displays with actual scenes eliminates a previously reported equidistance tendency in azimuth error, but the viewpoint dependence remains; (3) the pattern of exocentrically judged azimuth error in real scenes viewed with a viewing direction depressed 22 deg and rotated + or - 22 deg with respect to a reference direction could not be explained by overestimation of the depression angle, i.e., a slant overestimation
What May Visualization Processes Optimize?
In this paper, we present an abstract model of visualization and inference
processes and describe an information-theoretic measure for optimizing such
processes. In order to obtain such an abstraction, we first examined six
classes of workflows in data analysis and visualization, and identified four
levels of typical visualization components, namely disseminative,
observational, analytical and model-developmental visualization. We noticed a
common phenomenon at different levels of visualization, that is, the
transformation of data spaces (referred to as alphabets) usually corresponds to
the reduction of maximal entropy along a workflow. Based on this observation,
we establish an information-theoretic measure of cost-benefit ratio that may be
used as a cost function for optimizing a data visualization process. To
demonstrate the validity of this measure, we examined a number of successful
visualization processes in the literature, and showed that the
information-theoretic measure can mathematically explain the advantages of such
processes over possible alternatives.Comment: 10 page
- …