Image Understanding and Robotics Research at Columbia University
The research investigations of the Vision/Robotics Laboratory at Columbia University reflect the diversity of interests of its four faculty members, two staff programmers and 15 Ph.D. students. Several of the projects involve either a visiting computer science post-doc, other faculty members in the department or the university, or researchers at AT&T Bell Laboratories or Philips Laboratories. We list below a summary of our interests and results, together with the principal researchers associated with them. Since it is difficult to separate those aspects of robotic research that are purely visual from those that are vision-like (for example, tactile sensing) or vision-related (for example, integrated vision-robotic systems), we have listed all robotic research that is not purely manipulative.
Unstructured light fields
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (p. 35-38). By Myers Abraham Davis (Abe Davis). We present a system for interactively acquiring and rendering light fields using a hand-held commodity camera. The main challenge we address is assisting a user in achieving good coverage of the 4D domain despite the challenges of hand-held acquisition. We define coverage by bounding reprojection error between viewpoints, which accounts for all 4 dimensions of the light field. We use this criterion together with a recent Simultaneous Localization and Mapping technique to compute a coverage map on the space of viewpoints. We provide users with real-time feedback and direct them toward under-sampled parts of the light field. Our system is lightweight and has allowed us to capture hundreds of light fields. We further present a new rendering algorithm that is tailored to the unstructured yet dense data we capture. Our method can achieve piecewise-bicubic reconstruction using a triangulation of the captured viewpoints and subdivision rules applied to reconstruction weights.
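As a rough illustration of reconstruction weights defined over a triangulation of viewpoints, the sketch below computes plain barycentric weights for a target viewpoint inside one triangle and uses them to blend three images. This is only the piecewise-linear baseline, not the thesis's subdivision-based piecewise-bicubic scheme. The names `barycentric_weights` and `blend_views` are hypothetical, viewpoints are assumed to be already parameterized as 2D points, and the reprojection of each image into the target view is omitted.

```python
import numpy as np

def barycentric_weights(p, a, b, c):
    """Return weights (wa, wb, wc) with p = wa*a + wb*b + wc*c and wa + wb + wc = 1."""
    p, a, b, c = (np.asarray(v, dtype=float) for v in (p, a, b, c))
    # Solve the 2x2 system [b - a, c - a] @ [wb, wc] = p - a.
    m = np.column_stack((b - a, c - a))
    wb, wc = np.linalg.solve(m, p - a)
    return np.array([1.0 - wb - wc, wb, wc])

def blend_views(p, viewpoints, images):
    """Blend three images captured at 2D viewpoint positions (assumption:
    images are already reprojected into the target view)."""
    w = barycentric_weights(p, *viewpoints)
    return sum(wi * img for wi, img in zip(w, images))

# Usage: a target viewpoint at the centroid weights all three views equally.
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
imgs = [np.full((2, 2), v) for v in (0.0, 3.0, 6.0)]
blended = blend_views((1.0 / 3.0, 1.0 / 3.0), tri, imgs)
```

Linear weights of this kind are continuous across triangle edges but not smooth, which is the limitation a higher-order reconstruction is meant to address.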
Computer vision in the space of light rays: plenoptic videogeometry and polydioptric camera design
Most of the cameras used in computer vision, computer graphics, and image
processing applications are designed to capture images that are
similar to the images we see with our eyes. This enables an easy
interpretation of the visual information by a human observer.
Nowadays though, more and more processing of visual information is
done by computers. Thus, it is worth questioning if these human-inspired
"eyes" are the optimal choice for processing visual
information using a machine.
In this thesis I will describe how one can study problems in computer
vision without reference to a specific camera model by studying the
geometry and statistics of the space of light rays that surrounds us.
The study of the geometry will allow us to determine all the possible
constraints that exist in the visual input and could be utilized if we
had a perfect sensor. Since no perfect sensor exists we use signal
processing techniques to examine how well the constraints between
different sets of light rays can be exploited given a specific camera
model. A camera is modeled as a spatio-temporal filter in the space of
light rays which lets us express the image formation process in a
function approximation framework. This framework then allows us to relate the geometry of the
imaging camera to the performance of the vision system
with regard to the given task. In this thesis I apply this framework
to the problem of camera motion estimation. I show how, by choosing the
right camera design, we can solve for the camera motion using linear,
scene-independent constraints that allow for robust solutions. This is compared to motion estimation using conventional cameras. In
addition we show how we can extract spatio-temporal models from
multiple video sequences using multi-resolution subdivision surfaces.
Temporally Coherent General Dynamic Scene Reconstruction
Existing techniques for dynamic scene reconstruction from multiple
wide-baseline cameras primarily focus on reconstruction in controlled
environments, with fixed calibrated cameras and strong prior constraints. This
paper introduces a general approach to obtain a 4D representation of complex
dynamic scenes from multi-view wide-baseline static or moving cameras without
prior knowledge of the scene structure, appearance, or illumination.
Contributions of the work are: An automatic method for initial coarse
reconstruction to initialize joint estimation; Sparse-to-dense temporal
correspondence integrated with joint multi-view segmentation and reconstruction
to introduce temporal coherence; and a general robust approach for joint
segmentation refinement and dense reconstruction of dynamic scenes by
introducing a shape constraint. Comparison with state-of-the-art approaches on a
variety of complex indoor and outdoor scenes demonstrates improved accuracy in
both multi-view segmentation and dense reconstruction. This paper demonstrates
unsupervised reconstruction of complete temporally coherent 4D scene models
with improved non-rigid object segmentation and shape reconstruction and its
application to free-viewpoint rendering and virtual reality. Comment: Submitted to IJCV 2019. arXiv admin note: substantial text overlap
with arXiv:1603.0338
Multi-View Dynamic Shape Refinement Using Local Temporal Integration
We consider 4D shape reconstructions in multi-view environments and investigate how to exploit temporal redundancy for precision refinement. In addition to being beneficial to many dynamic multi-view scenarios, this also enables larger scenes where such increased precision can compensate for the reduced spatial resolution per image frame. With precision and scalability in mind, we propose a symmetric (non-causal) local time-window geometric integration scheme over temporal sequences, where shape reconstructions are refined framewise by warping local and reliable geometric regions of neighboring frames to them. This is in contrast to recent comparable approaches targeting a different context with more compact scenes and real-time applications. These usually use a single dense volumetric update space or geometric template, which they causally track and update globally frame by frame, with limitations in scalability for larger scenes and in topology and precision with a template-based strategy. Our templateless and local approach is a first step towards temporal shape super-resolution. We show that it improves reconstruction accuracy by considering multiple frames. To this end, and in addition to real data examples, we introduce a multi-camera synthetic dataset that provides ground-truth data for mid-scale dynamic scenes.
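The symmetric (non-causal) time-window idea can be sketched in a highly simplified form: each frame is refined from both past and future neighbors, with unreliable samples down-weighted. The function `refine_frame` and its array layout are hypothetical, and the sketch assumes the neighboring frames have already been warped into the coordinates of frame `t`, which is the hard part the paper addresses and is not shown here.

```python
import numpy as np

def refine_frame(estimates, reliability, t, half_window):
    """Refine frame t by a reliability-weighted average over frames
    t - half_window .. t + half_window (clipped to the sequence).

    estimates, reliability: arrays of shape (T, N); assumption: every row
    is already expressed in frame t's coordinates.
    """
    lo = max(0, t - half_window)
    hi = min(len(estimates), t + half_window + 1)
    vals = np.asarray(estimates[lo:hi], dtype=float)
    w = np.asarray(reliability[lo:hi], dtype=float)
    total = w.sum(axis=0)
    safe_total = np.where(total > 0, total, 1.0)
    fused = (w * vals).sum(axis=0) / safe_total
    # Keep the unrefined estimate where no frame in the window is reliable.
    return np.where(total > 0, fused, vals[t - lo])

# Usage: three frames, two samples each; the second sample of frame 2 is unreliable.
est = np.array([[1.0, 10.0], [2.0, 10.0], [3.0, 40.0]])
rel = np.array([[1.0, 1.0], [1.0, 1.0], [1.0, 0.0]])
refined = refine_frame(est, rel, t=1, half_window=1)
```

Because the window extends in both directions, the result for a frame does not depend on the order in which frames are processed, unlike a causal scheme that tracks and updates a single global model frame by frame.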
Representations for Cognitive Vision: A Review of Appearance-Based, Spatio-Temporal, and Graph-Based Approaches
The emerging discipline of cognitive vision requires a proper representation of visual information including spatial and temporal relationships, scenes, events, semantics and context. This review article summarizes existing representational schemes in computer vision which might be useful for cognitive vision, and discusses promising future research directions. The various approaches are categorized according to appearance-based, spatio-temporal, and graph-based representations for cognitive vision. While the representation of objects has been covered extensively in computer vision research, both from a reconstruction as well as from a recognition point of view, cognitive vision will also require new ideas on how to represent scenes. We introduce new concepts for scene representations and discuss how these might be efficiently implemented in future cognitive vision systems.