87 research outputs found
Computational Re-Photography
Rephotographers aim to recapture an existing photograph from the same viewpoint. A historical photograph paired with a well-aligned modern rephotograph can serve as a remarkable visualization of the passage of time. However, the task of rephotography is tedious and often imprecise, because reproducing the viewpoint of the original photograph is challenging. The rephotographer must disambiguate the six degrees of freedom of 3D translation and rotation, and contend with the confounding similarity between the effects of camera zoom and dolly. We present a real-time estimation and visualization technique for rephotography that helps users reach a desired viewpoint during capture. The input to our technique is a reference image taken from the desired viewpoint. The user moves through the scene with a camera and follows our visualization to reach the desired viewpoint. We employ computer vision techniques to compute the relative viewpoint difference. We guide 3D movement using two 2D arrows. We demonstrate the success of our technique by rephotographing historical images and conducting user studies.
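The relative-viewpoint computation the abstract alludes to can be illustrated with the textbook eight-point algorithm followed by cheirality-based pose recovery. The NumPy sketch below runs on synthetic correspondences; it is generic background, not the authors' real-time pipeline (which must also resolve the zoom/dolly ambiguity), and all function names are hypothetical.

```python
import numpy as np

def eight_point_essential(pts1, pts2):
    """Estimate the essential matrix E (with x2^T E x1 = 0) from
    normalized image coordinates using the eight-point algorithm."""
    x1, y1 = pts1[:, 0], pts1[:, 1]
    x2, y2 = pts2[:, 0], pts2[:, 1]
    A = np.column_stack([x2 * x1, x2 * y1, x2, y2 * x1, y2 * y1, y2,
                         x1, y1, np.ones(len(pts1))])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    U, _, Vt = np.linalg.svd(E)  # project onto the essential manifold
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

def triangulate(R, t, x1, x2):
    """Linear triangulation with cameras P1 = [I|0] and P2 = [R|t]."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([R, t.reshape(3, 1)])
    A = np.array([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def recover_pose(E, pts1, pts2):
    """Choose, among the four (R, t) factorizations of E, the one that
    places the triangulated points in front of both cameras."""
    U, _, Vt = np.linalg.svd(E)
    W = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
    best, best_count = None, -1
    for R in (U @ W @ Vt, U @ W.T @ Vt):
        if np.linalg.det(R) < 0:  # keep R a proper rotation
            R = -R
        for t in (U[:, 2], -U[:, 2]):
            count = 0
            for a, b in zip(pts1, pts2):
                X = triangulate(R, t, a, b)
                if X[2] > 0 and (R @ X + t)[2] > 0:
                    count += 1
            if count > best_count:
                best, best_count = (R, t), count
    return best

# Synthetic check: a small yaw rotation plus a mostly-sideways translation.
rng = np.random.default_rng(1)
angle = 0.1
R_true = np.array([[np.cos(angle), 0., np.sin(angle)],
                   [0., 1., 0.],
                   [-np.sin(angle), 0., np.cos(angle)]])
t_true = np.array([0.5, 0.1, 0.2])
X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))
pts1 = X[:, :2] / X[:, 2:]
X2 = X @ R_true.T + t_true
pts2 = X2[:, :2] / X2[:, 2:]
E = eight_point_essential(pts1, pts2)
R_est, t_est = recover_pose(E, pts1, pts2)  # t_est is unit length: scale is unobservable
```

In the rephotography setting, `pts1` would come from features in the historical reference image and `pts2` from the live camera frame, with the recovered rotation and translation direction driving the on-screen arrows.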
Recognizing Image Style
The style of an image plays a significant role in how it is viewed, but style
has received little attention in computer vision research. We describe an
approach to predicting the style of images, and perform a thorough evaluation
of different image features for this task. We find that features learned in a
multi-layer network generally perform best -- even when trained with object
class (not style) labels. Our large-scale learning method achieves the best
published performance on an existing dataset of aesthetic ratings and
photographic style annotations. We present two novel datasets: 80K Flickr
photographs annotated with 20 curated style labels, and 85K paintings annotated
with 25 style/genre labels. Our approach shows excellent classification
performance on both datasets. We use the learned classifiers to extend
traditional tag-based image search to consider stylistic constraints, and
demonstrate cross-dataset understanding of style.
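The finding that "features learned in a multi-layer network generally perform best, even when trained with object class labels" boils down to a transfer-learning recipe: extract fixed features from a network trained for object recognition, then fit a simple linear classifier for style. The sketch below stands in for the CNN features with synthetic Gaussian blobs (purely hypothetical data) and uses one-vs-rest ridge regression as the linear stage; it illustrates the recipe, not the paper's actual classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for CNN penultimate-layer features: three hypothetical style
# classes, each a tight Gaussian blob in a 64-dimensional feature space.
n_per, dim, n_styles = 100, 64, 3
centers = rng.normal(size=(n_styles, dim))
feats = np.vstack([c + 0.1 * rng.normal(size=(n_per, dim)) for c in centers])
labels = np.repeat(np.arange(n_styles), n_per)

# Linear stage on top of the frozen features: one-vs-rest ridge
# regression against one-hot style targets, predicting by argmax.
Y = np.eye(n_styles)[labels]
lam = 1e-3
W = np.linalg.solve(feats.T @ feats + lam * np.eye(dim), feats.T @ Y)
pred = (feats @ W).argmax(axis=1)
accuracy = (pred == labels).mean()
```

With real data the blobs would be replaced by features extracted from the 80K Flickr or 85K painting images, and the same closed-form solve would produce one weight vector per style label.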
Content-Preserving Warps for 3D Video Stabilization
We describe a technique that transforms a video from a hand-held video camera so that it appears as if it were taken with a directed camera motion. Our method adjusts the video to appear as if it were taken from nearby viewpoints, allowing 3D camera movements to be simulated. By aiming only for perceptual plausibility, rather than accurate reconstruction, we are able to develop algorithms that can effectively recreate dynamic scenes from a single source video. Our technique first recovers the original 3D camera motion and a sparse set of 3D, static scene points using an off-the-shelf structure-from-motion system. Then, a desired camera path is computed either automatically (e.g., by fitting a linear or quadratic path) or interactively. Finally, our technique performs a least-squares optimization that computes a spatially-varying warp from each input video frame into an output frame. The warp is computed to both follow the sparse displacements suggested by the recovered 3D structure, and avoid deforming the content in the video frame. Our experiments on stabilizing challenging videos of dynamic scenes demonstrate the effectiveness of our technique.
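The least-squares warp at the heart of the method can be shown in miniature: grid-vertex displacements are solved so that they (a) reproduce a few sparse tracked-point displacements through bilinear interpolation and (b) vary smoothly. The sketch below substitutes a plain first-difference smoothness term for the paper's content-preserving (similarity-transform) term, solves a single scalar displacement component, and uses made-up point data.

```python
import numpy as np

H, W = 5, 5                      # grid-vertex resolution
n = H * W

def idx(r, c):
    return r * W + c

def bilinear_row(y, x):
    """Row vector that evaluates the vertex displacement field at a
    continuous grid location (y, x) by bilinear interpolation."""
    r, c = int(y), int(x)
    fy, fx = y - r, x - c
    row = np.zeros(n)
    row[idx(r, c)]         = (1 - fy) * (1 - fx)
    row[idx(r, c + 1)]     = (1 - fy) * fx
    row[idx(r + 1, c)]     = fy * (1 - fx)
    row[idx(r + 1, c + 1)] = fy * fx
    return row

rows_A, rhs = [], []

# Data term: sparse tracked points with known displacements
# (hypothetical values standing in for reprojected 3D scene points).
points = [(1.3, 1.7, 0.5), (2.6, 3.2, -0.2), (3.1, 0.4, 0.1)]
for y, x, d in points:
    rows_A.append(bilinear_row(y, x))
    rhs.append(d)

# Smoothness term: neighboring vertices should move alike,
# weighted by lam relative to the data term.
lam = 0.1
for r in range(H):
    for c in range(W):
        for dr, dc in ((0, 1), (1, 0)):
            r2, c2 = r + dr, c + dc
            if r2 < H and c2 < W:
                row = np.zeros(n)
                row[idx(r, c)], row[idx(r2, c2)] = lam, -lam
                rows_A.append(row)
                rhs.append(0.0)

A, b = np.array(rows_A), np.array(rhs)
disp, *_ = np.linalg.lstsq(A, b, rcond=None)
disp = disp.reshape(H, W)        # per-vertex displacement field
```

In the full method a solve of this shape runs per frame, with x and y displacements handled jointly and the smoothness term replaced by the content-preserving energy that lets grid cells undergo similarity transforms.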
Volumetric surface sculpting
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999. Includes bibliographical references (leaves 77-81). By Aseem Agarwala.
Deep Homography Estimation for Dynamic Scenes
Homography estimation is an important step in many computer vision problems.
Recently, deep neural network methods have been shown to compare favorably
to traditional methods on this problem. However, these new methods do not
consider dynamic content in input images. They train neural networks with only
image pairs that can be perfectly aligned using homographies. This paper
investigates and discusses how to design and train a deep neural network that
handles dynamic scenes. We first collect a large video dataset with dynamic
content. We then develop a multi-scale neural network and show that when
properly trained using our new dataset, this neural network can already handle
dynamic scenes to some extent. To estimate a homography of a dynamic scene in a
more principled way, we need to identify the dynamic content. Since dynamic
content detection and homography estimation are two tightly coupled tasks, we
follow the multi-task learning principles and augment our multi-scale network
such that it jointly estimates the dynamics masks and homographies. Our
experiments show that our method can robustly estimate homography for
challenging scenarios with dynamic scenes, blur artifacts, or lack of texture.
Comment: CVPR 2020, https://github.com/lcmhoang/hmg-dynamic
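A common ingredient in deep homography estimation (used by many of the network methods this line of work builds on) is the four-point parameterization: the network regresses the displacements of a patch's four corners, and a direct linear transform (DLT) converts them into the 3x3 matrix. Below is a minimal NumPy sketch of that conversion on made-up corner data; this is generic background, not the paper's specific architecture.

```python
import numpy as np

def homography_from_points(src, dst):
    """Direct Linear Transform: fit H (up to scale) so that
    dst ~ H * src in homogeneous coordinates, from >= 4 pairs."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the scale ambiguity

# Corners of a unit patch and where a known homography sends them
# (H_true is an arbitrary illustrative matrix).
src = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
H_true = np.array([[1.0, 0.1, 0.2],
                   [0.05, 1.1, -0.1],
                   [0.1, 0.2, 1.0]])
proj = np.hstack([src, np.ones((4, 1))]) @ H_true.T
dst = proj[:, :2] / proj[:, 2:]
H_est = homography_from_points(src, dst)
```

The paper's contribution sits upstream of this step: the jointly estimated dynamics mask decides which image content should be allowed to influence the predicted alignment in the first place.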
