
    Automatic 3DS Conversion of Historical Aerial Photographs

    In this paper we present a method for generating 3D stereo (3DS) pairs from sequences of historical aerial photographs. The goal of our work is to provide a stereoscopic display when the existing exposures form a monocular sequence. Each input image is processed using its neighbours, and a synthetic image is rendered which, together with the original, forms a stereo pair. Promising results on real images taken from a historical photo archive are shown, corroborating the viability of generating 3DS data from monocular footage.
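
    The abstract does not detail the rendering step, but the idea of pairing each exposure with a synthesized partner view can be illustrated with a minimal depth-image-based rendering sketch in Python. Here synthesize_stereo_partner is a hypothetical helper, and the per-pixel disparity map is assumed to have been estimated from the neighbouring frames of the sequence:

        import numpy as np

        def synthesize_stereo_partner(image, disparity):
            # Forward-warp each pixel horizontally by its disparity to
            # render a synthetic second eye; holes left by disocclusions
            # would need inpainting in a real system.
            h, w = image.shape[:2]
            partner = np.zeros_like(image)
            xs = np.arange(w)
            for y in range(h):
                xt = np.clip(xs - np.rint(disparity[y]).astype(int), 0, w - 1)
                partner[y, xt] = image[y, xs]
            return partner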

    Image Based View Synthesis

    This dissertation deals with the image-based approach to synthesizing a virtual scene from sparse images or a video sequence, without the use of 3D models. In our scenario, a real dynamic or static scene is captured by a set of un-calibrated images from different viewpoints. After automatically recovering the geometric transformations between these images, a series of photo-realistic virtual views can be rendered, and a virtual environment covered by these several static cameras can be synthesized. This image-based approach has applications in object recognition, object transfer, video synthesis, and video compression. In this dissertation, I have contributed to several sub-problems related to image-based view synthesis.

    Before image-based view synthesis can be performed, images need to be segmented into individual objects. Assuming that a scene can be approximately described by multiple planar regions, I have developed a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions, correctly detect the occluded pixels over multiple consecutive frames, and accurately segment the scene into several motion layers. First, a number of seed regions are determined using correspondences in two frames; these seed regions are then expanded and outliers rejected using a graph-cuts method integrated with a level-set representation. Next, the initial regions are merged into several initial layers according to motion similarity. Third, occlusion-order constraints on multiple frames are exploited, which guarantee that the occlusion area increases with temporal order over a short period and effectively maintain segmentation consistency over multiple consecutive frames. The correct layer segmentation is then obtained with a graph-cuts algorithm, and the occlusions between overlapping layers are explicitly determined. Several experimental results demonstrate that our approach is effective and robust.

    Recovering the geometric transformations among images of a scene is a prerequisite for image-based view synthesis. I have developed a wide baseline matching algorithm to identify the correspondences between two un-calibrated images and to further determine the geometric relationship between them, such as the epipolar geometry or a projective transformation. In our approach, a set of salient features, edge-corners, is detected to provide robust and consistent matching primitives. Then, based on the Singular Value Decomposition (SVD) of an affine matrix, we effectively quantize the search space into two independent subspaces, one for the rotation angle and one for the scaling factor, and use a two-stage affine matching algorithm to obtain robust matches between the two frames; a sketch of this decomposition is shown below. The experimental results on a number of wide baseline images demonstrate that our matching method outperforms state-of-the-art algorithms even under significant camera motion, illumination variation, occlusion, and self-similarity.
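
    As a sketch of the SVD idea above (not the dissertation's exact two-stage matcher), the 2x2 linear part of an affine transform can be split into a rotation angle and two scaling factors, so the two can be searched independently; rotation_and_scale is an illustrative helper:

        import numpy as np

        def rotation_and_scale(A):
            # SVD of the 2x2 linear part: A = U @ diag(s) @ Vt.
            U, s, Vt = np.linalg.svd(A)
            # U @ Vt is the rotation of the polar decomposition
            # (a proper rotation when A preserves orientation).
            R = U @ Vt
            theta = np.arctan2(R[1, 0], R[0, 0])  # rotation angle
            return theta, s                       # angle and (s1, s2)

        # Example: anisotropic scaling composed with a 0.3 rad rotation.
        A = np.diag([1.2, 0.8]) @ np.array(
            [[np.cos(0.3), -np.sin(0.3)], [np.sin(0.3), np.cos(0.3)]])
        theta, scales = rotation_and_scale(A)   # ~0.3, ~(1.2, 0.8)
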
    Given the wide baseline matches among images, I have developed a novel method for dynamic view morphing. Dynamic view morphing deals with scenes containing moving objects in the presence of camera motion. The objects can be rigid or non-rigid, and each can move in any orientation or direction. The proposed method can generate a series of continuous and physically accurate intermediate views from only two reference images, without any knowledge of the 3D structure. The procedure consists of three steps: segmentation, morphing, and post-warping. Given a boundary-connection constraint, the source and target scenes are segmented into several layers for morphing. Based on the decomposition of the affine transformation between corresponding points, we uniquely determine a physically correct path for post-warping by the least-distortion method. I have generalized the dynamic scene synthesis problem from simple scenes with only rotation to dynamic scenes containing non-rigid objects; my method can handle dynamic rigid or non-rigid objects, including complicated objects such as humans.

    Finally, I have developed a novel algorithm for tri-view morphing, an efficient image-based method to navigate a scene from only three wide-baseline un-calibrated images without the explicit use of a 3D model. After automatically recovering corresponding points between each pair of images using our wide baseline matching method, an accurate trifocal plane is extracted from the trifocal tensor implied by the three images. Next, employing a trinocular-stereo algorithm and a barycentric blending technique, we generate an arbitrary novel view to navigate the scene over a 2D space; the blending step is sketched below. Furthermore, after self-calibration of the cameras, a 3D model can also be correctly augmented into the virtual environment synthesized by the tri-view morphing algorithm. We have applied our view morphing framework to several interesting applications: 4D video synthesis, automatic target recognition, and multi-view morphing.
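
    The barycentric blending referenced above can be sketched as follows. The three views are assumed to have already been warped to the novel viewpoint (the trinocular-stereo warping itself is omitted), and barycentric_weights and blend_views are illustrative helpers, not code from the dissertation:

        import numpy as np

        def barycentric_weights(p, a, b, c):
            # Barycentric coordinates of viewpoint p inside the triangle
            # spanned by the three camera positions on the trifocal plane.
            T = np.array([[a[0] - c[0], b[0] - c[0]],
                          [a[1] - c[1], b[1] - c[1]]], dtype=float)
            w1, w2 = np.linalg.solve(T, np.asarray(p, float) - np.asarray(c, float))
            return np.array([w1, w2, 1.0 - w1 - w2])

        def blend_views(warped_views, weights):
            # Weighted average of the three pre-warped views; moving p
            # inside the triangle navigates smoothly between the cameras.
            return sum(w * v.astype(float) for w, v in zip(weights, warped_views))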

    View generation for three-dimensional scenes from video sequences


    Euclidean reconstruction and reprojection up to subgroups

    The necessary and sufficient conditions for being able to estimate scene structure, motion, and camera calibration from a sequence of images are very rarely satisfied in practice. What exactly can be estimated in sequences of practical importance, when such conditions are not satisfied? In this paper we give a complete answer to this question. For every camera motion that fails to meet the conditions, we give explicit formulas for the ambiguities in the reconstructed scene, motion, and calibration. Such a characterization is crucial both for designing robust estimation algorithms (that do not try to recover parameters that cannot be recovered) and for generating novel views of the scene by controlling the vantage point. To this end, we characterize explicitly all the vantage points that give rise to a valid Euclidean reprojection regardless of the ambiguity in the reconstruction. We also characterize vantage points that generate views that are altogether invariant to the ambiguity. All the results are presented using simple notation that involves neither tensors nor complex projective geometry, and should be accessible with a basic background in linear algebra.

    Multiple View Geometry For Video Analysis And Post-production

    Multiple view geometry is the foundation of an important class of computer vision techniques for the simultaneous recovery of camera motion and scene structure from a set of images. There are numerous important applications in this area, including video post-production, scene reconstruction, registration, surveillance, tracking, and segmentation. In video post-production, the topic addressed in this dissertation, computer analysis of the camera's motion can replace the manual methods currently used to correctly align an artificially inserted object in a scene. However, existing single view methods typically require multiple vanishing points and therefore fail when only one vanishing point is available. In addition, current multiple view techniques, based on either epipolar geometry or the trifocal tensor, do not fully exploit the properties of constant or known camera motion. Finally, there is no general solution to the problem of synchronizing N video sequences of distinct general scenes captured by cameras undergoing similar ego-motions, which is a necessary step for video post-production across different input videos.

    This dissertation proposes several advancements that overcome these limitations and uses them to develop an efficient framework for video analysis and post-production with multiple cameras. In the first part of the dissertation, novel inter-image constraints are introduced that are particularly useful for scenes where minimal information is available; this result extends the current state of the art in single view geometry to situations where only one vanishing point is available. The property of constant or known camera motion is also exploited for applications such as the calibration of a network of cameras in video surveillance systems and Euclidean reconstruction from turn-table image sequences in the presence of zoom and focus changes. We then propose a new framework for the estimation and alignment of camera motions, including both simple (panning, tracking, and zooming) and complex (e.g., hand-held) motions. The accuracy of these results is demonstrated by applying our approach to video post-production applications such as video cut-and-paste and shadow synthesis. As realistic image-based rendering problems, these applications require extreme accuracy in the estimation of the camera geometry, the position and orientation of the light source, and the photometric properties of the resulting cast shadows. In each case, the theoretical results are fully supported and illustrated by both numerical simulations and thorough experimentation on real data.
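
    As background for the epipolar geometry the dissertation builds on (its constant-motion constraints are not reproduced here), the standard robust estimation step looks as follows in OpenCV; pts1 and pts2 are placeholders for matched pixel coordinates that a feature matcher would supply:

        import cv2
        import numpy as np

        # Placeholder correspondences; in practice these come from a
        # feature matcher (e.g. SIFT descriptors plus a ratio test).
        pts1 = np.float32([[100, 120], [310, 45], [200, 260],
                           [420, 300], [50, 330], [380, 150],
                           [150, 60], [270, 210]])
        pts2 = pts1 + np.float32([4, 1])

        # Robustly estimate the fundamental matrix encoding the epipolar
        # geometry between the two views; RANSAC discards mismatches.
        F, inlier_mask = cv2.findFundamentalMat(
            pts1, pts2, cv2.FM_RANSAC,
            ransacReprojThreshold=1.0, confidence=0.99)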

    Space-Time Interpolation through Image-Based Morphing

    Rendering convincing transitions between individual pictures is the main challenge in image-based rendering and keyframe animation, as well as the prerequisite for many stunning visual effects. We present a perception-based method for automatic image interpolation, achieving psycho-visually plausible transitions between real-world images in real time. Based on recent discoveries in perception research, we propose an optical flow-based warping refinement method and an adaptive non-linear image blending scheme to guarantee the perceptual plausibility of the interpolated in-between images. Conventional, uncalibrated photographs suffice to convincingly interpolate across space, time, and between different objects, without the need to recover 3D scene geometry, actual motion, or camera calibration. Using off-the-shelf digital cameras, we demonstrate how to continuously navigate the viewpoint between camera positions and shutter release times, how to animate still pictures, how to create smooth camera motion paths, and how to convincingly morph between depictions of different objects.
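
    The perception-based refinement and adaptive blending are the paper's contribution; the baseline they refine, plain optical-flow in-betweening with a linear cross-fade, can be sketched as follows. interpolate is an illustrative helper, with OpenCV's Farneback flow standing in for the paper's flow estimator:

        import cv2
        import numpy as np

        def interpolate(img0, img1, t):
            # Dense flow from img0 to img1 on the grayscale images.
            g0 = cv2.cvtColor(img0, cv2.COLOR_BGR2GRAY)
            g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
            flow = cv2.calcOpticalFlowFarneback(
                g0, g1, None, 0.5, 3, 15, 3, 5, 1.2, 0)
            h, w = g0.shape
            gx, gy = np.meshgrid(np.arange(w, dtype=np.float32),
                                 np.arange(h, dtype=np.float32))
            # Backward-warp both endpoints to the in-between time t
            # (assumes locally smooth flow).
            w0 = cv2.remap(img0, gx - t * flow[..., 0],
                           gy - t * flow[..., 1], cv2.INTER_LINEAR)
            w1 = cv2.remap(img1, gx + (1 - t) * flow[..., 0],
                           gy + (1 - t) * flow[..., 1], cv2.INTER_LINEAR)
            # Linear cross-fade; the paper replaces this with an
            # adaptive non-linear blending scheme.
            return cv2.addWeighted(w0, 1.0 - t, w1, t, 0.0)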