23 research outputs found

    Virtual View Generation with a Hybrid Camera Array

    Virtual view synthesis from an array of cameras has been an essential element of three-dimensional video broadcasting/conferencing. In this paper, we propose a scheme based on a hybrid camera array consisting of four regular video cameras and one time-of-flight depth camera. During rendering, we use the depth image from the depth camera as initialization, and compute a view-dependent scene geometry using constrained plane sweeping from the regular cameras. View-dependent texture mapping is then deployed to render the scene at the desired virtual viewpoint. Experimental results show that the addition of the time-of-flight depth camera greatly improves the rendering quality compared with an array of regular cameras with similar sparsity. In the application of 3D video broadcasting/conferencing, our hybrid camera system demonstrates great potential in reducing the amount of data for compression/streaming while maintaining high rendering quality.

    Survey of image-based representations and compression techniques

    In this paper, we survey the techniques for image-based rendering (IBR) and for compressing image-based representations. Unlike traditional three-dimensional (3-D) computer graphics, in which 3-D geometry of the scene is known, IBR techniques render novel views directly from input images. IBR techniques can be classified into three categories according to how much geometric information is used: rendering without geometry, rendering with implicit geometry (i.e., correspondence), and rendering with explicit geometry (either approximate or accurate). We discuss the characteristics of these categories and their representative techniques. IBR techniques demonstrate a surprisingly diverse range in the extent to which they use images and geometry to represent 3-D scenes. We explore the issues in trading off the use of images and geometry by revisiting plenoptic-sampling analysis and the notions of view dependency and geometric proxies. Finally, we highlight compression techniques specifically designed for image-based representations. Such compression techniques are important in making IBR techniques practical.

    Surface light field from video acquired in uncontrolled settings


    Design and analysis of a two-dimensional camera array

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005, by Jason Chieh-Sheng Yang. Includes bibliographical references (p. 153-158). I present the design and analysis of a two-dimensional camera array for virtual studio applications. It is possible to substitute conventional cameras and motion control devices with a real-time, light field camera array. I discuss a variety of camera architectures and describe a prototype system based on the "finite-viewpoints" design that allows multiple viewers to navigate virtual cameras in a dynamically changing light field captured in real time. The light field camera consists of 64 commodity video cameras connected to off-the-shelf computers. I employ a distributed rendering algorithm that overcomes the data bandwidth problems inherent in capturing light fields by selectively transmitting only those portions of the video streams that contribute to the desired virtual view. I also quantify the capabilities of a virtual camera rendered from a camera array in terms of the range of motion, range of rotation, and effective resolution. I compare these results to other configurations. From this analysis I provide a method for camera array designers to select and configure cameras to meet desired specifications. I demonstrate the system and the conclusions of the analysis with a number of examples that exploit dynamic light fields.

    A depth-cueing scheme based on linear transformations in tristimulus space

    We propose a generic and flexible depth-cueing scheme which subsumes many well-known and new color-based depth-cueing approaches. In particular, it includes standard intensity depth-cueing and the rather neglected pure saturation depth-cueing. Several new combinations and variations of depth cues are presented. Their usefulness is demonstrated in many different fields of application, ranging from non-photorealistic rendering to information visualization. In addition to cues based on a geometric concept of depth, an abstract visualization approach in the form of semantic depth-cueing is proposed. Our depth-cueing scheme is based on linear transformations in the 3D tristimulus space of colors and on weighted sums of colors. Since all of the required operations are supported by contemporary consumer graphics hardware, the depth-cueing scheme can be implemented without performance cutbacks. Therefore, any real-time rendering application can be enriched by sophisticated depth-cueing.
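
    The two cues named in the abstract above can be illustrated with a minimal Python sketch. It is only one plausible reading of "linear transformations in tristimulus space" and "weighted sums of colors", not the paper's implementation. The function names, the depth weight `d` in [0, 1], and the Rec. 709 luma weights used as the gray axis are all assumptions made for this example.

```python
# Rec. 709 luma weights: an assumed choice of gray axis, not taken from the paper.
LUMA = (0.2126, 0.7152, 0.0722)


def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-component color."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


def saturation_matrix(d):
    """Return (1 - d) * I + d * G, where G maps any color to its gray value.

    d = 0 leaves colors unchanged; d = 1 removes all saturation.
    """
    return [
        [(1.0 - d) * (1.0 if r == c else 0.0) + d * LUMA[c] for c in range(3)]
        for r in range(3)
    ]


def saturation_cue(color, d):
    """Saturation depth cue: a single linear transformation of the color."""
    return mat_vec(saturation_matrix(d), color)


def intensity_cue(color, d, background=(0.0, 0.0, 0.0)):
    """Intensity depth cue: weighted sum of the color and a background color."""
    return tuple((1.0 - d) * c + d * b for c, b in zip(color, background))
```

    With `d = 0` both functions return the input color. With `d = 1`, `saturation_cue` returns a gray with three equal channels and `intensity_cue` returns the background color. Both cues stay within 3x3 matrix products and weighted sums, which is why operations of this kind map well onto graphics hardware.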

    Fifth Biennial Report : June 1999 - August 2001


    Multi-camera reconstruction and rendering for free-viewpoint video

    While virtual environments in interactive entertainment become more and more lifelike and sophisticated, traditional media like television and video have not yet embraced the new possibilities provided by rapidly advancing processing power. In particular, they remain as non-interactive as ever, and do not allow the viewer to change basic parameters such as the camera perspective to their liking. The goal of this work is to advance in this direction, and to provide essential ingredients for a free-viewpoint video system, where the viewpoint can be chosen interactively during playback. Knowledge of scene geometry is required to synthesize novel views. Therefore, we describe 3D reconstruction methods for two distinct kinds of camera setups. The first is depth reconstruction for arrays of closely spaced cameras with parallel optical axes, where only depth maps can be estimated. The second is surface reconstruction for cameras mounted in a hemisphere around the scene, where true surface geometry can be recovered. Another vital part of a 3D video system is the interactive rendering from different viewpoints, which has to run in real time. We cover this topic in the last part of this thesis.

    Surface Appearance Estimation from Video Sequences

    The realistic virtual reproduction of real world objects using Computer Graphics techniques requires the accurate acquisition and reconstruction of both 3D geometry and surface appearance. Unfortunately, in several application contexts, such as Cultural Heritage (CH), the reflectance acquisition can be very challenging due to the type of object to acquire and the digitization conditions. Although several methods have been proposed for the acquisition of object reflectance, some intrinsic limitations still make its acquisition a complex task for CH artworks: the use of specialized instruments (dome, special setup for camera and light source, etc.); the need for highly controlled acquisition environments, such as a dark room; the difficulty of extending to objects of arbitrary shape and size; the high level of expertise required to assess the quality of the acquisition. The Ph.D. thesis proposes novel solutions for the acquisition and the estimation of the surface appearance in fixed and uncontrolled lighting conditions with several degrees of approximation (from a perceived near-diffuse color to an SVBRDF), taking advantage of the main features that differentiate a video sequence from an unordered photo collection: the temporal coherence; the data redundancy; the ease of acquisition, which allows many views of the object to be acquired in a short time. Finally, Reflectance Transformation Imaging (RTI) is an example of a widely used technology for the acquisition of the surface appearance in the CH field, even if limited to single-view Reflectance Fields of nearly flat objects. In this context, the thesis also addresses two important issues in RTI usage: how to provide better and more flexible virtual inspection capabilities with a set of operators that improve the perception of details, features and overall shape of the artwork; how to make it easier to disseminate this data and to support remote visual inspection by both scholars and the general public.

    Image Based View Synthesis

    This dissertation deals with the image-based approach to synthesize a virtual scene using sparse images or a video sequence without the use of 3D models. In our scenario, a real dynamic or static scene is captured by a set of un-calibrated images from different viewpoints. After automatically recovering the geometric transformations between these images, a series of photo-realistic virtual views can be rendered and a virtual environment covered by these several static cameras can be synthesized. This image-based approach has applications in object recognition, object transfer, video synthesis and video compression. In this dissertation, I have contributed to several sub-problems related to image based view synthesis. Before image-based view synthesis can be performed, images need to be segmented into individual objects. Assuming that a scene can approximately be described by multiple planar regions, I have developed a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions, correctly detect the occlusion pixels over multiple consecutive frames, and accurately segment the scene into several motion layers. First, a number of seed regions using correspondences in two frames are determined, and the seed regions are expanded and outliers are rejected employing the graph cuts method integrated with level set representation. Next, these initial regions are merged into several initial layers according to the motion similarity. Third, the occlusion order constraints on multiple frames are explored, which guarantee that the occlusion area increases with the temporal order in a short period and effectively maintains segmentation consistency over multiple consecutive frames. Then the correct layer segmentation is obtained by using a graph cuts algorithm, and the occlusions between the overlapping layers are explicitly determined. 
Several experimental results are demonstrated to show that our approach is effective and robust. Recovering the geometrical transformations among images of a scene is a prerequisite step for image-based view synthesis. I have developed a wide baseline matching algorithm to identify the correspondences between two un-calibrated images, and to further determine the geometric relationship between images, such as epipolar geometry or projective transformation. In our approach, a set of salient features, edge-corners, are detected to provide robust and consistent matching primitives. Then, based on the Singular Value Decomposition (SVD) of an affine matrix, we effectively quantize the search space into two independent subspaces for rotation angle and scaling factor, and then we use a two-stage affine matching algorithm to obtain robust matches between these two frames. The experimental results on a number of wide baseline images strongly demonstrate that our matching method outperforms the state-of-the-art algorithms even under significant camera motion, illumination variation, occlusion, and self-similarity. Given the wide baseline matches among images, I have developed a novel method for dynamic view morphing. Dynamic view morphing deals with scenes containing moving objects in the presence of camera motion. The objects can be rigid or non-rigid, and each of them can move in any orientation or direction. The proposed method can generate a series of continuous and physically accurate intermediate views from only two reference images without any knowledge about 3D. The procedure consists of three steps: segmentation, morphing and post-warping. Given a boundary connection constraint, the source and target scenes are segmented into several layers for morphing. Based on the decomposition of the affine transformation between corresponding points, we uniquely determine a physically correct path for post-warping by the least distortion method. 
I have successfully generalized the dynamic scene synthesis problem from the simple scene with only rotation to the dynamic scene containing non-rigid objects. My method can handle dynamic rigid or non-rigid objects, including complicated objects such as humans. Finally, I have also developed a novel algorithm for tri-view morphing. This is an efficient image-based method to navigate a scene based on only three wide-baseline un-calibrated images without the explicit use of a 3D model. After automatically recovering corresponding points between each pair of images using our wide baseline matching method, an accurate trifocal plane is extracted from the trifocal tensor implied by these three images. Next, employing a trinocular-stereo algorithm and a barycentric blending technique, we generate an arbitrary novel view to navigate the scene in a 2D space. Furthermore, after self-calibration of the cameras, a 3D model can also be correctly augmented into this virtual environment synthesized by the tri-view morphing algorithm. We have applied our view morphing framework to several interesting applications: 4D video synthesis, automatic target recognition, and multi-view morphing.
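
    As an illustration of the rotation/scale split that the abstract above attributes to the SVD of an affine matrix, the following minimal Python sketch decomposes the 2x2 linear part of an affine transformation using the closed-form 2x2 SVD. It is not the dissertation's matching algorithm. The function name `rotation_and_scales` and the restriction to matrices with positive determinant are assumptions made for this example.

```python
import math


def rotation_and_scales(a, b, c, d):
    """Split the 2x2 matrix [[a, b], [c, d]] into a rotation angle and two scales.

    Uses the closed-form 2x2 SVD A = U * diag(s1, s2) * V^T. The returned angle
    (in radians) is that of the rotation U * V^T, i.e. the rotation factor of
    the polar decomposition, and (s1, s2) are the singular values, s1 >= s2.
    Assumes det(A) > 0 (no reflection); this is an assumption of the sketch.
    """
    e, f = (a + d) / 2.0, (a - d) / 2.0
    g, h = (c + b) / 2.0, (c - b) / 2.0
    q = math.hypot(e, h)  # magnitude of the rotation-like part
    r = math.hypot(f, g)  # magnitude of the anisotropic part
    angle = math.atan2(h, e)
    return angle, (q + r, q - r)
```

    For a uniform scaling by 2 combined with a 30-degree rotation, the function returns an angle of 30 degrees (in radians) and two singular values equal to 2. Because the angle and the scales come out as separate quantities, each can be searched or quantized on its own, which is the property the abstract relies on.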

    Méthodes de rendu à base de vidéos et applications à la réalité Virtuelle

    Given a set of cameras filming the same scene, the goal of video-based rendering methods is to compute new views of this scene from new viewpoints. The user controls the movement of a virtual camera through the scene, although all the cameras are in fact static. A first family of methods is based on a 3D reconstruction of the scene; it can provide accurate models but often requires lengthy computation before visualization. Other methods aim at real-time rendering. Our main contribution to video-based rendering concerns the plane sweep method, which belongs to the latter family. The plane sweep method divides space into parallel planes. Each point of each plane is processed independently to determine whether it lies on the surface of an object of the scene. This information is used to compute a new view of the scene from a new viewpoint. The method is well suited to an implementation on graphics hardware and can thus reach real-time rendering. Our main contribution to this method concerns the way of deciding whether a point of a plane lies on the surface of an object of the scene. We first propose a new scoring method that increases the visual quality of the new images. Compared with previous approaches, this method places fewer constraints on the position of the virtual camera, i.e. the camera need not lie within the area spanned by the input cameras. We also present an adaptation of the plane sweep algorithm that handles partial occlusions. With the practical applications of video-based rendering in virtual reality in mind, we propose an improvement of the plane sweep method that computes stereoscopic image pairs, so that the virtual scene can be viewed in 3D. Our enhancement provides the second view with only low additional computation time, whereas most other techniques need to render the scene twice. This improvement is based on sharing the information common to the two stereoscopic views. Finally, we propose a method that removes pseudoscopic movements in a virtual reality application. These pseudoscopic movements appear when the observer moves in front of the stereoscopic screen: the scene proportions then seem distorted and the observer sees the objects of the scene moving in an abnormal way. The method we propose applies both to classical stereoscopic rendering methods and to the plane sweep algorithm. Every method we propose makes extensive use of graphics hardware through shader programs and provides real-time rendering. These methods only require a standard computer, a video acquisition device and a sufficiently powerful graphics card. The plane sweep method has many practical applications, especially in fields like virtual reality, video games, 3D television and security.
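
    The plane sweep principle described in the abstract above can be sketched in a few lines of Python. This is a simplified "flatland" version under stated assumptions: cameras sit on a line and see 1D images, every candidate plane is parallel to that line, and the score is the plain variance of the colors sampled from the input cameras. The variance score is a common baseline and is not the new scoring method proposed in the thesis. All names (`plane_sweep_pixel`, `make_demo`) and the synthetic scene are hypothetical.

```python
import math


def variance(values):
    """Population variance of a list of gray values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def plane_sweep_pixel(u_virtual, cam_virtual, cameras, images, depths, focal=1.0):
    """Return (best_depth, color, score) for one pixel of the virtual view.

    cameras: x-positions of the input cameras on the line z = 0.
    images:  one callable per camera mapping a pixel coordinate u to a gray value.
    depths:  candidate plane depths z > 0.
    """
    best = None
    for z in depths:
        # Back-project the virtual pixel onto the plane at depth z.
        x = cam_virtual + u_virtual * z / focal
        # Re-project that point into every input camera and sample its image.
        samples = [img(focal * (x - c) / z) for c, img in zip(cameras, images)]
        # Low variance means the cameras agree, so the point is likely on a surface.
        score = variance(samples)
        if best is None or score < best[2]:
            best = (z, sum(samples) / len(samples), score)
    return best


def make_demo(true_depth=2.0, focal=1.0):
    """Synthetic scene: one textured plane at true_depth seen by three cameras."""
    def texture(x):
        return math.sin(3.0 * x) + 0.5 * x

    cameras = [-1.0, 0.0, 1.0]
    images = [lambda u, c=c: texture(c + u * true_depth / focal) for c in cameras]
    return cameras, images
```

    Calling `plane_sweep_pixel(0.1, 0.25, *make_demo(), depths=[1.0, 2.0, 3.0, 4.0])` selects the plane at depth 2.0, because only there do all three cameras sample the same texture point. Each pixel and each plane is scored independently, which is what makes the method a good fit for shader programs on graphics hardware.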