    Disparity map generation based on trapezoidal camera architecture for multiview video

    Visual content acquisition is a strategic functional block of any visual system. Despite the many options available, arranging cameras to acquire good-quality visual content for multi-view video remains a major challenge. This paper presents a mathematical description of the trapezoidal camera architecture and the relationships that facilitate determining camera positions for visual content acquisition in multi-view video and for depth map generation. The strength of the trapezoidal camera architecture is that it allows an adaptive camera topology in which points within the scene, especially occluded ones, can be optically and geometrically viewed from several different viewpoints, either on the edge of the trapezoid or inside it. The concept of a maximum independent set, the characteristics of the trapezoid, and the fact that the camera positions (with the exception of a few) differ in their vertical coordinates can be used to address occlusion, which remains a major problem in computer vision with regard to depth map generation.
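    The depth values discussed above follow, for any calibrated and rectified camera pair, from the standard stereo relation Z = f * B / d. The sketch below illustrates only that generic conversion from a disparity map to a depth map; it is not the paper's trapezoidal derivation, and the focal length (in pixels) and baseline are assumed inputs.

        import numpy as np

        def depth_from_disparity(disparity, focal_px, baseline_m, eps=1e-6):
            """Convert a disparity map (pixels) into a depth map (metres)
            for a rectified camera pair using Z = f * B / d."""
            d = np.asarray(disparity, dtype=np.float64)
            depth = np.full(d.shape, np.inf)   # zero disparity -> point at infinity
            valid = d > eps
            depth[valid] = focal_px * baseline_m / d[valid]
            return depth

        # Example: a 2x2 disparity map captured with f = 800 px and B = 0.1 m
        print(depth_from_disparity([[8.0, 16.0], [0.0, 4.0]], 800.0, 0.1))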

    An object-based approach to plenoptic videos

    This paper proposes an object-based approach to plenoptic videos, where the plenoptic video sequences are segmented into image-based rendering (IBR) objects, each with its image sequence, depth map, and other relevant information such as shape information. This allows desirable functionalities such as scalability of contents, error resilience, and interactivity with individual IBR objects to be supported. A portable capturing system consisting of two linear camera arrays, each hosting 6 JVC video cameras, was developed to verify the proposed approach. Rendering and compression results of real-world scenes demonstrate the usefulness and good quality of the proposed approach. © 2005 IEEE.
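    The record kept for each IBR object above (its image sequence, depth maps, and shape information) can be pictured with a simple container. The sketch below is one hypothetical layout for illustration only; the field names and array shapes are assumptions, not the authors' data format.

        from dataclasses import dataclass
        import numpy as np

        @dataclass
        class IBRObject:
            """Hypothetical container for one image-based rendering (IBR) object."""
            images: np.ndarray   # (T, H, W, 3) colour frames of this object
            depths: np.ndarray   # (T, H, W)    per-pixel depth maps
            alphas: np.ndarray   # (T, H, W)    shape / matte information
            label: str = "object"

            def frame(self, t):
                """Return the colour, depth and shape data for frame t."""
                return self.images[t], self.depths[t], self.alphas[t]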

    The Plenoptic videos: Capturing, Rendering and Compression

    This paper presents a system for capturing and rendering a dynamic image-based representation called plenoptic videos. It is a simplified version of light fields for dynamic environments, where user viewpoints are constrained to the camera plane of a linear array of video cameras. The system consists of an array of eight Sony CCX-Z11 CCD cameras and eight Pentium 4 1.8 GHz computers connected through a 100BaseT LAN. Important issues such as multi-camera calibration, real-time compression, decompression, and rendering are addressed. Experimental results demonstrate the usefulness of the proposed parallel-processing-based system in capturing and rendering high-quality dynamic image-based representations using off-the-shelf equipment, and its potential applications in visualization and immersive television systems.
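    With viewpoints constrained to the plane of a linear camera array, a new view between two cameras can, in the simplest case, be produced by blending its nearest neighbours. The sketch below shows only that naive cross-fade as an illustration; the actual system also handles calibration and depth compensation, and the function and argument names are assumptions.

        import numpy as np

        def render_on_camera_line(frames, cam_positions, x):
            """Cross-fade the two cameras nearest to viewpoint x on the camera line.
            frames: list of HxWx3 float arrays; cam_positions: sorted 1-D positions."""
            pos = np.asarray(cam_positions, dtype=np.float64)
            x = float(np.clip(x, pos[0], pos[-1]))
            i = int(np.searchsorted(pos, x, side="right") - 1)
            i = min(max(i, 0), len(pos) - 2)
            w = (x - pos[i]) / (pos[i + 1] - pos[i])   # 0 at camera i, 1 at camera i+1
            return (1.0 - w) * frames[i] + w * frames[i + 1]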

    Image-based rendering and synthesis

    Multiview imaging (MVI) is an active research area with a wide range of applications, and it opens up research in other topics and applications, including virtual view synthesis for three-dimensional (3D) television (3DTV) and entertainment. However, multiview systems require a large amount of storage and are difficult to construct. Image-based rendering (IBR) is the concept that allows 3D scenes and objects to be visualized in a realistic way without full 3D model reconstruction. Using images as the primary substrate, IBR has many potential applications, including video games and virtual travel. The technique creates new views of scenes, reconstructed from a collection of densely sampled images or videos. IBR approaches can be classified by how much geometry they use: at one extreme, the 3D models and lighting conditions are known and new views are rendered with conventional graphics techniques; at the other, light field or lumigraph rendering relies on dense sampling, with little or no geometry, to render new views without recovering exact 3D models.
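    The light field and lumigraph rendering mentioned above synthesize a new view by resampling rays from a densely captured set of images. The sketch below is a minimal nearest-sample lookup in a two-plane-parameterized light field, given purely as an illustration of that idea; the array layout and names are assumptions.

        import numpy as np

        def sample_light_field(L, u, v, s, t):
            """Nearest-neighbour lookup in a two-plane light field L[u, v, s, t, 3],
            where (u, v) index the camera plane and (s, t) the image plane."""
            ui = int(round(float(np.clip(u, 0, L.shape[0] - 1))))
            vi = int(round(float(np.clip(v, 0, L.shape[1] - 1))))
            si = int(round(float(np.clip(s, 0, L.shape[2] - 1))))
            ti = int(round(float(np.clip(t, 0, L.shape[3] - 1))))
            return L[ui, vi, si, ti]

        # Example: a 4 x 4 grid of 16 x 16 RGB views, queried between cameras
        L = np.random.rand(4, 4, 16, 16, 3)
        print(sample_light_field(L, 1.6, 2.2, 7.9, 3.1))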

    Virtual View Generation with a Hybrid Camera Array

    Virtual view synthesis from an array of cameras has become an essential element of three-dimensional video broadcasting/conferencing. In this paper, we propose a scheme based on a hybrid camera array consisting of four regular video cameras and one time-of-flight depth camera. During rendering, we use the depth image from the depth camera as initialization and compute a view-dependent scene geometry using constrained plane sweeping from the regular cameras. View-dependent texture mapping is then deployed to render the scene at the desired virtual viewpoint. Experimental results show that the addition of the time-of-flight depth camera greatly improves rendering quality compared with an array of regular cameras of similar sparsity. For 3D video broadcasting/conferencing, our hybrid camera system demonstrates great potential for reducing the amount of data to be compressed and streamed while maintaining high rendering quality.
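    The key step above is a plane sweep constrained by the time-of-flight prior: candidate depths are searched only in a band around the measured depth, and the depth with the best photo-consistency across the regular cameras wins. The sketch below illustrates that idea for a single pixel, assuming pinhole projection matrices and a colour-variance cost; it is a simplified stand-in, not the paper's view-dependent plane-sweep implementation.

        import numpy as np

        def refine_depth(ref_K, ref_pixel, tof_depth, cams, images,
                         rel_range=0.2, steps=16):
            """Search depths in a band around the time-of-flight prior and keep the
            one whose reprojected colours agree best across the regular cameras.
            cams: list of 3x4 projection matrices; images: matching HxWx3 arrays."""
            u, v = ref_pixel
            ray = np.linalg.inv(ref_K) @ np.array([u, v, 1.0])   # ray through the pixel
            best_depth, best_cost = tof_depth, np.inf
            lo, hi = (1 - rel_range) * tof_depth, (1 + rel_range) * tof_depth
            for d in np.linspace(lo, hi, steps):
                X = np.append(d * ray, 1.0)                      # candidate 3-D point
                colours = []
                for P, img in zip(cams, images):
                    x = P @ X
                    px, py = int(round(x[0] / x[2])), int(round(x[1] / x[2]))
                    if 0 <= py < img.shape[0] and 0 <= px < img.shape[1]:
                        colours.append(img[py, px].astype(np.float64))
                if len(colours) >= 2:                            # photo-consistency cost
                    cost = np.var(np.stack(colours), axis=0).sum()
                    if cost < best_cost:
                        best_cost, best_depth = cost, d
            return best_depth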

    Scalable multi-view stereo camera array for real world real-time image capture and three-dimensional displays

    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2004. Includes bibliographical references (leaves 71-75). The number of three-dimensional displays available is escalating, yet capture devices for multi-view content are limited either to single-camera precision rigs restricted to stationary objects or to synthetically created animations. In this work we use inexpensive digital CMOS cameras to explore a multi-image capture paradigm and to gather real-world, real-time data of active and static scenes. The capturing system can be developed and employed for a wide range of applications, such as portrait-based images for multi-view facial recognition systems, hypostereo surgical training systems, and stereo surveillance by unmanned aerial vehicles. The system is adaptable to capturing the correct stereo views based on the scene and the desired three-dimensional display. Issues explored include image calibration, geometric correction, the possibility of object tracking, and transfer of the array technology to other image-capturing systems. These features give the user more freedom to interact with their specific 3-D content while allowing the computer to take on the difficult role of stereoscopic cinematographer. Samuel L. Hill. S.M.

    Design and analysis of a two-dimensional camera array

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. Includes bibliographical references (p. 153-158). I present the design and analysis of a two-dimensional camera array for virtual studio applications. It is possible to substitute conventional cameras and motion control devices with a real-time, light field camera array. I discuss a variety of camera architectures and describe a prototype system based on the "finite-viewpoints" design that allows multiple viewers to navigate virtual cameras in a dynamically changing light field captured in real time. The light field camera consists of 64 commodity video cameras connected to off-the-shelf computers. I employ a distributed rendering algorithm that overcomes the data bandwidth problems inherent in capturing light fields by selectively transmitting only those portions of the video streams that contribute to the desired virtual view. I also quantify the capabilities of a virtual camera rendered from a camera array in terms of the range of motion, range of rotation, and effective resolution. I compare these results to other configurations. From this analysis I provide a method for camera array designers to select and configure cameras to meet desired specifications. I demonstrate the system and the conclusions of the analysis with a number of examples that exploit dynamic light fields. By Jason Chieh-Sheng Yang. Ph.D.
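    The bandwidth-saving idea above is that only the parts of each camera's video that actually contribute to the requested virtual view are transmitted. The sketch below is a toy stand-in for that selection step: each virtual ray picks its k nearest cameras on the camera plane, and a camera only needs to send data if at least one ray selected it. The geometry and names are assumptions, not the thesis's actual protocol.

        import numpy as np

        def contributing_regions(ray_hits_xy, cam_positions_xy, k=2):
            """For each virtual-view ray (represented by where it meets the camera
            plane), pick the k nearest cameras; return, per camera, the indices of
            the rays it must deliver. Cameras with an empty set can stay silent."""
            rays = np.asarray(ray_hits_xy, dtype=np.float64)        # (N, 2)
            cams = np.asarray(cam_positions_xy, dtype=np.float64)   # (M, 2)
            dist = np.linalg.norm(rays[:, None, :] - cams[None, :, :], axis=2)
            nearest = np.argsort(dist, axis=1)[:, :k]                # k closest per ray
            needed = {m: np.where((nearest == m).any(axis=1))[0]
                      for m in range(len(cams))}
            return {m: idx for m, idx in needed.items() if idx.size > 0}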

    Rendering from unstructured collections of images

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 157-163). Computer graphics researchers have recently turned to image-based rendering to achieve the goal of photorealistic graphics. Instead of constructing a scene with millions of polygons, the scene is represented by a collection of photographs along with a greatly simplified geometric model. This simple representation allows traditional light transport simulations to be replaced with basic image-processing routines that combine multiple images to produce never-before-seen images from new vantage points. This thesis presents a new image-based rendering algorithm called unstructured lumigraph rendering (ULR). ULR is an image-based rendering algorithm specifically designed to work with unstructured (i.e., irregularly arranged) collections of images. The algorithm is unique in that it can use any amount of geometric or image information that is available about a scene. Specifically, the research in this thesis makes the following contributions:
    * An enumeration of image-based rendering properties that an ideal algorithm should attempt to satisfy. An algorithm that satisfies these properties should work as well as possible with any configuration of input images or geometric knowledge.
    * An optimal formulation of the basic image-based rendering problem, the solution to which is designed to satisfy the aforementioned properties.
    * The unstructured lumigraph rendering algorithm, which is an efficient approximation to the optimal image-based rendering solution.
    * A non-metric ULR algorithm, which generalizes the basic ULR algorithm to work with uncalibrated images.
    * A time-dependent ULR algorithm, which generalizes the basic ULR algorithm to work with time-dependent data.
    By Christopher James Buehler. Ph.D.
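    In the published unstructured lumigraph work, each pixel of the desired view is blended from a few input cameras, with weights that penalize the angular deviation between the desired ray and each camera's ray and fall to zero at the k-th best camera. The sketch below is a minimal version of that angle-penalty blending only; the full algorithm also accounts for resolution and field of view, and the function name and arguments are assumptions.

        import numpy as np

        def ulr_weights(desired_dir, cam_dirs, k=4):
            """Angle-penalty blending weights in the spirit of unstructured lumigraph
            rendering: keep the k cameras whose viewing rays deviate least from the
            desired ray, let the k-th camera's weight fall to zero, and normalize."""
            d = np.asarray(desired_dir, dtype=np.float64)
            d = d / np.linalg.norm(d)
            C = np.asarray(cam_dirs, dtype=np.float64)
            C = C / np.linalg.norm(C, axis=1, keepdims=True)
            ang = np.arccos(np.clip(C @ d, -1.0, 1.0))    # angular penalty per camera
            idx = np.argsort(ang)[:k]                     # k best cameras
            thresh = ang[idx[-1]]                         # k-th smallest penalty
            w = np.zeros(len(C))
            w[idx] = np.maximum(0.0, 1.0 - ang[idx] / (thresh + 1e-12))
            s = w.sum()
            if s == 0.0:                                  # degenerate: use the best camera
                w[idx[0]] = 1.0
                return w
            return w / s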

    Algorithms for Correspondence Estimation and Image Interpolation for Photorealistic Image Synthesis

    Free-viewpoint video is a new form of visual medium that has received considerable attention in the last ten years. Most systems reconstruct the geometry of the scene and are therefore restricted to synchronized multi-view footage and Lambertian scenes. In this thesis we follow a different approach and describe contributions to a purely image-based end-to-end system operating on sparse, unsynchronized multi-view footage. In particular, we focus on dense correspondence estimation and the synthesis of in-between views. In contrast to previous approaches, our correspondence estimation is specifically tailored to the needs of image interpolation, and our multi-image interpolation technique advances the state of the art by dispensing with the conventional blending step in favour of solving a labeling problem. Both algorithms are put to work in an image-based free-viewpoint video system, and we demonstrate their applicability to space-time visual effects production as well as to stereoscopic content creation.
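    The synthesis of in-between views described above is driven by dense correspondences between the input images. The sketch below shows only a baseline correspondence-driven interpolation that warps both inputs along a flow field and cross-fades them; the thesis explicitly avoids such blending in favour of a labeling formulation, and the field name flow_ab and the warping approximation are assumptions.

        import numpy as np

        def interpolate_view(img_a, img_b, flow_ab, t):
            """Synthesize an in-between image at parameter t in [0, 1]: sample A at
            x - t*F(x) and B at x + (1 - t)*F(x), then cross-fade, where F = flow_ab
            maps pixels of A to their correspondences in B (HxWx2, in pixels)."""
            h, w = img_a.shape[:2]
            ys, xs = np.mgrid[0:h, 0:w]
            xa = np.clip(np.round(xs - t * flow_ab[..., 0]).astype(int), 0, w - 1)
            ya = np.clip(np.round(ys - t * flow_ab[..., 1]).astype(int), 0, h - 1)
            xb = np.clip(np.round(xs + (1 - t) * flow_ab[..., 0]).astype(int), 0, w - 1)
            yb = np.clip(np.round(ys + (1 - t) * flow_ab[..., 1]).astype(int), 0, h - 1)
            return (1 - t) * img_a[ya, xa] + t * img_b[yb, xb]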