684 research outputs found

    A Fast and Robust Extrinsic Calibration for RGB-D Camera Networks

    Get PDF
    From object tracking to 3D reconstruction, RGB-Depth (RGB-D) camera networks play an increasingly important role in many vision and graphics applications. Practical applications often use sparsely-placed cameras to maximize visibility, while using as few cameras as possible to minimize cost. In general, it is challenging to calibrate sparse camera networks due to the lack of shared scene features across different camera views. In this paper, we propose a novel algorithm that can accurately and rapidly calibrate the geometric relationships across an arbitrary number of RGB-D cameras on a network. Our work has a number of novel features. First, to cope with the wide separation between different cameras, we establish view correspondences by using a spherical calibration object. We show that this approach outperforms other techniques based on planar calibration objects. Second, instead of modeling camera extrinsic calibration using rigid transformation, which is optimal only for pinhole cameras, we systematically test different view transformation functions including rigid transformation, polynomial transformation and manifold regression to determine the most robust mapping that generalizes well to unseen data. Third, we reformulate the celebrated bundle adjustment procedure to minimize the global 3D reprojection error so as to fine-tune the initial estimates. Finally, our scalable client-server architecture is computationally efficient: the calibration of a five-camera system, including data capture, can be done in minutes using only commodity PCs. Our proposed framework is compared with other state-of-the-arts systems using both quantitative measurements and visual alignment results of the merged point clouds

    Per-Pixel Calibration for RGB-Depth Natural 3D Reconstruction on GPU

    Get PDF
    Ever since the Kinect brought low-cost depth cameras into consumer market, great interest has been invigorated into Red-Green-Blue-Depth (RGBD) sensors. Without calibration, a RGBD camera’s horizontal and vertical field of view (FoV) could help generate 3D reconstruction in camera space naturally on graphics processing unit (GPU), which however is badly deformed by the lens distortions and imperfect depth resolution (depth distortion). The camera’s calibration based on a pinhole-camera model and a high-order distortion removal model requires a lot of calculations in the fragment shader. In order to get rid of both the lens distortion and the depth distortion while still be able to do simple calculations in the GPU fragment shader, a novel per-pixel calibration method with look-up table based 3D reconstruction in real-time is proposed, using a rail calibration system. This rail calibration system offers possibilities of collecting infinite calibrating points of dense distributions that can cover all pixels in a sensor, such that not only lens distortions, but depth distortion can also be handled by a per-pixel D to ZW mapping. Instead of utilizing the traditional pinhole camera model, two polynomial mapping models are employed. One is a two-dimensional high-order polynomial mapping from R/C to XW=YW respectively, which handles lens distortions; and the other one is a per-pixel linear mapping from D to ZW, which can handle depth distortion. With only six parameters and three linear equations in the fragment shader, the undistorted 3D world coordinates (XW, YW, ZW) for every single pixel could be generated in real-time. The per-pixel calibration method could be applied universally on any RGBD cameras. With the alignment of RGB values using a pinhole camera matrix, it could even work on a combination of a random Depth sensor and a random RGB sensor

    REAL-TIME CAPTURE AND RENDERING OF PHYSICAL SCENE WITH AN EFFICIENTLY CALIBRATED RGB-D CAMERA NETWORK

    Get PDF
    From object tracking to 3D reconstruction, RGB-Depth (RGB-D) camera networks play an increasingly important role in many vision and graphics applications. With the recent explosive growth of Augmented Reality (AR) and Virtual Reality (VR) platforms, utilizing camera RGB-D camera networks to capture and render dynamic physical space can enhance immersive experiences for users. To maximize coverage and minimize costs, practical applications often use a small number of RGB-D cameras and sparsely place them around the environment for data capturing. While sparse color camera networks have been studied for decades, the problems of extrinsic calibration of and rendering with sparse RGB-D camera networks are less well understood. Extrinsic calibration is difficult because of inappropriate RGB-D camera models and lack of shared scene features. Due to the significant camera noise and sparse coverage of the scene, the quality of rendering 3D point clouds is much lower compared with synthetic models. Adding virtual objects whose rendering depend on the physical environment such as those with reflective surfaces further complicate the rendering pipeline. In this dissertation, I propose novel solutions to tackle these challenges faced by RGB-D camera systems. First, I propose a novel extrinsic calibration algorithm that can accurately and rapidly calibrate the geometric relationships across an arbitrary number of RGB-D cameras on a network. Second, I propose a novel rendering pipeline that can capture and render, in real-time, dynamic scenes in the presence of arbitrary-shaped reflective virtual objects. Third, I have demonstrated a teleportation application that uses the proposed system to merge two geographically separated 3D captured scenes into the same reconstructed environment. To provide a fast and robust calibration for a sparse RGB-D camera network, first, the correspondences between different camera views are established by using a spherical calibration object. We show that this approach outperforms other techniques based on planar calibration objects. Second, instead of modeling camera extrinsic using rigid transformation that is optimal only for pinhole cameras, different view transformation functions including rigid transformation, polynomial transformation, and manifold regression are systematically tested to determine the most robust mapping that generalizes well to unseen data. Third, the celebrated bundle adjustment procedure is reformulated to minimize the global 3D projection error so as to fine-tune the initial estimates. To achieve a realistic mirror rendering, a robust eye detector is used to identify the viewer\u27s 3D location and render the reflective scene accordingly. The limited field of view obtained from a single camera is overcome by our calibrated RGB-D camera network system that is scalable to capture an arbitrarily large environment. The rendering is accomplished by raytracing light rays from the viewpoint to the scene reflected by the virtual curved surface. To the best of our knowledge, the proposed system is the first to render reflective dynamic scenes from real 3D data in large environments. Our scalable client-server architecture is computationally efficient - the calibration of a camera network system, including data capture, can be done in minutes using only commodity PCs

    Application of augmented reality and robotic technology in broadcasting: A survey

    Get PDF
    As an innovation technique, Augmented Reality (AR) has been gradually deployed in the broadcast, videography and cinematography industries. Virtual graphics generated by AR are dynamic and overlap on the surface of the environment so that the original appearance can be greatly enhanced in comparison with traditional broadcasting. In addition, AR enables broadcasters to interact with augmented virtual 3D models on a broadcasting scene in order to enhance the performance of broadcasting. Recently, advanced robotic technologies have been deployed in a camera shooting system to create a robotic cameraman so that the performance of AR broadcasting could be further improved, which is highlighted in the paper

    3D Object Reconstruction from Imperfect Depth Data Using Extended YOLOv3 Network

    Get PDF
    State-of-the-art intelligent versatile applications provoke the usage of full 3D, depth-based streams, especially in the scenarios of intelligent remote control and communications, where virtual and augmented reality will soon become outdated and are forecasted to be replaced by point cloud streams providing explorable 3D environments of communication and industrial data. One of the most novel approaches employed in modern object reconstruction methods is to use a priori knowledge of the objects that are being reconstructed. Our approach is different as we strive to reconstruct a 3D object within much more difficult scenarios of limited data availability. Data stream is often limited by insufficient depth camera coverage and, as a result, the objects are occluded and data is lost. Our proposed hybrid artificial neural network modifications have improved the reconstruction results by 8.53 which allows us for much more precise filling of occluded object sides and reduction of noise during the process. Furthermore, the addition of object segmentation masks and the individual object instance classification is a leap forward towards a general-purpose scene reconstruction as opposed to a single object reconstruction task due to the ability to mask out overlapping object instances and using only masked object area in the reconstruction process

    Fusion of information from multiple Kinect sensors for 3D object reconstruction

    Get PDF
    Основная статьяIn this paper, we estimate the accuracy of 3D object reconstruction using multiple Kinect sensors. First, we discuss the calibration of multiple Kinect sensors, and provide an analysis of the accuracy and resolution of its the depth data. Next, the precision of coordinate mapping between sensors data for registration of depth and color images is evaluated. We test a system with four Kinect V2 sensors and present reconstruction accuracy results. Experiments and computer simulation are carried out using Matlab and Kinect V2
    corecore