    Guided Filtering based Pyramidal Stereo Matching for Unrectified Images

    Stereo matching deals with recovering quantitative depth information from a set of input images, based on the visual disparity between corresponding points. Most algorithms assume that the processed images are rectified. As robotics becomes more widespread, stereo matching in the context of cloth manipulation, such as obtaining the disparity map of a garment from the two cameras of a cloth-folding robot, is useful and challenging. The challenge arises because high efficiency, accuracy, and a low memory footprint are required when high-resolution images are used to capture fine details (e.g. cloth wrinkles) for the given application (e.g. cloth folding), and the images may also be unrectified. We therefore propose to adapt the guided filtering algorithm into a pyramidal stereo matching framework that works directly on unrectified images. To evaluate the accuracy of the proposed unrectified stereo matching, we present three datasets suited specifically to the characteristics of cloth manipulation tasks. By comparing the proposed algorithm with two baseline algorithms on these three datasets, we demonstrate that our approach is accurate, efficient, and requires little memory. This also shows that, rather than relying on image rectification, applying stereo matching directly to unrectified images can be both effective and efficient.
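
    A minimal sketch of the core guided-filter cost-aggregation step (He et al.'s guided filter applied per disparity slice of a matching-cost volume). The paper's pyramid logic and unrectified-geometry handling are omitted, and all names and parameter values (radius, eps) are illustrative assumptions, not taken from the paper:

        import numpy as np
        from scipy.ndimage import uniform_filter

        def guided_filter(I, p, radius=8, eps=1e-3):
            # Edge-preserving smoothing of p using image I as guidance
            # (He et al.); I and p are float arrays scaled to [0, 1].
            box = lambda x: uniform_filter(x, size=2 * radius + 1)
            mean_I, mean_p = box(I), box(p)
            cov_Ip = box(I * p) - mean_I * mean_p   # covariance of (I, p)
            var_I = box(I * I) - mean_I ** 2        # variance of I
            a = cov_Ip / (var_I + eps)              # per-window linear model
            b = mean_p - a * mean_I
            return box(a) * I + box(b)              # q = mean_a * I + mean_b

        def aggregate_and_pick(guide, cost_volume, radius=8, eps=1e-3):
            # Filter every disparity slice of the cost volume with the
            # guided filter, then take the winner-takes-all disparity.
            filtered = np.stack([guided_filter(guide, cost_volume[d], radius, eps)
                                 for d in range(cost_volume.shape[0])])
            return np.argmin(filtered, axis=0)

    Because the filter weights follow edges in the guidance image, aggregation smooths the cost volume without bleeding disparities across depth discontinuities such as cloth folds.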

    Miniaturized embedded stereo vision system (MESVS)

    Stereo vision is one of the fundamental problems of computer vision, and one of the oldest and most heavily investigated areas of 3D vision. Recent advances in stereo matching methodologies, the availability of high-performance and efficient algorithms, and fast, affordable hardware have allowed researchers to develop several stereo vision systems capable of operating in real time. Although a multitude of such systems exist in the literature, the majority concentrate only on raw performance and quality rather than on factors such as dimensions and power requirements, which are of significant importance in embedded settings. In this thesis a new miniaturized embedded stereo vision system (MESVS) is presented, which fits within a 5 x 5 cm package, is power efficient, and is cost-effective. Furthermore, through the application of embedded programming techniques and careful optimization, MESVS achieves real-time performance of 20 frames per second. This work discusses the various challenges involved in the design and implementation of the system and the measures taken to tackle them.

    An Efficient Image-Based Telepresence System for Videoconferencing


    Real-Time High-Resolution Multiple-Camera Depth Map Estimation Hardware and Its Applications

    Depth information is used in a variety of 3D signal processing applications such as autonomous navigation of robots and driving systems, object detection and tracking, computer games, 3D television, and free-viewpoint synthesis. These applications require highly accurate and fast depth estimation. Depth maps can be generated using disparity estimation methods based on stereo matching between multiple images. The computational complexity of disparity estimation algorithms, and the need for large external and internal memory size and bandwidth, make real-time disparity estimation challenging, especially for high-resolution images.

    This thesis proposes high-resolution, high-quality multiple-camera depth map estimation hardware. The proposed hardware is verified in real time within a complete system, from initial image capture to display and applications, and the details of the complete system are presented. The proposed binocular and trinocular adaptive-window-size disparity estimation algorithms are carefully designed to suit real-time hardware implementation, allowing efficient parallel and local processing while providing high-quality results. The proposed binocular and trinocular disparity estimation hardware implementations can process 55 frames per second on a Virtex-7 FPGA at 1024 x 768 XGA video resolution for a 128-pixel disparity range. The proposed binocular disparity estimation hardware provides the best quality compared to existing real-time high-resolution disparity estimation hardware implementations.

    A novel compressed look-up table based rectification algorithm and its real-time hardware implementation are presented. The low-complexity decompression process of the rectification hardware uses a negligible amount of LUT and DFF resources on the FPGA and does not require external memory. The first real-time high-resolution free-viewpoint synthesis hardware utilizing three-camera disparity estimation is also presented; it generates high-quality free-viewpoint video in real time for any horizontally aligned arbitrary camera positioned between the leftmost and rightmost physical cameras.

    The full embedded system for depth estimation is explained. The presented embedded system transfers disparity results together with synchronized RGB pixels to a PC for application development. Several real-time applications are developed on the PC using the obtained RGB+D results: depth-based image thresholding, speed and distance measurement, head-hands-shoulders tracking, virtual mouse using hand tracking, and face tracking integrated with free-viewpoint synthesis.

    The proposed binocular disparity estimation hardware is also implemented in an ASIC. The ASIC implementation of disparity estimation imposes additional constraints with respect to the FPGA implementation; these restrictions, their efficient solutions, and the ASIC implementation results are presented. In addition, a very high-resolution (82.3 MP) 360° x 90° omnidirectional multiple-camera system is proposed. The hemispherical camera system is able to view target locations close to the horizontal plane with more than two cameras, and can therefore be used for high-resolution 360° depth map estimation and its applications in the future.
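
    The thesis targets a hardware implementation, but the idea of adaptive-window disparity estimation can be sketched in software. In this minimal illustration the texture test, the two window sizes, and the SAD cost are assumptions chosen for clarity, not the thesis's actual design:

        import numpy as np
        from scipy.ndimage import uniform_filter

        def sad_disparity(left, right, num_disp, win):
            # Box-aggregated SAD cost volume, left image as reference;
            # inputs are float grayscale images of identical shape.
            H, W = left.shape
            cost = np.full((num_disp, H, W), np.inf)
            for d in range(num_disp):
                diff = np.abs(left[:, d:] - right[:, :W - d])
                cost[d, :, d:] = uniform_filter(diff, size=win)
            return np.argmin(cost, axis=0)   # winner-takes-all disparity

        def adaptive_window_disparity(left, right, num_disp=128,
                                      small_win=5, large_win=15, tex_thresh=4.0):
            left = left.astype(np.float64)
            right = right.astype(np.float64)
            # Local texture measure: intensity standard deviation.
            mean = uniform_filter(left, small_win)
            var = uniform_filter(left ** 2, small_win) - mean ** 2
            std = np.sqrt(np.maximum(var, 0))
            # Small window where textured (keeps edges sharp),
            # large window where flat (suppresses matching ambiguity).
            disp_s = sad_disparity(left, right, num_disp, small_win)
            disp_l = sad_disparity(left, right, num_disp, large_win)
            return np.where(std > tex_thresh, disp_s, disp_l)

    The same window-size trade-off motivates the hardware design: local, window-based costs map naturally onto parallel FPGA pipelines, which is what enables 55 fps at XGA resolution over a 128-pixel disparity range.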

    Stereo visual simultaneous localisation and mapping for an outdoor wheeled robot: a front-end study

    For many mobile robotic systems, navigating an environment is a crucial step towards autonomy, and Visual Simultaneous Localisation and Mapping (vSLAM) has seen increasingly effective use in this capacity. However, vSLAM is strongly dependent on the context in which it is applied, often using heuristics and special cases to provide efficiency and robustness. It is thus crucial to identify the important parameters and factors of a particular context, as these heavily influence the algorithms, processes, and hardware required for the best results. In this body of work, a generic front-end stereo vSLAM pipeline is tested in the context of a small-scale outdoor wheeled robot that occupies less than 1 m³ of volume. The scale of the vehicle constrained the available processing power, Field Of View (FOV), actuation systems, and the image distortions present. A dataset was collected with a custom platform consisting of a Point Grey Bumblebee (discontinued) stereo camera and an Nvidia Jetson TK1 processor. A stereo front-end feature tracking framework was described and evaluated both in simulation and experimentally where appropriate. It was found that scale adversely affected lighting conditions, FOV, baseline, and available processing power, all crucial factors to improve upon. The stereo constraint was effective for robustness, but ineffective in terms of processing power and metric reconstruction. An overall absolute odometry error of 0.25-3 m was produced on the dataset, but the pipeline was unable to run in real time.
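
    The abstract does not spell out the front-end's individual steps; the sketch below illustrates one typical step such a pipeline evaluates, namely matching features across a rectified stereo pair while enforcing the stereo constraint (near-equal rows, positive bounded disparity). Function names and parameter values are illustrative assumptions, not the thesis's exact design:

        import cv2

        def stereo_feature_matches(left, right, n_features=1000,
                                   max_row_diff=2.0, max_disp=128):
            # Detect and describe ORB features in a rectified grayscale
            # stereo pair, then keep cross-checked matches that respect
            # the epipolar (same-row) and disparity-range constraints.
            orb = cv2.ORB_create(nfeatures=n_features)
            kpl, desl = orb.detectAndCompute(left, None)
            kpr, desr = orb.detectAndCompute(right, None)

            matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
            matches = matcher.match(desl, desr)

            good = []
            for m in matches:
                xl, yl = kpl[m.queryIdx].pt
                xr, yr = kpr[m.trainIdx].pt
                disp = xl - xr
                if abs(yl - yr) <= max_row_diff and 0 < disp <= max_disp:
                    good.append((xl, yl, disp))  # seed for a feature track
            return good

    On a constrained platform such as the Jetson TK1, the feature count and matcher choice dominate the runtime budget, which is consistent with the thesis's finding that the stereo constraint costs processing power.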

    A novel algorithm and hardware architecture for fast video-based shape reconstruction of space debris

    In order to enable the non-cooperative rendezvous, capture, and removal of large space debris, automatic recognition of the target is needed. Video-based techniques are the most suitable in the strict context of space missions, where low energy consumption is fundamental and sensors should be passive in order to avoid any possible damage to external objects as well as to the chaser satellite. This paper presents a novel fast shape-from-shading (SfS) algorithm and a field-programmable gate array (FPGA)-based system hardware architecture for video-based shape reconstruction of space debris. The FPGA-based architecture, equipped with a pair of cameras, includes a fast image pre-processing module, a core implementing a feature-based stereo-vision approach, and a processor that executes the novel SfS algorithm. Experimental results show the limited amount of logic resources needed to implement the proposed architecture, and the timing improvements with respect to other state-of-the-art SfS methods. The remaining resources available in the FPGA device can be exploited to integrate other vision-based techniques that improve comprehension of the debris model, allowing fast evaluation of the associated kinematics in order to select the most appropriate approach for capturing the target space debris.
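
    The abstract does not detail the novel SfS algorithm itself. As a heavily simplified illustration of classical shape-from-shading, assume a Lambertian surface with unit albedo, orthographic projection, and the light source along the viewing axis; the irradiance equation then reduces to I = 1/sqrt(1 + p² + q²), which fixes the surface-gradient magnitude at every pixel. This toy sketch is not the paper's method:

        import numpy as np

        def gradient_magnitude_from_shading(I, eps=1e-6):
            # Lambertian surface lit along the viewing axis:
            # I = cos(theta) = 1/sqrt(1 + p^2 + q^2), hence
            # p^2 + q^2 = 1/I^2 - 1 at every pixel.
            I = np.clip(I, eps, 1.0)        # normalized irradiance in (0, 1]
            return np.sqrt(1.0 / I ** 2 - 1.0)

        def naive_depth_by_row_integration(I):
            # Toy reconstruction: assume the surface slopes only along x
            # (q = 0), so |p| equals the gradient magnitude; integrating
            # along each row yields a relative depth profile (the sign
            # and offset ambiguities are left unresolved).
            p = gradient_magnitude_from_shading(I)
            return np.cumsum(p, axis=1)

    Real SfS methods resolve the sign/convexity ambiguity with regularization or boundary conditions; the paper's contribution is a fast variant suited to on-board FPGA execution, complemented by the feature-based stereo core.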

    Event-based neuromorphic stereo vision
