994 research outputs found

    Acceleration of stereo-matching on multi-core CPU and GPU

    Get PDF
    This paper presents an accelerated version of a dense stereo-correspondence algorithm for two different parallelism enabled architectures, multi-core CPU and GPU. The algorithm is part of the vision system developed for a binocular robot-head in the context of the CloPeMa 1 research project. This research project focuses on the conception of a new clothes folding robot with real-time and high resolution requirements for the vision system. The performance analysis shows that the parallelised stereo-matching algorithm has been significantly accelerated, maintaining 12x and 176x speed-up respectively for multi-core CPU and GPU, compared with non-SIMD singlethread CPU. To analyse the origin of the speed-up and gain deeper understanding about the choice of the optimal hardware, the algorithm was broken into key sub-tasks and the performance was tested for four different hardware architectures

    Anahita: A System for 3D Video Streaming with Depth Customization

    Get PDF
    Producing high-quality stereoscopic 3D content requires significantly more effort than preparing regular video footage. In order to assure good depth perception and visual comfort, 3D videos need to be carefully adjusted to specific viewing conditions before they are shown to viewers. While most stereoscopic 3D content is designed for viewing in movie theaters, where viewing conditions do not vary significantly, adapting the same content for viewing on home TV-sets, desktop displays, laptops, and mobile devices requires additional adjustments. To address this challenge, we propose a new system for 3D video streaming that provides automatic depth adjustments as one of its key features. Our system takes into account both the content and the display type in order to customize 3D videos and maximize their perceived quality. We propose a novel method for depth adjustment that is well-suited for videos of field sports such as soccer, football, and tennis. Our method is computationally efficient and it does not introduce any visual artifacts. We have implemented our 3D streaming system and conducted two user studies, which show: (i) adapting stereoscopic 3D videos for different displays is beneficial, and (ii) our proposed system can achieve up to 35% improvement in the perceived quality of the stereoscopic 3D content

    Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing

    Full text link
    Free-viewpoint video conferencing allows a participant to observe the remote 3D scene from any freely chosen viewpoint. An intermediate virtual viewpoint image is commonly synthesized using two pairs of transmitted texture and depth maps from two neighboring captured viewpoints via depth-image-based rendering (DIBR). To maintain high quality of synthesized images, it is imperative to contain the adverse effects of network packet losses that may arise during texture and depth video transmission. Towards this end, we develop an integrated approach that exploits the representation redundancy inherent in the multiple streamed videos a voxel in the 3D scene visible to two captured views is sampled and coded twice in the two views. In particular, at the receiver we first develop an error concealment strategy that adaptively blends corresponding pixels in the two captured views during DIBR, so that pixels from the more reliable transmitted view are weighted more heavily. We then couple it with a sender-side optimization of reference picture selection (RPS) during real-time video coding, so that blocks containing samples of voxels that are visible in both views are more error-resiliently coded in one view only, given adaptive blending will erase errors in the other view. Further, synthesized view distortion sensitivities to texture versus depth errors are analyzed, so that relative importance of texture and depth code blocks can be computed for system-wide RPS optimization. Experimental results show that the proposed scheme can outperform the use of a traditional feedback channel by up to 0.82 dB on average at 8% packet loss rate, and by as much as 3 dB for particular frames

    Real-time on-board obstacle avoidance for UAVs based on embedded stereo vision

    Get PDF
    In order to improve usability and safety, modern unmanned aerial vehicles (UAVs) are equipped with sensors to monitor the environment, such as laser-scanners and cameras. One important aspect in this monitoring process is to detect obstacles in the flight path in order to avoid collisions. Since a large number of consumer UAVs suffer from tight weight and power constraints, our work focuses on obstacle avoidance based on a lightweight stereo camera setup. We use disparity maps, which are computed from the camera images, to locate obstacles and to automatically steer the UAV around them. For disparity map computation we optimize the well-known semi-global matching (SGM) approach for the deployment on an embedded FPGA. The disparity maps are then converted into simpler representations, the so called U-/V-Maps, which are used for obstacle detection. Obstacle avoidance is based on a reactive approach which finds the shortest path around the obstacles as soon as they have a critical distance to the UAV. One of the fundamental goals of our work was the reduction of development costs by closing the gap between application development and hardware optimization. Hence, we aimed at using high-level synthesis (HLS) for porting our algorithms, which are written in C/C++, to the embedded FPGA. We evaluated our implementation of the disparity estimation on the KITTI Stereo 2015 benchmark. The integrity of the overall realtime reactive obstacle avoidance algorithm has been evaluated by using Hardware-in-the-Loop testing in conjunction with two flight simulators.Comment: Accepted in the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Scienc

    Cross-layer Optimized Wireless Video Surveillance

    Get PDF
    A wireless video surveillance system contains three major components, the video capture and preprocessing, the video compression and transmission over wireless sensor networks (WSNs), and the video analysis at the receiving end. The coordination of different components is important for improving the end-to-end video quality, especially under the communication resource constraint. Cross-layer control proves to be an efficient measure for optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in the wireless video surveillance system. The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission process. The cross-layer controller determines the optimal coding and transmission parameters, according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed utilizing the special property of the PTU camera motion. In the second project, the binocular PTU camera is adopted for video object tracking. The presented work studied the fast disparity estimation algorithm and the 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality. The third project is applied for multi-camera motion capture in remote healthcare monitoring. The challenge is the resource allocation for multiple video sequences. The presented cross-layer design incorporates the delay sensitive, content-aware video coding and transmission, and the adaptive video coding and transmission to ensure the optimal and balanced quality for the multi-view videos. In these projects, interdisciplinary study is conducted to synergize the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to enlighten the future work. Adviser: Song C

    Cross-layer Optimized Wireless Video Surveillance

    Get PDF
    A wireless video surveillance system contains three major components, the video capture and preprocessing, the video compression and transmission over wireless sensor networks (WSNs), and the video analysis at the receiving end. The coordination of different components is important for improving the end-to-end video quality, especially under the communication resource constraint. Cross-layer control proves to be an efficient measure for optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in the wireless video surveillance system. The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission process. The cross-layer controller determines the optimal coding and transmission parameters, according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed utilizing the special property of the PTU camera motion. In the second project, the binocular PTU camera is adopted for video object tracking. The presented work studied the fast disparity estimation algorithm and the 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality. The third project is applied for multi-camera motion capture in remote healthcare monitoring. The challenge is the resource allocation for multiple video sequences. The presented cross-layer design incorporates the delay sensitive, content-aware video coding and transmission, and the adaptive video coding and transmission to ensure the optimal and balanced quality for the multi-view videos. In these projects, interdisciplinary study is conducted to synergize the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to enlighten the future work. Adviser: Song C

    HEVC based Stereo Video codec

    Get PDF
    Development of stereo video codecs in latest multi-view extension of HEVC (MV-HEVC) with higher compression efficiency has been an active area of research. In this paper, a frame interleaved stereo video coding scheme based on MVHEVC standard codec is proposed. The proposed codec applies a reduced layer approach to encode the frame interleaved stereo sequences. A frame interleaving algorithm is developed to reorder the stereo video frames into a monocular video, such that the proposed codec can gain advantage from inter-views and temporal correlations to improve its coding performance. To evaluate the performance of the proposed codec; three standard multi-view test video sequences, named “Poznan_Street”, “Kendo” and “Newspaper1”, were selected and coded using the proposed codec and the standard MV-HEVC codec at different QPs and bitrates. Experimental results show that the proposed codec gives a significantly higher coding performance to that of the standard MV-HEVC codec at all bitrates

    Compression of Stereo Disparity Streams Using Wavelets and Optical Flow

    Get PDF
    Recent advances in computing have enabled fast reconstructions of dynamic scenes from multiple images. However, the efficient coding of changing 3D-data has hardly been addressed. Progressive geometric compression and streaming are based on static data sets which are mostly artificial or obtained from accurate range sensors. In this paper, we present a system for efficient coding of 3D-data which are given in forms of 2 + 1/2 disparity maps. Disparity maps are spatially coded using wavelets and temporally predicted by computing flow. The resulted representation of a 3D-stream consists then of spatial wavelet coefficients, optical flow vectors, and disparity differences between predicted and incoming image. The approach has also very useful by-products: disparity predictions can significantly reduce the disparity search range and if appropriately modeled increase the accuracy of depth estimation

    HEVC based Mixed-resolution Stereo Video Coding for Low Bitrate Transmission

    Get PDF
    This paper presents a mixed resolution stereo video coding model for High Efficiency Video Codec (HEVC). The challenging aspects of mixed resolution video coding are enabling the codec to encode frames with different frame resolution/size and using decoded pictures having different frame resolution/size for referencing. These challenges are further enlarged when implemented using HEVC, since the incoming video frames are subdivided into coding tree units. The ingenuity of the proposed codec’s design, is that the information in intermediate frames are down-sampled and yet the frames can retain the original resolution. To enable random access to full resolution decoded frame in the decoded picture buffer as reference frame a downsampled version of the decoded full resolution frame is used. The test video sequences were coded using the proposed codec and standard MV-HEVC. Results show that the proposed codec gives a significantly higher coding performance over the MV- HEVC codec
    • …
    corecore