59 research outputs found

    Motion compensation and very low bit rate video coding

    Recently, activities of the International Telecommunication Union (ITU) and the International Organization for Standardization (ISO) have been leading to new standards for very low bit-rate video coding, such as H.263 and MPEG-4, following the successful application of the international standards H.261 and MPEG-1/2 for video coding above 64 kbps. However, at very low bit rates the classic block-matching-based DCT video coding scheme suffers seriously from blocking artifacts, which degrade the quality of reconstructed video frames considerably. To solve this problem, a new technique in which motion compensation is based on a dense motion field is presented in this dissertation. Four efficient new video coding algorithms for very low bit rates, based on this technique, are proposed. (1) After studying model-based video coding algorithms, we propose an optical-flow-based video coding algorithm with thresholding techniques. A statistical model is established for the distribution of the intensity difference between two successive frames, and four thresholds are used to control the bit rate and the quality of reconstructed frames. It outperforms typical model-based techniques in terms of complexity and quality of reconstructed frames. (2) An efficient algorithm using DCT-coded optical flow. It is found that dense motion fields can be modeled as a first-order auto-regressive model and efficiently compressed with the DCT, hence achieving a very low bit rate and higher visual quality than H.263/TMN5. (3) A region-based discrete wavelet transform (DWT) video coding algorithm. This algorithm implements a dense motion field, and regions are segmented according to their content significance. The DWT is applied to residual images region by region, and bits are adaptively allocated to regions. It improves the visual quality and PSNR of significant regions while maintaining a low bit rate. (4) A segmentation-based video coding algorithm for stereo sequences. 
A correlation-feedback algorithm with a Kalman filter is utilized to improve the accuracy of optical flow fields. Three criteria, associated with 3-D information, 2-D connectivity and motion vector fields, respectively, are defined for object segmentation. A chain code is utilized to code the shapes of the segmented objects. It can achieve very high compression ratios, up to several thousand.

    A family of stereoscopic image compression algorithms using wavelet transforms

    With the standardization of JPEG-2000, wavelet-based image and video compression technologies are gradually replacing the popular DCT-based methods. In parallel, recent developments in autostereoscopic display technology are now threatening to revolutionize the way in which consumers enjoy traditional 2-D display-based electronic media such as television, computers and movies. However, due to the two-fold bandwidth/storage space requirement of stereoscopic imaging, an essential requirement of a stereo imaging system is efficient data compression. In this thesis, seven wavelet-based stereo image compression algorithms are proposed, to take advantage of the higher data compaction capability and better flexibility of wavelets. In the proposed CODEC I, block-based disparity estimation/compensation (DE/DC) is performed in the pixel domain. However, this results in an inefficiency when the DWT is applied to the whole predictive error image that results from the DE process, because of the artificial block boundaries between error blocks in the predictive error image. To overcome this problem, in the remaining proposed CODECs, DE/DC is performed in the wavelet domain. Due to the multiresolution nature of the wavelet domain, two methods of disparity estimation and compensation have been proposed. The first method performs DE/DC in each subband of the lowest/coarsest resolution level and then propagates the disparity vectors obtained to the corresponding subbands of higher/finer resolution. Note that DE is not performed in every subband due to the high overhead bits that would be required for coding the disparity vectors of all subbands. This method is used in CODEC II. In the second method, DE/DC is performed in the wavelet-block domain. This enables disparity estimation to be performed in all subbands simultaneously without increasing the overhead bits required for coding the disparity vectors. 
This method is used by CODEC III. However, performing disparity estimation/compensation in all subbands would result in a significant improvement of CODEC III. To further improve the performance of CODEC III, a pioneering wavelet-block search technique is implemented in CODEC IV. The pioneering wavelet-block search technique enables the right/predicted image to be reconstructed at the decoder end without the need to transmit the disparity vectors. In the proposed CODEC V, pioneering block search is performed in all subbands of the DWT decomposition, which results in an improvement of its performance. Further, CODEC IV and V are able to perform at very low bit rates (< 0.15 bpp). In CODEC VI and CODEC VII, Overlapped Block Disparity Compensation (OBDC) is used with and without the need to code disparity vectors. Our experimental results showed that no significant coding gains could be obtained for these CODECs over CODEC IV and V. All proposed CODECs in this thesis are wavelet-based stereo image coding algorithms that maximise the flexibility and benefits offered by wavelet transform technology when applied to stereo imaging. In addition, the use of a baseline-JPEG coding architecture would enable the easy adaptation of the proposed algorithms within systems originally built for DCT-based coding. This is an important feature that would be useful during an era where DCT-based technology is only slowly being phased out to give way to DWT-based compression technology. In addition, this thesis proposes a stereo image coding algorithm that uses JPEG-2000 technology as the basic compression engine. The proposed CODEC, named RASTER, is a rate-scalable stereo image CODEC that has a unique ability to preserve the image quality at binocular depth boundaries, which is an important requirement in the design of a stereo image CODEC. 
The experimental results have shown that the proposed CODEC is able to achieve PSNR gains of up to 3.7 dB as compared to directly transmitting the right frame using JPEG-2000.
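The pixel-domain, block-based disparity estimation/compensation (DE/DC) attributed to CODEC I above can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration rather than the thesis's method: the sum-of-absolute-differences (SAD) cost, the horizontal-only search, the block size, and the hypothetical function names `estimate_disparity` and `compensate`.

```python
def block_sad(ref, target, y0, x0, d, block):
    """Sum of absolute differences between a target block and the
    reference block shifted horizontally by d pixels."""
    total = 0
    for y in range(y0, y0 + block):
        for x in range(x0, x0 + block):
            total += abs(target[y][x] - ref[y][x + d])
    return total


def estimate_disparity(ref, target, block=4, max_disp=8):
    """Return one horizontal disparity per block of `target`
    (assumed cost: SAD; assumed search range: 0..max_disp)."""
    height, width = len(target), len(ref[0])
    vectors = []
    for y0 in range(0, height - block + 1, block):
        row = []
        for x0 in range(0, len(target[0]) - block + 1, block):
            best_d, best_cost = 0, None
            for d in range(max_disp + 1):
                if x0 + d + block > width:
                    break  # shifted block would leave the reference image
                cost = block_sad(ref, target, y0, x0, d, block)
                if best_cost is None or cost < best_cost:
                    best_d, best_cost = d, cost
            row.append(best_d)
        vectors.append(row)
    return vectors


def compensate(ref, vectors, block=4):
    """Build the block-wise prediction of the target image from the
    reference image and the per-block disparities."""
    height, width = len(vectors) * block, len(vectors[0]) * block
    pred = [[0] * width for _ in range(height)]
    for by, row in enumerate(vectors):
        for bx, d in enumerate(row):
            for y in range(by * block, (by + 1) * block):
                for x in range(bx * block, (bx + 1) * block):
                    pred[y][x] = ref[y][x + d]
    return pred


# Toy stereo pair: the target is the reference shifted left by 2 pixels.
ref = [[x * x + y for x in range(12)] for y in range(4)]
target = [[ref[y][x + 2] if x + 2 < 12 else 0 for x in range(12)] for y in range(4)]
vectors = estimate_disparity(ref, target)
pred = compensate(ref, vectors)
```

The residual between `target` and `pred` is the predictive error image; because each block is predicted independently, discontinuities can appear at block edges, which is the inefficiency the abstract describes when a whole-image DWT is applied to that residual.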

    An application specific low bit-rate video compression system geared towards vehicle tracking.

    Thesis (M.Sc.Eng.)-University of Natal, Durban, 2003. The ability to communicate over a low bit-rate transmission channel has become the order of the day. In the past, transmission over a low bit-rate channel, such as a wireless link, has typically been reserved for speech and data. However, there is currently a great deal of interest in the ability to transmit streaming video over such a link. These transmission channels are generally bandwidth limited, hence bit rates need to be low. Video, on the other hand, requires large amounts of bandwidth for real-time streaming applications. Existing video compression standards such as MPEG-1/2 have succeeded in reducing the bandwidth required for transmission by exploiting redundant video information in both the spatial and temporal domains. However, such compression systems are geared towards general applications, hence they tend not to be suitable for low bit-rate applications. The objective of this work is to implement a system that is. Following an investigation in the field of video compression, existing techniques have been adapted and integrated into an application-specific low bit-rate video compression system. The implemented system is application specific as it has been designed to track vehicles of reasonable size within an otherwise static scene. Low bit-rate video is achieved by separating a video scene into two areas of interest, namely the background scene and objects that move with reference to this background. Once the background has been compressed and transmitted to the decoder, the only data subsequently transmitted is that which results from the segmentation and tracking of vehicles within the scene. This data is normally small in comparison with that of the background scene, and therefore, by only updating the background periodically, the resulting average output bit rate is low. 
The implemented system is divided into two parts, namely a still image encoder and decoder based on a Variable Block-Size Discrete Cosine Transform, and a context-specific encoder and decoder that tracks vehicles in motion within a video scene. The encoder system has been implemented on the Philips TriMedia TM-1300 digital signal processor (DSP). The encoder is able to capture streaming video, compress individual video frames and track objects in motion within a video scene. The decoder, on the other hand, has been implemented on the host PC into which the TriMedia DSP is plugged. A graphical user interface allows a system operator to control the compression system by configuring various compression variables. For demonstration purposes, the host PC displays the decoded video stream as well as calculated rate metrics such as peak signal-to-noise ratio and resultant bit rate. The implementation of the compression system is described whilst incorporating application examples and results. Conclusions are drawn and suggestions for further improvement are offered.
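The two metrics mentioned above, peak signal-to-noise ratio (PSNR) and resultant bit rate, can be sketched as follows. This is a minimal illustration using the standard PSNR definition for 8-bit images; the function names `psnr` and `bit_rate_kbps`, the list-of-lists image representation and the choice of kbit/s are assumptions, not details taken from the thesis.

```python
import math


def psnr(original, reconstructed, peak=255):
    """PSNR in dB between two equally sized 2-D lists of pixel values.
    `peak` is the maximum possible pixel value (255 for 8-bit images)."""
    count = 0
    squared_error = 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            squared_error += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


def bit_rate_kbps(total_bits, num_frames, frames_per_second):
    """Average output bit rate in kbit/s over a coded sequence."""
    duration_s = num_frames / frames_per_second
    return total_bits / duration_s / 1000


# Example: a one-second sequence coded with 64,000 bits in total.
rate = bit_rate_kbps(total_bits=64000, num_frames=25, frames_per_second=25)
```

In a scheme like the one described, where the background is sent once and only vehicle data follows, `total_bits` would be dominated by the periodic background updates, so the average rate falls as the update interval grows.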

    Semi-automatic video object segmentation for multimedia applications

    A semi-automatic video object segmentation tool is presented for segmenting both still pictures and image sequences. The approach comprises both automatic segmentation algorithms and manual user interaction. The still image segmentation component comprises a conventional spatial segmentation algorithm (Recursive Shortest Spanning Tree (RSST)), a hierarchical segmentation representation method (Binary Partition Tree (BPT)), and user interaction. An initial segmentation partition of homogeneous regions is created using RSST. The BPT technique is then used to merge these regions and hierarchically represent the segmentation in a binary tree. The semantic objects are then manually built by selectively clicking on image regions. A video object-tracking component enables image sequence segmentation, and this subsystem is based on motion estimation, spatial segmentation, object projection, region classification, and user interaction. The motion between the previous frame and the current frame is estimated, and the previous object is then projected onto the current partition. A region classification technique is used to determine which regions in the current partition belong to the projected object. User interaction is allowed for object re-initialisation when the segmentation results become inaccurate. The combination of all these components enables offline video sequence segmentation. The results presented on standard test sequences illustrate the potential use of this system for object-based coding and representation of multimedia.

    Fitting and tracking of a scene model in very low bit rate video coding


    Fast Motion Estimation Algorithms for Block-Based Video Coding Encoders

    The objective of my research is to reduce the complexity of video coding standards in real-time scalable and multi-view applications.

    Error-resilient multi-view video plus depth based 3-D video coding

    Three-dimensional (3-D) video, by definition, is a collection of signals that can provide depth perception of a 3-D scene. With the development of 3-D display technologies and interactive multimedia systems, 3-D video has attracted significant interest from both industry and academia, with a variety of applications. In order to provide desired services in various 3-D video applications, the multi-view video plus depth (MVD) representation, which can facilitate the generation of virtual views, has been determined to be the best format for 3-D video data. Similar to 2-D video, compressed 3-D video is highly sensitive to transmission errors due to errors propagated from the current frame to the future predicted frames. Moreover, since the virtual views required for auto-stereoscopic displays are rendered from the compressed texture videos and depth maps, transmission errors of the distorted texture videos and depth maps can be further propagated to the virtual views. Besides, the distortions in texture and depth show different effects on the rendered views. Therefore, compared to the reliable transmission of 2-D video, error-resilient texture video and depth map coding face major new challenges. This research concentrates on improving the error resilience performance of MVD-based 3-D video in packet loss scenarios. Based on the analysis of the propagating behaviour of transmission errors, a Wyner-Ziv (WZ)-based error-resilient algorithm is first designed for coding of the multi-view video data or depth data. In this scheme, an auxiliary redundant stream encoded according to the WZ principle is employed to protect a primary stream encoded with a standard multi-view video coding codec. 
Then, considering the fact that different combinations of texture and depth coding modes will exhibit varying robustness to transmission errors, a rate-distortion optimized mode switching scheme is proposed to strike the optimal trade-off between robustness and compression efficiency. In this approach, the texture and depth modes are jointly optimized by minimizing the overall distortion of both the coded and synthesized views subject to a given bit rate. Finally, this study extends the research to the reliable transmission of view synthesis prediction (VSP)-based 3-D video. In order to mitigate the prediction position error caused by packet losses in the depth map, a novel disparity vector correction algorithm is developed, where the corrected disparity vector is calculated from the depth error. To facilitate decoder error concealment, the depth error is recursively estimated at the decoder. The contributions of this dissertation are multifold. First, the proposed WZ-based error-resilient algorithm can accurately characterize the effect of transmission error on multi-view distortion in the transform domain, in consideration of both temporal and inter-view error propagation, and, based on the estimated distortion, this algorithm can perform optimal WZ bit allocation at the encoder through explicitly developing a sophisticated rate allocation strategy. This proposed algorithm is able to provide a finer granularity in performing rate adaptivity and unequal error protection for multi-view data, not only at the frame level, but also at the bit-plane level. Secondly, in the proposed mode switching scheme, a new analytic model is formulated to optimally estimate the view synthesis distortion due to packet losses, in which the compound impact of the transmission distortions of both the texture video and the depth map on the quality of the synthesized view is mathematically analysed. 
The accuracy of this view synthesis distortion model is demonstrated via simulation results and, further, the estimated distortion is integrated into a rate-distortion framework for optimal mode switching to achieve substantial performance gains over state-of-the-art algorithms. Last, but not least, this dissertation provides a preliminary investigation of VSP-based 3-D video over unreliable channels. In the proposed disparity vector correction algorithm, the pixel-level depth map error can be precisely estimated at the decoder without deterministic knowledge of the error-free reconstructed depth. The approximation of the innovation term involved in depth error estimation is proved theoretically. This algorithm is very useful for concealing the position-erroneous pixels whose disparity vectors are correctly received.