182 research outputs found

    Complexity management of H.264/AVC video compression.

    Get PDF
    The H. 264/AVC video coding standard offers significantly improved compression efficiency and flexibility compared to previous standards. However, the high computational complexity of H. 264/AVC is a problem for codecs running on low-power hand held devices and general purpose computers. This thesis presents new techniques to reduce, control and manage the computational complexity of an H. 264/AVC codec. A new complexity reduction algorithm for H. 264/AVC is developed. This algorithm predicts "skipped" macroblocks prior to motion estimation by estimating a Lagrange ratedistortion cost function. Complexity savings are achieved by not processing the macroblocks that are predicted as "skipped". The Lagrange multiplier is adaptively modelled as a function of the quantisation parameter and video sequence statistics. Simulation results show that this algorithm achieves significant complexity savings with a negligible loss in rate-distortion performance. The complexity reduction algorithm is further developed to achieve complexity-scalable control of the encoding process. The Lagrangian cost estimation is extended to incorporate computational complexity. A target level of complexity is maintained by using a feedback algorithm to update the Lagrange multiplier associated with complexity. Results indicate that scalable complexity control of the encoding process can be achieved whilst maintaining near optimal complexity-rate-distortion performance. A complexity management framework is proposed for maximising the perceptual quality of coded video in a real-time processing-power constrained environment. A real-time frame-level control algorithm and a per-frame complexity control algorithm are combined in order to manage the encoding process such that a high frame rate is maintained without significantly losing frame quality. Subjective evaluations show that the managed complexity approach results in higher perceptual quality compared to a reference encoder that drops frames in computationally constrained situations. These novel algorithms are likely to be useful in implementing real-time H. 264/AVC standard encoders in computationally constrained environments such as low-power mobile devices and general purpose computers

    Investigating low-bitrate, low-complexity H.264 region of interest techniques in error-prone environments

    Get PDF
    The H.264/AVC video coding standard leverages advanced compression methods to provide a significant increase in performance over previous CODECs in terms of picture quality, bitrate, and flexibility. The specification itself provides several profiles and levels that allow customization through the use of various advanced features. In addition to these features, several new video coding techniques have been developed since the standard\u27s inception. One such technique known as Region of Interest (RoI) coding has been in existence since before H.264\u27s formalization, and several means of implementing RoI coding in H.264 have been proposed. Region of Interest coding operates under the assumption that one or more regions of a sequence have higher priority than the rest of the video. One goal of RoI coding is to provide a decrease in bitrate without significant loss of perceptual quality, and this is particularly applicable to low complexity environments, if the proper implementation is used. Furthermore, RoI coding may allow for enhanced error resilience in the selected regions if desired, making RoI suitable for both low-bitrate and error-prone scenarios. The goal of this thesis project was to examine H.264 Region of Interest coding as it applies to such scenarios. A modified version of the H.264 JM Reference Software was created in which all non-Baseline profile features were removed. Six low-complexity RoI coding techniques, three targeting rate control and three targeting error resilience, were selected for implementation. Error and distortion modeling tools were created to enhance the quality of experimental data. Results were gathered by varying a range of coding parameters including frame size, target bitrate, and macroblock error rates. Methods were then examined based on their rate-distortion curves, ability to achieve target bitrates accurately, and per-region distortions where applicable

    Error resilient packet switched H.264 video telephony over third generation networks.

    Get PDF
    Real-time video communication over wireless networks is a challenging problem because wireless channels suffer from fading, additive noise and interference, which translate into packet loss and delay. Since modern video encoders deliver video packets with decoding dependencies, packet loss and delay can significantly degrade the video quality at the receiver. Many error resilience mechanisms have been proposed to combat packet loss in wireless networks, but only a few were specifically designed for packet switched video telephony over Third Generation (3G) networks. The first part of the thesis presents an error resilience technique for packet switched video telephony that combines application layer Forward Error Correction (FEC) with rateless codes, Reference Picture Selection (RPS) and cross layer optimization. Rateless codes have lower encoding and decoding computational complexity compared to traditional error correcting codes. One can use them on complexity constrained hand-held devices. Also, their redundancy does not need to be fixed in advance and any number of encoded symbols can be generated on the fly. Reference picture selection is used to limit the effect of spatio-temporal error propagation. Limiting the effect of spatio-temporal error propagation results in better video quality. Cross layer optimization is used to minimize the data loss at the application layer when data is lost at the data link layer. Experimental results on a High Speed Packet Access (HSPA) network simulator for H.264 compressed standard video sequences show that the proposed technique achieves significant Peak Signal to Noise Ratio (PSNR) and Percentage Degraded Video Duration (PDVD) improvements over a state of the art error resilience technique known as Interactive Error Control (IEC), which is a combination of Error Tracking and feedback based Reference Picture Selection. The improvement is obtained at a cost of higher end-to-end delay. The proposed technique is improved by making the FEC (Rateless code) redundancy channel adaptive. Automatic Repeat Request (ARQ) is used to adjust the redundancy of the Rateless codes according to the channel conditions. Experimental results show that the channel adaptive scheme achieves significant PSNR and PDVD improvements over the static scheme for a simulated Long Term Evolution (LTE) network. In the third part of the thesis, the performance of the previous two schemes is improved by making the transmitter predict when rateless decoding will fail. In this case, reference picture selection is invoked early and transmission of encoded symbols for that source block is aborted. Simulations for an LTE network show that this results in video quality improvement and bandwidth savings. In the last part of the thesis, the performance of the adaptive technique is improved by exploiting the history of the wireless channel. In a Rayleigh fading wireless channel, the RLC-PDU losses are correlated under certain conditions. This correlation is exploited to adjust the redundancy of the Rateless code and results in higher Rateless code decoding success rate and higher video quality. Simulations for an LTE network show that the improvement was significant when the packet loss rate in the two wireless links was 10%. To facilitate the implementation of the proposed error resilience techniques in practical scenarios, RTP/UDP/IP level packetization schemes are also proposed for each error resilience technique. Compared to existing work, the proposed error resilience techniques provide better video quality. Also, more emphasis is given to implementation issues in 3G networks

    MPEG-4 natural video coding - An overview

    Get PDF
    This paper describes the MPEG-4 standard, as defined in ISO/IEC 14496-2. The MPEG-4 visual standard is developed to provide users a new level of interaction with visual contents. It provides technologies to view, access and manipulate objects rather than pixels, with great error robustness at a large range of bit-rates. Application areas range from digital television, streaming video, to mobile multimedia and games. The MPEG-4 natural video standard consists of a collection of tools that support these application areas. The standard provides tools for shape coding, motion estimation and compensation, texture coding, error resilience, sprite coding and scalability. Conformance points in the form of object types, profiles and levels, provide the basis for interoperability. Shape coding can be performed in binary mode, where the shape of each object is described by a binary mask, or in gray scale mode, where the shape is described in a form similar to an alpha channel, allowing transparency, and reducing aliasing. Motion compensation is block based, with appropriate modifications for object boundaries. The block size can be 16×16, or 8×8, with half pixel resolution. MPEG-4 also provides a mode for overlapped motion compensation. Texture coding is based in 8×8 DCT, with appropriate modifications for object boundary blocks. Coefficient prediction is possible to improve coding efficiency. Static textures can be encoded using a wavelet transform. Error resilience is provided by resynchronization markers, data partitioning, header extension codes, and reversible variable length codes. Scalability is provided for both spatial and temporal resolution enhancement. MPEG-4 provides scalability on an object basis, with the restriction that the object shape has to be rectangular. MPEG-4 conformance points are defined at the Simple Profile, the Core Profile, and the Main Profile. Simple Profile and Core Profiles address typical scene sizes of QCIF and CIF size, with bit-rates of 64, 128, 384 and 2 Mbit/s. Main Profile addresses a typical scene sizes of CIF, ITU-R 601 and HD, with bit-rates at 2, 15 and 38.4 Mbit/s

    Flexible Macroblock Ordering for Context-Aware Ultrasound Video Transmission over Mobile WiMAX

    Get PDF
    The most recent network technologies are enabling a variety of new applications, thanks to the provision of increased bandwidth and better management of Quality of Service. Nevertheless, telemedical services involving multimedia data are still lagging behind, due to the concern of the end users, that is, clinicians and also patients, about the low quality provided. Indeed, emerging network technologies should be appropriately exploited by designing the transmission strategy focusing on quality provision for end users. Stemming from this principle, we propose here a context-aware transmission strategy for medical video transmission over WiMAX systems. Context, in terms of regions of interest (ROI) in a specific session, is taken into account for the identification of multiple regions of interest, and compression/transmission strategies are tailored to such context information. We present a methodology based on H.264 medical video compression and Flexible Macroblock Ordering (FMO) for ROI identification. Two different unequal error protection methodologies, providing higher protection to the most diagnostically relevant data, are presented

    Computational Complexity Optimization on H.264 Scalable/Multiview Video Coding

    Get PDF
    The H.264/MPEG-4 Advanced Video Coding (AVC) standard is a high efficiency and flexible video coding standard compared to previous standards. The high efficiency is achieved by utilizing a comprehensive full search motion estimation method. Although the H.264 standard improves the visual quality at low bitrates, it enormously increases the computational complexity. The research described in this thesis focuses on optimization of the computational complexity on H.264 scalable and multiview video coding. Nowadays, video application areas range from multimedia messaging and mobile to high definition television, and they use different type of transmission systems. The Scalable Video Coding (SVC) extension of the H.264/AVC standard is able to scale the video stream in order to adapt to a variety of devices with different capabilities. Furthermore, a rate control scheme is utilized to improve the visual quality under the constraints of capability and channel bandwidth. However, the computational complexity is increased. A simplified rate control scheme is proposed to reduce the computational complexity. In the proposed scheme, the quantisation parameter can be computed directly instead of using the exhaustive Rate-Quantization model. The linear Mean Absolute Distortion (MAD) prediction model is used to predict the scene change, and the quantisation parameter will be increased directly by a threshold when the scene changes abruptly; otherwise, the comprehensive Rate-Quantisation model will be used. Results show that the optimized rate control scheme is efficient on time saving. Multiview Video Coding (MVC) is efficient on reducing the huge amount of data in multiple-view video coding. The inter-view reference frames from the adjacent views are exploited for prediction in addition to the temporal prediction. However, due to the increase in the number of reference frames, the computational complexity is also increased. In order to manage the reference frame efficiently, a phase correlation algorithm is utilized to remove the inefficient inter-view reference frame from the reference list. The dependency between the inter-view reference frame and current frame is decided based on the phase correlation coefficients. If the inter-view reference frame is highly related to the current frame, it is still enabled in the reference list; otherwise, it will be disabled. The experimental results show that the proposed scheme is efficient on time saving and without loss in visual quality and increase in bitrate. The proposed optimization algorithms are efficient in reducing the computational complexity on H.264/AVC extension. The low computational complexity algorithm is useful in the design of future video coding standards, especially on low power handheld devices

    Error concealment techniques for H.264/MVC encoded sequences

    Get PDF
    This work is partially funded by the Strategic Educational Pathways Scholarship Scheme (STEPS-Malta). This scholarship is partly financed by the European Union–European Social Fund (ESF 1.25).The H.264/MVC standard offers good compression ratios for multi-view sequences by exploiting spatial, temporal and interview image dependencies. This works well in error-free channels, however in the event of transmission errors, it leads to the propagation of the distorted macro-blocks, degrading the quality of experience of the user. This paper reviews the state-of-the-art error concealment solutions and proposes a low complexity concealment method that can be used with multi-view video coding. The error resilience techniques used to aid error concealment are also identified. Results obtained demonstrate that good multi-view video reconstruction can be obtained with this approach.peer-reviewe

    Improved quality block-based low bit rate video coding.

    Get PDF
    The aim of this research is to develop algorithms for enhancing the subjective quality and coding efficiency of standard block-based video coders. In the past few years, numerous video coding standards based on motion-compensated block-transform structure have been established where block-based motion estimation is used for reducing the correlation between consecutive images and block transform is used for coding the resulting motion-compensated residual images. Due to the use of predictive differential coding and variable length coding techniques, the output data rate exhibits extreme fluctuations. A rate control algorithm is devised for achieving a stable output data rate. This rate control algorithm, which is essentially a bit-rate estimation algorithm, is then employed in a bit-allocation algorithm for improving the visual quality of the coded images, based on some prior knowledge of the images. Block-based hybrid coders achieve high compression ratio mainly due to the employment of a motion estimation and compensation stage in the coding process. The conventional bit-allocation strategy for these coders simply assigns the bits required by the motion vectors and the rest to the residual image. However, at very low bit-rates, this bit-allocation strategy is inadequate as the motion vector bits takes up a considerable portion of the total bit-rate. A rate-constrained selection algorithm is presented where an analysis-by-synthesis approach is used for choosing the best motion vectors in term of resulting bit rate and image quality. This selection algorithm is then implemented for mode selection. A simple algorithm based on the above-mentioned bit-rate estimation algorithm is developed for the latter to reduce the computational complexity. For very low bit-rate applications, it is well-known that block-based coders suffer from blocking artifacts. A coding mode is presented for reducing these annoying artifacts by coding a down-sampled version of the residual image with a smaller quantisation step size. Its applications for adaptive source/channel coding and for coding fast changing sequences are examined
    corecore