34 research outputs found
Image enhancements for low-bitrate videocoding
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996.Includes bibliographical references (p. 71).by Brian C. Davison.M.Eng
A survey on video compression fast block matching algorithms
Video compression is the process of reducing the amount of data required to represent digital video while preserving an acceptable video quality. Recent studies on video compression have focused on multimedia transmission, videophones, teleconferencing, high definition television, CD-ROM storage, etc. The idea of compression techniques is to remove the redundant information that exists in the video sequences.
Motion compensation predictive coding is the main coding tool for removing temporal redundancy of video sequences and it typically accounts for 50â80% of video encoding complexity. This technique has been adopted by all of the existing International Video Coding Standards. It assumes that the current frame can be locally modelled as a translation of the reference frames. The practical and widely method used to carry out motion compensated prediction is block matching algorithm. In this method, video frames are divided into a set of non-overlapped macroblocks and compared with the search area in the reference frame in order to find the best matching macroblock. This will carry out displacement vectors that stipulate the movement of the macroblocks from one location to another in the reference frame. Checking all these locations is called Full Search, which provides the best result. However, this algorithm suffers from long computational time, which necessitates improvement. Several methods of Fast Block Matching algorithm are developed to reduce the computation complexity.
This paper focuses on a survey for two video compression techniques: the first is called the lossless block matching algorithm process, in which the computational time required to determine the matching macroblock of the Full Search is decreased while the resolution of the predicted frames is the same as for the Full Search. The second is called lossy block matching algorithm process, which reduces the computational complexity effectively but the search result's quality is not the same as for the Full Search
Motion compensation for image compression: pel-recursive motion estimation algorithm
In motion pictures there is a certain amount of redundancy between consecutive frames. These redundancies can be exploited by using interframe prediction techniques. To further enhance the efficiency of interframe prediction, motion estimation and compensation, various motion compensation techniques can be used. There are two distinct techniques for motion estimation block matching and pel-recursive block matching has been widely used as it produces a better signal-to-noise ratio or a lower bit rate for transmission than the pel-recursive method. In this thesis, various pel-recursive motion estimation techniques such as steepest descent gradient algorithm have been considered and simulated. [Continues.
A multi-objective performance optimisation framework for video coding
Digital video technologies have become an essential part of the way visual information is created, consumed and communicated. However, due to the unprecedented growth of digital video technologies, competition for bandwidth resources has become fierce. This has highlighted a critical need for optimising the performance of video encoders. However, there is a dual optimisation problem, wherein, the objective is to reduce the buffer and memory requirements while maintaining the quality of the encoded video. Additionally, through the analysis of existing video compression techniques, it was found that the operation of video encoders requires the optimisation of numerous decision parameters to achieve the best trade-offs between factors that affect visual quality; given the resource limitations arising from operational constraints such as memory and complexity.
The research in this thesis has focused on optimising the performance of the H.264/AVC video encoder, a process that involved finding solutions for multiple conflicting objectives. As part of this research, an automated tool for optimising video compression to achieve an optimal trade-off between bit rate and visual quality, given maximum allowed memory and computational complexity constraints, within a diverse range of scene environments, has been developed. Moreover, the evaluation of this optimisation framework has highlighted the effectiveness of the developed solution
Energy efficient enabling technologies for semantic video processing on mobile devices
Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object based paradigm has many undeniable benefits, numerous technical challenges remain before the applications becomes pervasive, particularly on computational constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery powered mobile computing devices, the additional algorithmic complexity of semantic object based processing compared to conventional video processing is highly undesirable both from a real-time operation and battery life perspective. This
thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the
human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to be offloaded from the host microprocessor to dedicated hardware, thereby providing real-time performance and
reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing
any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object based shape encoding, a novel energy efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourable with the relevant prior art
Concealment algorithms for networked video transmission systems
This thesis addresses the problem of cell loss when transmitting video data over an
ATM network. Cell loss causes sections of an image to be lost or discarded in the
interconnecting nodes between the transmitting and receiving locations.
The method used to combat this problem is to use a technique called Error
Concealment, where the lost sections of an image are replaced with approximations
derived from the information in the surrounding areas to the error. This technique
does not require any additional encoding, as used by Error Correction. Conventional
techniques conceal from within the pixel domain, but require a large amount of
processing (2N2 up to 20N2) where N is the dimension of an NĂN square block.
Also, previous work at Loughborough used Linear Interpolation in the transform
domain, which required much less processing, to conceal the error. [Continues.
Efficient HEVC-based video adaptation using transcoding
In a video transmission system, it is important to take into account the great diversity of the network/end-user constraints. On the one hand, video content is typically streamed over a network that is characterized by different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played by multiple devices like PCs, laptops, and cell phones. Obviously, a single video would not satisfy their different constraints.
These diversities of the network and devices capacity lead to the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without the change of the coding format, has been well-known as an efficient adaptation solution. However, this approach comes along with a high computational complexity, resulting in huge energy consumption in the network and possibly network latency.
This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. We proposed several techniques to speed-up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy have been proposed. Moreover, the motion estimation process of the encoder has been optimized with the use of decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder have been solved by using machine-learning algorithms. Thanks to their great performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications
Recommended from our members
3D multiple description coding for error resilience over wireless networks
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Mobile communications has gained a growing interest from both customers and service providers alike in the last 1-2 decades. Visual information is used in many application domains such as remote health care, video âon demand, broadcasting, video surveillance etc. In order to enhance the visual effects of digital video content, the depth perception needs to be provided with the actual visual content. 3D video has earned a significant interest from the research community in recent years, due to the tremendous impact it leaves on viewers and its enhancement of the userâs quality of experience (QoE). In the near future, 3D video is likely to be used in most video applications, as it offers a greater sense of immersion and perceptual experience. When 3D video is compressed and transmitted over error prone channels, the associated packet loss leads to visual quality degradation. When a picture is lost or corrupted so severely that the concealment result is not acceptable, the receiver typically pauses video playback and waits for the next INTRA picture to resume decoding. Error propagation caused by employing predictive coding may degrade the video quality severely. There are several ways used to mitigate the effects of such transmission errors. One widely used technique in International Video Coding Standards is error resilience.
The motivation behind this research work is that, existing schemes for 2D colour video compression such as MPEG, JPEG and H.263 cannot be applied to 3D video content. 3D video signals contain depth as well as colour information and are bandwidth demanding, as they require the transmission of multiple high-bandwidth 3D video streams. On the other hand, the capacity of wireless channels is limited and wireless links are prone to various types of errors caused by noise, interference, fading, handoff, error burst and network congestion. Given the maximum bit rate budget to represent the 3D scene, optimal bit-rate allocation between texture and depth information rendering distortion/losses should be minimised. To mitigate the effect of these errors on the perceptual 3D video quality, error resilience video coding needs to be investigated further to offer better quality of experience (QoE) to end users.
This research work aims at enhancing the error resilience capability of compressed 3D video, when transmitted over mobile channels, using Multiple Description Coding (MDC) in order to improve better userâs quality of experience (QoE).
Furthermore, this thesis examines the sensitivity of the human visual system (HVS) when employed to view 3D video scenes. The approach used in this study is to use subjective testing in order to rate peopleâs perception of 3D video under error free and error prone conditions through the use of a carefully designed bespoke questionnaire.Petroleum Technology Development Fund (PTDF