
    A new fast motion estimation algorithm using hexagonal subsampling pattern and multiple candidates search

    In this paper, we present a fast algorithm to reduce the computational complexity of block motion estimation. The reduction is obtained from the use of a new hexagonal subsampling pattern and the domain decimation method introduced by Cheng and Chan (see Proc. IEEE ICASSP, vol. 4, p. 2313, 1996). A multiple candidates search method is also introduced to improve the robustness of the algorithm. Computer simulation shows that the performance is very close to that of the full search.
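
    To make the subsampling idea concrete, the sketch below evaluates the SAD matching criterion on a fixed subset of block pixels. The staggered mask is only an illustrative stand-in; the paper's actual hexagonal pattern and domain decimation schedule are not reproduced here.

```python
import numpy as np

def subsampled_sad(cur_block, ref_block, mask):
    """SAD evaluated only at the pixels selected by a boolean mask.

    Using roughly 1/4 of the pixels cuts the per-candidate cost by ~4x;
    a hexagonal-style layout spreads the samples so every image row
    still contributes. The mask below is illustrative, not the paper's
    exact pattern.
    """
    diff = np.abs(cur_block.astype(np.int32) - ref_block.astype(np.int32))
    return int(diff[mask].sum())

# Illustrative staggered 16x16 sampling mask (64 of 256 pixels).
mask = np.zeros((16, 16), dtype=bool)
mask[0::2, 0::4] = True   # even rows: columns 0, 4, 8, 12
mask[1::2, 2::4] = True   # odd rows:  columns 2, 6, 10, 14
```

    The multiple candidates idea complements this: rather than committing to the single best position found on subsampled data, the few best candidates are kept and re-checked, which restores robustness at little extra cost.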

    A Motion Estimation based Algorithm for Encoding Time Reduction in HEVC

    High Efficiency Video Coding (HEVC) is a video compression standard that offers about 50% greater compression efficiency than the H.264 Advanced Video Coding (AVC) standard, at the expense of much higher encoding time. The encoding time must be reduced to satisfy the needs of real-time applications. This paper proposes the Multi-Level Resolution Vertical Subsampling (MLRVS) algorithm to reduce the encoding time. The vertical subsampling minimizes the number of Sum of Absolute Differences (SAD) computations during the motion estimation process. A complexity reduction algorithm is also used for fast coding of the quantized block coefficients using a flag decision. Two distinct search patterns are suggested: the New Cross Diamond Diamond (NCDD) and the New Cross Diamond Hexagonal (NCDH) search patterns, which reduce the time needed to locate the motion vectors. In this paper, the MLRVS algorithm with the NCDD and with the NCDH search pattern is simulated and analyzed separately. The results show that the encoding time is decreased by 55% with the MLRVS algorithm using the NCDD search pattern and by 56% using the NCDH search pattern, compared to HM16.5 with the Test Zone (TZ) search algorithm. These results are achieved with a slight increase in bit rate and negligible deterioration in output video quality.
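
    The core saving in MLRVS is that each candidate's SAD is evaluated on vertically subsampled rows. A minimal sketch of that idea, assuming a simple row skip (the multi-level resolution scheduling and the NCDD/NCDH pattern logic are not reproduced):

```python
import numpy as np

def vertically_subsampled_sad(cur_block, ref_block, step=2):
    """SAD over every `step`-th row only.

    Skipping alternate rows halves the number of absolute differences
    computed per candidate; MLRVS applies this across several
    resolution levels.
    """
    diff = np.abs(cur_block[::step].astype(np.int32)
                  - ref_block[::step].astype(np.int32))
    return int(diff.sum())
```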

    MPEG-4 Software Video Encoding

    A thesis submitted in fulfillment of the requirements of the degree of Doctor of Philosophy in the University of London.

    This thesis presents a software model that allows a parallel decomposition of the MPEG-4 video encoder onto shared memory architectures, in order to reduce its total video encoding time. Since a video sequence consists of video objects, each of which is likely to have different encoding requirements, the model incorporates a scheduler which (a) always selects the most appropriate video object for encoding and (b) employs a mechanism for dynamically allocating video objects onto the system processors, based on video object size information. Further spatial video object parallelism is exploited by applying the single program multiple data (SPMD) paradigm within the different modules of the MPEG-4 video encoder. Because not all macroblocks have the same processing requirements, the model also introduces a data partition scheme that generates tiles with identical processing requirements. Since macroblock data dependencies preclude data parallelism at the shape encoder, the model also introduces a new mechanism that allows parallelism using a circular pipeline macroblock technique. The encoding time depends partly on an encoder's computational complexity. This thesis also addresses the problem of motion estimation, as its complexity has a significant impact on the encoder's complexity. In particular, two fast motion estimation algorithms have been developed for the model which reduce the computational complexity significantly. The thesis includes experimental results on a four-processor shared memory platform, Origin200.
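
    The size-based dynamic allocation can be pictured as a greedy longest-first assignment of video objects to the least-loaded processor. The sketch below is one plausible reading of that mechanism, not the thesis's exact scheduling policy:

```python
import heapq

def assign_video_objects(object_sizes, num_procs):
    """Greedy longest-processing-time assignment: hand each video
    object, largest first, to the currently least-loaded processor."""
    # Min-heap of (accumulated load, processor id).
    loads = [(0, p) for p in range(num_procs)]
    heapq.heapify(loads)
    assignment = {}
    for obj, size in sorted(object_sizes.items(), key=lambda kv: -kv[1]):
        load, proc = heapq.heappop(loads)
        assignment[obj] = proc
        heapq.heappush(loads, (load + size, proc))
    return assignment

# e.g. assign_video_objects({"vo0": 396, "vo1": 99, "vo2": 150}, 4)
```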

    A survey on video compression fast block matching algorithms

    Video compression is the process of reducing the amount of data required to represent digital video while preserving an acceptable video quality. Recent studies on video compression have focused on multimedia transmission, videophones, teleconferencing, high definition television, CD-ROM storage, etc. The idea of compression techniques is to remove the redundant information that exists in the video sequences. Motion compensated predictive coding is the main coding tool for removing temporal redundancy of video sequences, and it typically accounts for 50–80% of video encoding complexity. This technique has been adopted by all of the existing international video coding standards. It assumes that the current frame can be locally modelled as a translation of the reference frames. The practical and widely used method to carry out motion compensated prediction is the block matching algorithm. In this method, video frames are divided into a set of non-overlapping macroblocks; each macroblock of the current frame is compared with the search area in the reference frame in order to find the best matching macroblock. This yields displacement vectors that describe the movement of the macroblocks from one location to another in the reference frame. Checking all these locations is called full search, which provides the best result. However, this algorithm suffers from long computational time, which necessitates improvement. Several fast block matching algorithms have been developed to reduce the computational complexity. This paper presents a survey of two classes of techniques: the first is the lossless block matching algorithm process, in which the computational time required to determine the matching macroblock is decreased while the resolution of the predicted frames is the same as for the full search; the second is the lossy block matching algorithm process, which reduces the computational complexity effectively but yields search results whose quality is not the same as for the full search.
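
    For concreteness, a minimal full search sketch with SAD as the matching criterion; the block size and search range are illustrative defaults:

```python
import numpy as np

def full_search(cur_frame, ref_frame, top, left, block=16, search=7):
    """Exhaustive block matching: test every displacement in a
    (2*search+1)^2 window and return the motion vector with minimum SAD."""
    cur = cur_frame[top:top + block, left:left + block].astype(np.int32)
    best_mv, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > ref_frame.shape[0] \
                    or x + block > ref_frame.shape[1]:
                continue  # candidate block falls outside the frame
            ref = ref_frame[y:y + block, x:x + block].astype(np.int32)
            sad = int(np.abs(cur - ref).sum())
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad
```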

    Scalable light field representation and coding

    This Thesis aims to advance the state of the art in light field representation and coding. In this context, proposals to improve functionalities like light field random access and scalability are also presented. As the light field representation constrains the coding approach to be used, several light field coding techniques are proposed and studied to exploit the inherent characteristics of the most popular types of light field representations, which are normally based on micro-images or sub-aperture images. To encode micro-images, two solutions are proposed, aiming to exploit the redundancy between neighboring micro-images using a high order prediction model, where the model parameters are explicitly transmitted or inferred at the decoder, respectively. In both cases, the proposed solutions are able to outperform low order prediction solutions. To encode sub-aperture images, an HEVC-based solution that exploits their inherent intra and inter redundancies is proposed. In this case, the light field image is encoded as a pseudo video sequence, where the scanning order is signaled, allowing the encoder and decoder to optimize the reference picture lists to improve coding efficiency. A novel hybrid light field representation coding approach is also proposed, exploiting the combined use of both micro-image and sub-aperture image representation types, instead of using each representation individually. In order to aid the fast deployment of light field technology, this Thesis also proposes scalable coding and representation approaches that enable adequate compatibility with legacy displays (e.g., 2D, stereoscopic or multiview) and with future light field displays, while maintaining high coding efficiency. Additionally, viewpoint random access, which improves light field navigation and reduces the decoding delay, is also enabled, with a flexible trade-off between coding efficiency and viewpoint random access.
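    The pseudo video construction can be illustrated with one possible scanning order. The serpentine scan below keeps consecutive "frames" at adjacent viewpoints, which favors inter prediction; it stands in for whichever signaled order the encoder actually selects.

```python
def serpentine_scan(rows, cols):
    """Order the (u, v) sub-aperture grid as a serpentine pseudo video:
    left-to-right on even rows, right-to-left on odd rows, so consecutive
    'frames' are always adjacent viewpoints."""
    order = []
    for u in range(rows):
        vs = range(cols) if u % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((u, v) for v in vs)
    return order

# serpentine_scan(2, 3) -> [(0,0), (0,1), (0,2), (1,2), (1,1), (1,0)]
```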

    ENHANCED COMPUTATION TIME FOR FAST BLOCK MATCHING ALGORITHM

    Video compression is the process of reducing the amount of data required to represent digital video while preserving an acceptable video quality. Recent studies on video compression have focused on multimedia transmission, videophones, teleconferencing, high definition television (HDTV), CD-ROM storage, etc. The idea of compression techniques is to remove the redundant information that exists in the video sequences. Motion compensated predictive coding is the main coding tool for removing temporal redundancy of video sequences, and it typically accounts for 50-80% of the video encoding complexity. This technique has been adopted by all of the existing international video coding standards. It assumes that the current frame can be locally modelled as a translation of the reference frames. The practical and widely used method to carry out motion compensated prediction is the block matching algorithm. In this method, video frames are divided into a set of non-overlapping macroblocks; each target macroblock of the current frame is compared with the search area in the reference frame in order to find the best matching macroblock. This yields displacement vectors that describe the movement of the macroblocks from one location to another in the reference frame. Checking all these locations is called full search, which provides the best result. However, this algorithm suffers from long computational time, which necessitates improvement. Several fast block matching algorithms have been developed to reduce the computational complexity. This thesis focuses on two classifications: the first is the lossless block matching algorithm process, in which the computational time required to determine the matching macroblock of the full search is decreased while the resolution of the predicted frames is the same as for the full search; the second is the lossy block matching algorithm process, which reduces the computational complexity effectively but yields search results whose quality is not the same as for the full search.
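
    As an example of the lossy class, the classic three-step search probes a coarse-to-fine pattern of candidates instead of the whole window. It is shown here as a well-known representative of the class, not as an algorithm proposed in this thesis:

```python
import numpy as np

def sad(cur, ref):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(cur.astype(np.int32) - ref.astype(np.int32)).sum())

def three_step_search(cur_frame, ref_frame, top, left, block=16):
    """Classic three-step search: probe the 8 neighbours of the current
    centre at step sizes 4, 2, 1, moving the centre to the minimum-SAD
    point after each step."""
    cur = cur_frame[top:top + block, left:left + block]
    h, w = ref_frame.shape
    center = (0, 0)
    best_sad = sad(cur, ref_frame[top:top + block, left:left + block])
    for step in (4, 2, 1):
        best = center
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                y, x = top + center[0] + dy, left + center[1] + dx
                if 0 <= y <= h - block and 0 <= x <= w - block:
                    s = sad(cur, ref_frame[y:y + block, x:x + block])
                    if s < best_sad:
                        best_sad, best = s, (center[0] + dy, center[1] + dx)
        center = best
    return center, best_sad
```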

    Local Binary Pattern Approach for Fast Block Based Motion Estimation

    With the rapid growth of video services on smartphones such as video conferencing, video telephony and WebTV, implementation of video compression on mobile terminals becomes extremely important. However, the low computation capability of mobile devices becomes a bottleneck, which calls for low complexity techniques for video coding. This work presents two sets of algorithms for reducing the complexity of motion estimation. Binary motion estimation techniques using one-bit and two-bit transforms reduce the computational complexity of the matching error criterion, but sometimes generate inaccurate motion vectors. The first set includes two neighborhood matching based algorithms which attempt to reduce computations to only a fraction of other methods. Simulation results demonstrate that the full search local binary pattern (FS-LBP) algorithm reconstructs visually more accurate frames than the full search algorithm (FSA). Its reduced complexity LBP (RC-LBP) version decreases computations significantly, to only a fraction of the other methods, while maintaining acceptable performance. The second set introduces an edge detection approach for partial distortion elimination based on binary patterns. Spiral partial distortion elimination (SpiralPDE), proposed in the literature, matches the pixel-to-pixel distortion in a predefined manner. Since the contribution of individual pixels to the distortion function differs, it is important to analyze and extract these cardinal pixels. The proposed algorithms are called lossless fast full search partial distortion elimination ME based on local binary patterns (PLBP) and lossy edge-detection pixel decimation based on local binary patterns (ELBP). PLBP reduces the matching complexity by matching the more contributable pixels early, identifying the most diverse pixels in a local neighborhood. ELBP captures the most representative pixels in a block, in order of contribution to the distortion function, by evaluating whether individual pixels belong to an edge or the background. Experimental results demonstrate a substantial reduction in the computational complexity of ELBP with only a marginal loss in prediction quality.
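
    The binary patterns these methods build on can be sketched as the standard 8-neighbour LBP; the ranking and early-termination logic that PLBP/ELBP layer on top of it are not reproduced here.

```python
import numpy as np

def lbp8(img, y, x):
    """8-neighbour local binary pattern at an interior pixel: each
    neighbour contributes a bit set when it is at least as bright as the
    centre. Patterns mixing 0s and 1s mark edge/texture pixels, the
    'diverse' pixels such methods try to match first."""
    c = img[y, x]
    nbrs = [img[y - 1, x - 1], img[y - 1, x], img[y - 1, x + 1],
            img[y, x + 1], img[y + 1, x + 1], img[y + 1, x],
            img[y + 1, x - 1], img[y, x - 1]]
    return sum((1 << i) for i, n in enumerate(nbrs) if n >= c)
```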

    Automatic human face detection in color images

    Automatic human face detection in digital images has been an active area of research over the past decade. Among its numerous applications, face detection plays a key role in face recognition systems for biometric personal identification, face tracking for intelligent human computer interfaces (HCI), and face segmentation for object-based video coding. Despite significant progress in the field in recent years, detecting human faces in unconstrained and complex images remains a challenging problem in computer vision. An automatic system that possesses a similar capability as the human vision system in detecting faces is still a far-reaching goal. This thesis focuses on the problem of detecting human faces in color images. Although many early face detection algorithms were designed to work on gray-scale images, strong evidence exists to suggest face detection can be done more efficiently by taking into account the color characteristics of the human face. In this thesis, we present a complete and systematic face detection algorithm that combines the strengths of both analytic and holistic approaches to face detection. The algorithm is developed to detect quasi-frontal faces in complex color images. This face class, which represents typical detection scenarios in most practical applications of face detection, covers a wide range of face poses including all in-plane rotations and some out-of-plane rotations. The algorithm is organized into a number of cascading stages, including skin region segmentation, face candidate selection, and face verification. In each of these stages, various visual cues are utilized to narrow the search space for faces. In this thesis, we present a comprehensive analysis of skin detection using color pixel classification, and the effects of factors such as the color space and color classification algorithm on segmentation performance. We also propose a novel and efficient face candidate selection technique that is based on color-based eye region detection and a geometric face model. This candidate selection technique eliminates the computation-intensive step of window scanning often employed in holistic face detection, and simplifies the task of detecting rotated faces. Besides various heuristic techniques for face candidate verification, we develop face/nonface classifiers based on the naive Bayesian model, and investigate three feature extraction schemes, namely intensity, projection on face subspace, and edge-based. Techniques for improving face/nonface classification are also proposed, including bootstrapping, classifier combination and using contextual information. On a test set of face and nonface patterns, the combination of three Bayesian classifiers has a correct detection rate of 98.6% at a false positive rate of 10%. Extensive testing results have shown that the proposed face detector achieves good performance in terms of both detection rate and alignment between the detected faces and the true faces. On a test set of 200 images containing 231 faces taken from the ECU face detection database, the proposed face detector has a correct detection rate of 90.04% and makes 10 false detections. We have found that the proposed face detector is more robust in detecting in-plane rotated faces, compared to existing face detectors.
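
    As a taste of the skin segmentation stage, one classic fixed-threshold rule from the skin detection literature (Chai and Ngan's CbCr box) is sketched below; the thesis itself compares several color spaces and classifiers, so this is an illustration rather than the thesis's chosen classifier.

```python
import numpy as np

def skin_mask_ycbcr(img_rgb):
    """Classify pixels as skin with a fixed CbCr box (Chai and Ngan's
    widely cited thresholds: 77 <= Cb <= 127, 133 <= Cr <= 173).
    Input: uint8 RGB image of shape (H, W, 3). Output: boolean mask."""
    rgb = img_rgb.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Full-range BT.601 RGB -> CbCr conversion.
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```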

    Multidimensional Wavelets and Computer Vision

    This report deals with the construction and the mathematical analysis of multidimensional nonseparable wavelets and their efficient application in computer vision. In the first part, the fundamental principles and ideas of multidimensional wavelet filter design, such as the question of the existence of good scaling matrices and sensible design criteria, are presented and extended in various directions. Afterwards, the analytical properties of these wavelets are investigated in some detail. It will turn out that they are especially well-suited to represent (discretized) data as well as large classes of operators in a sparse form - a property that directly yields efficient numerical algorithms. The final part of this work is dedicated to the application of the developed methods to the typical computer vision problems of nonlinear image regularization and the computation of optical flow in image sequences. It is demonstrated how the wavelet framework leads to stable and reliable results for these problems of generally ill-posed nature. Furthermore, all the algorithms are of order O(n), leading to fast processing.
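
    The starting point for such constructions is the refinement equation with a dilation matrix M in place of dyadic scaling; the quincunx matrix shown below is a standard nonseparable example, given here for orientation only (and up to normalization convention), not necessarily the report's specific design.

```latex
% Refinement (two-scale) equation with dilation matrix M:
\[
  \varphi(x) \;=\; \lvert \det M \rvert^{1/2}
  \sum_{k \in \mathbb{Z}^{d}} h_{k}\, \varphi(M x - k),
  \qquad
  M \;=\; \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
  \quad \lvert \det M \rvert = 2 .
\]
% With the quincunx M shown, each level needs only |det M| - 1 = 1
% wavelet, versus the three of the separable 2-D tensor construction.
```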