67 research outputs found

    Motion correlation based low complexity and low power schemes for video codec

    System: New ; Report number: Kō No. 3750 ; Degree type: Doctor of Engineering ; Date conferred: 2012/11/19 ; Waseda University degree certificate number: New 6121 ; Waseda Universit

    Computational Complexity Optimization on H.264 Scalable/Multiview Video Coding

    The H.264/MPEG-4 Advanced Video Coding (AVC) standard is more efficient and flexible than previous video coding standards. The high efficiency is achieved by utilizing a comprehensive full search motion estimation method. Although the H.264 standard improves the visual quality at low bitrates, it enormously increases the computational complexity. The research described in this thesis focuses on optimization of the computational complexity of H.264 scalable and multiview video coding. Nowadays, video application areas range from multimedia messaging and mobile to high definition television, and they use different types of transmission systems. The Scalable Video Coding (SVC) extension of the H.264/AVC standard is able to scale the video stream in order to adapt to a variety of devices with different capabilities. Furthermore, a rate control scheme is utilized to improve the visual quality under the constraints of capability and channel bandwidth. However, the computational complexity is increased. A simplified rate control scheme is proposed to reduce the computational complexity. In the proposed scheme, the quantisation parameter can be computed directly instead of using the exhaustive Rate-Quantisation model. The linear Mean Absolute Distortion (MAD) prediction model is used to predict the scene change, and the quantisation parameter will be increased directly by a threshold when the scene changes abruptly; otherwise, the comprehensive Rate-Quantisation model will be used. Results show that the optimized rate control scheme is effective at saving time. Multiview Video Coding (MVC) is effective at reducing the huge amount of data in multiple-view video coding. The inter-view reference frames from the adjacent views are exploited for prediction in addition to the temporal prediction. However, due to the increase in the number of reference frames, the computational complexity is also increased. In order to manage the reference frames efficiently, a phase correlation algorithm is utilized to remove the inefficient inter-view reference frame from the reference list. The dependency between the inter-view reference frame and the current frame is decided based on the phase correlation coefficients. If the inter-view reference frame is highly related to the current frame, it is still enabled in the reference list; otherwise, it will be disabled. The experimental results show that the proposed scheme saves time without loss in visual quality or increase in bitrate. The proposed optimization algorithms are effective in reducing the computational complexity of the H.264/AVC extensions. The low computational complexity algorithm is useful in the design of future video coding standards, especially on low power handheld devices.
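The reference-frame pruning idea in this abstract can be illustrated with a minimal sketch. It assumes NumPy, equally sized grayscale frames, and a decision threshold of 0.3; the function names and the threshold are hypothetical and are not taken from the thesis. The peak of the phase correlation surface is used here as a stand-in for the "phase correlation coefficients" the abstract mentions.

```python
import numpy as np


def phase_correlation_peak(current, reference, eps=1e-12):
    """Peak of the phase correlation surface between two equally sized frames."""
    f_cur = np.fft.fft2(current)
    f_ref = np.fft.fft2(reference)
    cross = f_cur * np.conj(f_ref)
    cross /= np.abs(cross) + eps          # discard magnitude, keep phase only
    surface = np.real(np.fft.ifft2(cross))
    return float(surface.max())


def keep_inter_view_reference(current, reference, threshold=0.3):
    """Keep the inter-view frame in the reference list only if it correlates
    strongly with the current frame (threshold is an assumed value)."""
    return phase_correlation_peak(current, reference) >= threshold
```

A frame that is a shifted copy of the current frame produces a peak close to 1, while an unrelated frame produces a much lower peak and would be dropped from the list under this assumed rule.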

    Recent Advances in Region-of-interest Video Coding


    H.264 Motion Estimation and Applications


    Privacy region protection for H.264/AVC with enhanced scrambling effect and a low bitrate overhead

    While video surveillance systems have become ubiquitous in our daily lives, they have introduced concerns over privacy invasion. Recent research to address these privacy issues includes a focus on privacy region protection, whereby existing video scrambling techniques are applied to specific regions of interest (ROI) in a video while the background is left unchanged. Most previous work in this area has only focussed on encrypting the sign bits of nonzero coefficients in the privacy region, which produces a relatively weak scrambling effect. In this paper, to enhance the scrambling effect for privacy protection, it is proposed to encrypt the intra prediction modes (IPM) in addition to the sign bits of nonzero coefficients (SNC) within the privacy region. A major issue with utilising encryption of IPM is that drift error is introduced outside the region of interest. Therefore, a re-encoding method, which is integrated with the encryption of IPM, is also proposed to remove drift error. Compared with a previous technique that uses encryption of IPM, the proposed re-encoding method offers savings in the bitrate overhead while completely removing the drift error. Experimental results and analysis based on H.264/AVC were carried out to verify the effectiveness of the proposed methods. In addition, a spiral binary mask mechanism is proposed that can reduce the bitrate overhead incurred by flagging the position of the privacy region. A definition of the syntax structure for the spiral binary mask is given. As a result of the proposed techniques, the privacy regions in a video sequence can be effectively protected by the enhanced scrambling effect with no drift error and a lower bitrate overhead.

    Surveillance centric coding

    PhD. The research work presented in this thesis focuses on the development of techniques specific to surveillance videos for efficient video compression with higher processing speed. The Scalable Video Coding (SVC) techniques are explored to achieve higher compression efficiency. The framework of SVC is modified to support Surveillance Centric Coding (SCC). Motion estimation techniques specific to surveillance videos are proposed in order to speed up the compression process of the SCC. The main contributions of the research work presented in this thesis are divided into two groups: (i) Efficient Compression and (ii) Efficient Motion Estimation. The paradigm of Surveillance Centric Coding (SCC) is introduced, in which coding aims to achieve bit-rate optimisation and adaptation of surveillance videos for storing and transmission purposes. In the proposed approach the SCC encoder communicates with the Video Content Analysis (VCA) module that detects events of interest in video captured by the CCTV. Bit-rate optimisation and adaptation are achieved by exploiting the scalability properties of the employed codec. Time segments containing events relevant to the surveillance application are encoded using high spatio-temporal resolution and quality, while the portions that are irrelevant from the surveillance standpoint are encoded at low spatio-temporal resolution and/or quality. Thanks to the scalability of the resulting compressed bit-stream, additional bit-rate adaptation is possible, for instance for transmission purposes. Experimental evaluation showed that significant reduction in bit-rate can be achieved by the proposed approach without loss of information relevant to surveillance applications. In addition to a more optimal compression strategy, novel approaches to performing efficient motion estimation specific to surveillance videos are proposed and implemented with experimental results. A real-time background subtractor is used to detect the presence of any motion activity in the sequence. Different approaches for selective motion estimation, GOP based, Frame based and Block based, are implemented. In the GOP based approach, motion estimation is performed for the whole group of pictures (GOP) only when a moving object is detected in any frame of the GOP. In the Frame based approach, each frame is tested for motion activity and consequently for selective motion estimation. The selective motion estimation approach is further explored at a lower level as Block based selective motion estimation. Experimental evaluation showed that significant reduction in computational complexity can be achieved by applying the proposed strategy. In addition to selective motion estimation, a tracker based motion estimation and fast full search using multiple reference frames have been proposed for the surveillance videos. Extensive testing on different surveillance videos shows the benefits of applying the proposed approaches to achieve the goals of the SCC.
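The difference between the GOP based and Frame based selective motion estimation decisions can be sketched as follows. This is a simplified illustration only: the background subtractor is replaced by plain per-pixel differencing against a fixed background frame, and the function names and both thresholds are hypothetical values chosen for the example, not parameters from the thesis.

```python
def has_motion(frame, background, pixel_threshold=20, min_changed_pixels=4):
    """Crude stand-in for a background subtractor: count pixels that differ
    from the background by more than pixel_threshold."""
    changed = sum(
        1
        for row_f, row_b in zip(frame, background)
        for p, q in zip(row_f, row_b)
        if abs(p - q) > pixel_threshold
    )
    return changed >= min_changed_pixels


def gop_needs_motion_estimation(gop_frames, background):
    """GOP based: run motion estimation for the whole GOP if any frame moves."""
    return any(has_motion(f, background) for f in gop_frames)


def frames_needing_motion_estimation(gop_frames, background):
    """Frame based: decide separately for every frame of the GOP."""
    return [has_motion(f, background) for f in gop_frames]
```

The GOP based rule makes one coarse decision per group, while the Frame based rule can skip motion estimation for individual static frames inside an otherwise active GOP.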

    Modified inter prediction H.264 video encoding for maritime surveillance

    Video compression has evolved since it was first standardized. The most popular codec, H.264, can compress video effectively according to the quality that is required. This is due to the motion estimation (ME) process, which has impressive features such as variable block sizes ranging from 4×4 to 16×16 and quarter pixel motion compensation. However, the disadvantage of H.264 is that it is very complex and impractical for hardware implementation. Many efforts have been made to produce low complexity encoding by compromising on the bitrate and decoded quality. Two notable methods are Fast Search Mode and Early Termination. In the Early Termination concept, the encoder does not have to perform ME on every macroblock for every block size. If certain criteria are reached, the process can be terminated and the Mode Decision can select the best block size much faster. This project proposes using background subtraction to maximize the benefit of the Early Termination process. When recording with a static camera, the background remains the same for a long period of time, so most macroblocks will produce minimum residual. Thus in this thesis, the ME process for a background macroblock is terminated much earlier using the maximum 16×16 macroblock size. The accuracy of the background segmentation for the maritime surveillance video case study is 88.43% and the true foreground rate is 41.74%. The proposed encoder reduces the encoding time by 73.5% and the encoder complexity by 80.5%. The bitrate of the output is also reduced, in the range of 20%, compared to the H.264 baseline encoder. The results show that the proposed method achieves the objectives of improving the compression rate and the encoding time.
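The early termination rule described in this abstract can be sketched in a few lines. The sketch assumes that a background/foreground label is already available for each macroblock and that a cost function for a given partition size is supplied by the caller; `choose_partition` and `cost_of` are hypothetical names used only for illustration.

```python
# Inter partition sizes supported by H.264, largest first.
BLOCK_SIZES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def choose_partition(is_background, cost_of):
    """Return (best_size, number_of_sizes_evaluated).

    Background macroblocks terminate early and only evaluate 16x16;
    foreground macroblocks evaluate every partition size.
    cost_of is a caller-supplied callable mapping a size tuple to a cost.
    """
    candidates = BLOCK_SIZES[:1] if is_background else BLOCK_SIZES
    costs = {size: cost_of(size) for size in candidates}
    best = min(costs, key=costs.get)
    return best, len(costs)
```

Under this assumed rule a background macroblock costs one evaluation instead of seven, which is where the reported time saving would come from in a static-camera scene.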

    A new video quality metric for compressed video.

    Video compression makes possible multimedia applications such as mobile video messaging and streaming, video conferencing and, more recently, online social video interactions. Since most multimedia applications are meant for the human observer, measuring perceived video quality during the designing and testing of these applications is important. Performance of existing perceptual video quality measurement techniques is limited due to poor correlation with subjective quality and implementation complexity. Therefore, this thesis presents new techniques for measuring perceived quality of compressed multimedia video using computationally simple and efficient algorithms. A new full reference perceptual video quality metric called the MOSp metric for measuring subjective quality of multimedia video sequences compressed using block-based video coding algorithms is developed. The metric predicts subjective quality of compressed video using the mean squared error between original and compressed sequences, and video content. Factors which influence the visibility of compression-induced distortion, such as spatial texture masking, temporal masking and cognition, are considered for quantifying video content. The MOSp metric is simple to implement and can be integrated into block-based video coding algorithms for real time quality estimations. Performance results presented for a variety of multimedia content compressed to a large range of bitrates show that the metric has high correlation with subjective quality and performs better than popular video quality metrics. As an application of the MOSp metric to perceptual video coding, a new MOSp-based mode selection algorithm for an H.264/AVC video encoder is developed. Results show that, by integrating the MOSp metric into the mode selection process, it is possible to make coding decisions based on estimated visual quality rather than mathematical error measures and to achieve visual quality gain in content that is identified as visually important by the MOSp metric. The novel algorithms developed in this research work are particularly useful for integrating into block based video encoders such as the H.264/AVC standard for making real time visual quality estimations and coding decisions based on estimated visual quality rather than the currently used mathematical error measures.

    Detection and representation of moving objects for video surveillance

    In this dissertation two new approaches have been introduced for the automatic detection of moving objects (such as people and vehicles) in video surveillance sequences. The first technique analyses the original video and exploits spatial and temporal information to find those pixels in the images that correspond to moving objects. The second technique analyses video sequences that have been encoded according to a recent video coding standard (H.264/AVC). As such, only the compressed features are analyzed to find moving objects. The latter technique results in a very fast and accurate detection (up to 20 times faster than the related work). Lastly, we investigated how different XML-based metadata standards can be used to represent information about these moving objects. We proposed the usage of Semantic Web Technologies to combine information described according to different metadata standards

    Reconfigurable Architecture For H.264/avc Variable Block Size Motion Estimation Based On Motion Activity And Adaptive Search Range

    The Motion Estimation (ME) technique plays a key role in video coding systems, achieving high compression ratios by removing temporal redundancies among video frames. Especially in the newest H.264/AVC video coding standard, the ME engine demands a large amount of computational capability due to its support for a wide range of block sizes for a given macroblock in order to increase accuracy in finding the best matching block in the previous frames. We propose a scalable architecture for H.264/AVC Variable Block Size (VBS) Motion Estimation with adaptive computing capability to support various search ranges, input video resolutions, and frame rates. The hardware architecture of the proposed ME consists of scalable Sum of Absolute Difference (SAD) arrays which can perform the Full Search Block Matching Algorithm (FSBMA) for smaller 4x4 blocks. It is also shown that by predicting motion activity and adaptively adjusting the Search Range (SR) on the reconfigurable hardware platform, the computational cost of ME required for inter-frame encoding in the H.264/AVC video coding standard can be reduced significantly. Dynamic Partial Reconfiguration is a unique feature of Field Programmable Gate Arrays (FPGAs) that makes best use of hardware resources and power by allowing adaptive algorithms to be implemented during run-time. We exploit this feature of the FPGA to implement the proposed reconfigurable architecture of ME and maximize the architectural benefits through prediction of motion activities in the video sequences, adaptation of SR during run-time, and fractional ME refinement. The implemented ME architecture can support real time applications at a maximum frequency of 90 MHz with multiple reconfigurable regions. When compared to reconfiguration of the complete design, the partial reconfiguration process results in a smaller bitstream size, which allows the FPGA to implement different configurations at higher speed. The proposed architecture has a modular structure, regular data flow, and efficient memory organization with fewer memory accesses. By increasing the number of active partial reconfigurable modules from one to four, there is a 4 fold increase in data re-use. Also, by introducing an adaptive SR reduction algorithm at frame level, the computational load of ME is reduced significantly with only small degradation in PSNR (≤0.1 dB).
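As a software illustration of the full search block matching that the SAD arrays perform, the following minimal sketch evaluates every candidate displacement inside a search range for one 4x4 block. It is a plain Python model written for this listing, not the hardware design from the abstract; `sad`, `full_search`, and their parameters are hypothetical names.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of the current frame at
    (bx, by) and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, bx, by, size=4, search_range=2):
    """Exhaustive search over displacements in [-search_range, search_range].

    Returns (best_sad, dx, dy). Candidates that fall outside the reference
    frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The number of SAD evaluations grows with the square of the search range, so shrinking the range for low-motion content cuts the work sharply, at the risk of missing the true displacement when the motion exceeds the reduced range.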