16 research outputs found

    Low-complexity wavelet-based scalable image & video coding for home-use surveillance

    We study scalable image and video coding for the surveillance of rooms and personal environments based on inexpensive cameras and portable devices. The scalability is achieved through a multi-level 2D dyadic wavelet decomposition featuring an accurate, low-cost integer wavelet implementation with lifting. As our primary contribution, we present a modification to the SPECK wavelet coefficient encoding algorithm that significantly improves the efficiency of an embedded system implementation. The modification consists of storing the significance of all quadtree nodes in a buffer, where each node comprises several coefficients. This buffer is then used to efficiently construct the code with minimal and direct memory access. Our approach allows efficient parallel implementation on multi-core computer systems and gives a substantial reduction in memory accesses and thus power consumption. We report experimental results showing an approximate gain factor of 1,000 in execution time compared to a straightforward SPECK implementation, when combined with code optimization on a common digital signal processor. This translates to 75 full-color 4CIF 4:2:0 encoding cycles per second, clearly demonstrating the real-time capabilities of the proposed modification.
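
    Below is a minimal sketch, in C, of the kind of per-node significance buffer the abstract describes: the maximum coefficient magnitude of every quadtree node is precomputed into a small pyramid, so each SPECK significance test becomes a single load and compare instead of a scan over coefficients. The function and variable names (build_sig_pyramid, node_significant) are illustrative assumptions, not the paper's actual interface.

        /* Sketch: per-node significance buffer for a SPECK-style coder.
         * levels[k] holds, for every quadtree node at pyramid level k, the
         * maximum absolute coefficient of that node, so testing a node
         * against threshold 1 << bitplane is one read and one compare. */

        static int max4(int a, int b, int c, int d)
        {
            int m = a > b ? a : b;
            int n = c > d ? c : d;
            return m > n ? m : n;
        }

        /* coeff: w x h absolute wavelet coefficients (w, h powers of two).
         * levels[k]: (w >> (k+1)) x (h >> (k+1)) buffer of per-node maxima. */
        void build_sig_pyramid(const int *coeff, int w, int h,
                               int **levels, int nlev)
        {
            const int *src = coeff;
            int sw = w, sh = h;
            for (int k = 0; k < nlev; k++) {
                int dw = sw / 2, dh = sh / 2;
                for (int y = 0; y < dh; y++)
                    for (int x = 0; x < dw; x++)
                        levels[k][y * dw + x] =
                            max4(src[(2*y)   * sw + 2*x], src[(2*y)   * sw + 2*x + 1],
                                 src[(2*y+1) * sw + 2*x], src[(2*y+1) * sw + 2*x + 1]);
                src = levels[k];
                sw = dw;
                sh = dh;
            }
        }

        /* Significance test for the node at level k, position (x, y), where
         * level_w = w >> (k+1): no coefficient scan, one direct memory access. */
        int node_significant(int *const *levels, int level_w,
                             int k, int x, int y, int bitplane)
        {
            return levels[k][y * level_w + x] >= (1 << bitplane);
        }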

    Temporal signal energy correction and low-complexity encoder feedback for lossy scalable video coding

    In this paper, we address two problems found in embedded implementations of Scalable Video Codecs (SVCs): the temporal signal energy distribution and frame-to-frame quality fluctuations. The unequal energy distribution between the low- and high-pass bands with integer-based wavelets leads to sub-optimal rate-distortion choices coupled with quantization-error accumulation. The second problem is the quality fluctuation between frames within a Group Of Pictures (GOP). To solve these two problems, we present two modifications to the SVC. The first modification applies a temporal energy correction to the lifting scheme of the temporal wavelet decomposition. By moving this energy correction to the leaves of the temporal tree, we save on memory size, bandwidth and computations, while reducing floating/fixed-point conversion errors. The second modification feeds the decoded first frame of the GOP (the temporal low-pass) back into the temporal coding chain. This first frame is decoded without entropy decoding and requires no modifications at the decoder. Experiments show that quality fluctuations within the GOP are significantly reduced, which in turn significantly improves the subjective visual quality. In addition, a small quality improvement is achieved on average.
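
    As a rough illustration of the first modification, the sketch below (in C) defers the temporal energy correction to the leaves of the decomposition tree: instead of rescaling the low- and high-pass frames at every lifting level, only the number of low-/high-pass stages a leaf subband has passed through is tracked, and one combined gain is applied at the end (e.g. when weighting distortion for rate allocation). The Haar-style sqrt(2) factors and all names are assumptions for illustration, not the exact correction used in the paper.

        #include <math.h>

        typedef struct {
            int low_stages;   /* times this leaf went through the low-pass branch */
            int high_stages;  /* 1 if the leaf is the high-pass output of a stage */
        } temporal_path;

        /* Combined energy-correction gain for one leaf of an unnormalized
         * Haar-style temporal lifting tree: each low-pass stage is short of
         * a factor sqrt(2), each high-pass stage carries an extra sqrt(2).
         * Applying this once at the leaf avoids per-level float/fixed-point
         * conversions. */
        double leaf_energy_gain(temporal_path p)
        {
            const double s = sqrt(2.0);
            return pow(s, (double)(p.low_stages - p.high_stages));
        }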

    Highly-Parallelized Motion Estimation for Scalable Video Coding

    In this paper, we discuss the design of a highly parallel motion estimator for real-time Scalable Video Coding (SVC). In an SVC, motion is commonly estimated bidirectionally and over various temporal distances. Current motion estimators are optimized for frame-by-frame estimation and are designed without serious implementation constraints. To support efficient embedded applications, we propose a Highly Parallel Predictive Search (HPPS) motion estimator that preserves accurate estimation performance. The motion estimation algorithm is optimized for processing on parallel cores and utilizes a novel recursive search strategy. This strategy is based on hierarchically increasing the temporal distance in the estimation algorithm while using the state of the previous hierarchical layer as an input. Due to the absence of local recursions in the algorithm, the proposed motion estimator has a constant computational load, regardless of video activity or temporal distance. We compared the proposed motion estimator to the well-known full search, ARPS3, 3DRS and EPZS motion estimators for the SVC case, and obtain a performance close to full search (within 0.2 dB), while outperforming the other algorithms in prediction quality.
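
    The sketch below (in C) shows the constant-load core of such a predictive search: every block evaluates a small, fixed candidate set by SAD, where one candidate would typically be the co-located vector from the previous hierarchical layer, scaled to the new temporal distance before the call. There is no local recursion, so the work per block is independent of video activity. All names are illustrative, not the HPPS implementation.

        #include <stdlib.h>
        #include <limits.h>

        typedef struct { int x, y; } mvec;

        /* SAD of one 16x16 block: cur points at the current block, ref at the
         * co-located position in a reference frame assumed to be padded so
         * that the displacement mv stays inside the buffer. */
        static int sad_16x16(const unsigned char *cur, const unsigned char *ref,
                             int stride, mvec mv)
        {
            int sad = 0;
            for (int y = 0; y < 16; y++)
                for (int x = 0; x < 16; x++)
                    sad += abs(cur[y * stride + x] -
                               ref[(y + mv.y) * stride + (x + mv.x)]);
            return sad;
        }

        /* Constant-size candidate evaluation for one block: the cost per block
         * does not depend on scene activity or on the temporal distance. */
        mvec best_candidate(const unsigned char *cur, const unsigned char *ref,
                            int stride, const mvec *candidates, int n_candidates)
        {
            mvec best = candidates[0];
            int best_sad = INT_MAX;
            for (int i = 0; i < n_candidates; i++) {
                int sad = sad_16x16(cur, ref, stride, candidates[i]);
                if (sad < best_sad) {
                    best_sad = sad;
                    best = candidates[i];
                }
            }
            return best;
        }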

    Performance vs. complexity in scalable video coding for embedded surveillance applications

    In this paper, we explore the complexity-performance trade-offs for camera surveillance applications. For this purpose, we propose a Scalable Video Codec (SVC) based on wavelet transforms, for which we have adopted a t+2D architecture. Complexity is adjusted by adapting the configuration of the lifting-based motion-compensated temporal filtering (MCTF). We discuss various configurations and arrive at an SVC with scalable complexity and performance, enabling embedded applications. The paper discusses the trade-off between coder complexity (e.g. the number of motion-compensation stages), compression efficiency and the end-to-end delay of the video coding chain. Our SVC has a lower complexity than H.264 SVC, while the quality at full resolution is close to H.264 SVC (within 1 dB for surveillance-type video at 4CIF, 60 Hz) and is sufficient at lower resolutions for our video surveillance application.
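
    A minimal sketch (in C) of one lifting-based MCTF stage in a t+2D scheme is given below, using a Haar-style predict/update pair with motion compensation abstracted into callbacks. The complexity scaling mentioned above corresponds to dropping such stages (or their update steps): each omitted stage removes one motion-compensation pass per frame pair. The callback signature and all names are assumptions for illustration.

        typedef short pel;

        /* Callback writing a motion-compensated version of src into dst,
         * aligned to the other frame of the pair; a real codec does block-
         * based compensation with the estimated (or inverted) vector field. */
        typedef void (*mc_fn)(const pel *src, pel *dst, int w, int h,
                              const void *motion);

        /* One temporal lifting stage on a frame pair (A, B):
         *   predict: H = B - MC(A)      -> high-pass (temporal detail)
         *   update : L = A + MC(H) / 2  -> low-pass  (temporally filtered) */
        void mctf_stage(const pel *A, const pel *B, pel *L, pel *H, pel *tmp,
                        int w, int h, mc_fn mc_fwd, mc_fn mc_inv,
                        const void *motion)
        {
            int n = w * h;

            mc_fwd(A, tmp, w, h, motion);          /* MC(A), aligned to B */
            for (int i = 0; i < n; i++)
                H[i] = (pel)(B[i] - tmp[i]);       /* predict step        */

            mc_inv(H, tmp, w, h, motion);          /* MC(H), aligned to A */
            for (int i = 0; i < n; i++)
                L[i] = (pel)(A[i] + tmp[i] / 2);   /* update step         */
        }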

    Enhanced prediction for motion estimation in scalable video coding

    In this paper, we present a temporal candidate generation scheme that can be applied to motion estimators in Scalable Video Codecs (SVCs). For bidirectional motion estimation, a test is usually made for each block to determine which motion-compensation direction is preferred: forward, bidirectional or backward. Instead of simply using the last computed motion vector field (backward or forward), which introduces an asymmetry in the estimation, we involve both vector fields to generate a single candidate field for a more stable and improved prediction. This field is generated with the aid of the mode-decision information of the codec. The single field of motion vector candidates serves two purposes: (1) it initializes the next recursion, and (2) it is the foundation for the succeeding scale in the scalable coding. We have implemented this improved candidate system for both the HPPS and EPZS motion estimators in a scalable video codec and found that it reduces the errors caused by occlusion of moving objects or image boundaries. For EPZS, only a small improvement is observed compared to the simple candidate scheme. For HPPS, however, the improvements are more significant: at individual levels, motion-compensation performance improves by up to 0.84 dB, and when implemented in the SVC, HPPS slightly outperforms EPZS.
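
    As an illustration, the sketch below (in C) merges the forward and backward vector fields into one candidate field using the per-block mode decision. The concrete merge rule shown (keep the vector of the chosen direction, mirror a backward-only vector, blend for bidirectional blocks) is an assumption for illustration and not necessarily the rule used in the paper; the point is that both fields and the mode information contribute to a single, symmetric candidate field.

        typedef struct { short x, y; } mv_t;
        typedef enum { MODE_FWD, MODE_BWD, MODE_BI } block_mode;

        /* Build one candidate vector per block from the forward field, the
         * backward field and the coder's mode decision for that block. */
        void merge_candidate_field(const mv_t *fwd, const mv_t *bwd,
                                   const block_mode *mode, mv_t *cand,
                                   int n_blocks)
        {
            for (int i = 0; i < n_blocks; i++) {
                switch (mode[i]) {
                case MODE_FWD:                       /* forward vector is reliable */
                    cand[i] = fwd[i];
                    break;
                case MODE_BWD:                       /* mirror the backward vector */
                    cand[i].x = (short)(-bwd[i].x);  /* so it points forward       */
                    cand[i].y = (short)(-bwd[i].y);
                    break;
                case MODE_BI:                        /* blend both directions      */
                default:
                    cand[i].x = (short)((fwd[i].x - bwd[i].x) / 2);
                    cand[i].y = (short)((fwd[i].y - bwd[i].y) / 2);
                    break;
                }
            }
        }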

    Real-time multi-level wavelet lifting scheme on a fixed-point dsp for JPEG 2000 and scalable video coding

    In this paper, we discuss the design and real-time implementation of a multi-level two-dimensional Discrete Wavelet Transform (2D-DWT). The wavelet transform uses the well-known 5/3 filter coefficients and is implemented using the lifting framework. The transform allows complexity-scalable solutions with different latencies for scalable video coding. We have extensively utilized SIMD (Single Instruction Multiple Data) and DMA (Direct Memory Access) techniques, where the proposed process of background DMA transfers is so effective that the ALUs are almost never starved for input data. The resulting implementation performs a 4-level transform at CCIR-601 broadcast resolution in 3.65 Mcycles, including memory stalls, on a DM642 DSP. At a clock rate of 600 MHz, this translates to more than 160 transforms per second, satisfying the performance requirements of a real-time image/video encoding system for, e.g., surveillance applications. © 2009 IEEE
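
    For reference, the reversible 5/3 lifting pass itself is small; the sketch below (in C) shows one 1-D forward pass with the two lifting steps and whole-sample symmetric boundary extension, matching the reversible 5/3 transform used in JPEG 2000. The multi-level 2D transform in the paper applies such a pass to rows and columns at every level; the SIMD and DMA structuring that gives the reported cycle counts is not shown here.

        /* One 1-D forward pass of the reversible 5/3 (LeGall) lifting scheme.
         * n must be even; an arithmetic right shift (floor division) is assumed.
         *   predict: d[i] = x[2i+1] - floor((x[2i] + x[2i+2]) / 2)
         *   update : s[i] = x[2i]   + floor((d[i-1] + d[i] + 2) / 4)          */
        void dwt53_forward_1d(const int *x, int *low, int *high, int n)
        {
            int half = n / 2;

            for (int i = 0; i < half; i++) {
                int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[2 * i];  /* mirror */
                high[i] = x[2 * i + 1] - ((x[2 * i] + right) >> 1);
            }

            for (int i = 0; i < half; i++) {
                int left = (i > 0) ? high[i - 1] : high[0];             /* mirror */
                low[i] = x[2 * i] + ((left + high[i] + 2) >> 2);
            }
        }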

    Tracking Rectangular Targets in Surveillance Videos with the GM-PHD Filter

    This paper describes the application of a Gaussian Mixture Probability Hypothesis Density (GM-PHD) filter for tracking objects in surveillance video. Clark et al. have proposed a point-based GM-PHD filter designed for track label consistency. However, this cannot be used for track consistency when using rectangles covering an object. The proposed solution modifies this filter to increase tracking performance when objects split and overlap. Results on synthetic and real data show that the number of false detections is slightly lower using the rectangle GM-PHD filter, for the same error distance. A further advantage is that split objects are better handled (10%-20% lower error distance) by the rectangle GM-PHD filter. We conclude that the overall performance of the proposed rectangle tracker is slightly better, but improvements in occlusion handling are required.
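
    To make the filter concrete, the sketch below (in C) shows only the prediction step of a GM-PHD filter over rectangle states [cx, cy, w, h, vx, vy]: each Gaussian component's weight is scaled by the survival probability and its mean and covariance are propagated through a constant-velocity model (random walk for the box dimensions). The state layout, the covariance structure and all constants are assumptions for illustration; the measurement update, birth components and the rectangle-specific handling of splits and overlaps from the paper are not shown.

        typedef struct {
            double w;                /* component weight                    */
            double cx, cy, bw, bh;   /* box centre, width, height           */
            double vx, vy;           /* centre velocity                     */
            double p_cx, p_cy;       /* position variances                  */
            double p_vx, p_vy;       /* velocity variances                  */
            double p_cxvx, p_cyvy;   /* position-velocity cross covariances */
            double p_bw, p_bh;       /* box size variances                  */
        } gm_component;

        /* Kalman prediction for one (position, velocity) pair with
         * F = [[1, dt], [0, 1]]:  P' = F P F^T + diag(q_pos, q_vel). */
        static void predict_cv(double *pp, double *pv, double *pc,
                               double dt, double q_pos, double q_vel)
        {
            double pp2 = *pp + 2.0 * dt * (*pc) + dt * dt * (*pv) + q_pos;
            double pc2 = *pc + dt * (*pv);
            double pv2 = *pv + q_vel;
            *pp = pp2; *pv = pv2; *pc = pc2;
        }

        /* GM-PHD prediction: scale every weight by the survival probability
         * and push each component through the motion model. */
        void gmphd_predict(gm_component *c, int n, double dt, double p_survival,
                           double q_pos, double q_vel, double q_size)
        {
            for (int i = 0; i < n; i++) {
                c[i].w  *= p_survival;
                c[i].cx += dt * c[i].vx;
                c[i].cy += dt * c[i].vy;
                predict_cv(&c[i].p_cx, &c[i].p_vx, &c[i].p_cxvx, dt, q_pos, q_vel);
                predict_cv(&c[i].p_cy, &c[i].p_vy, &c[i].p_cyvy, dt, q_pos, q_vel);
                c[i].p_bw += q_size;     /* random-walk box dimensions      */
                c[i].p_bh += q_size;
            }
        }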

    Real-time scalable video codec implementation for surveillance

    In this paper, we discuss the design and real-time implementation of a Scalable Video Codec (SVC) for surveillance applications. We present a complexity-scalable temporal wavelet transform and the implementation of a multi-level 2D 5/3 wavelet transform, using the lifting framework. We have employed SIMD (Single Instruction Multiple Data) and DMA (Direct Memory Access) techniques, where the proposed process of background DMA transfers is so effective that the ALUs are always supplied with input data. We have realized the execution of a 4-level transform at 4CIF (CCIR-601) broadcast resolution in 3.65 Mcycles, including memory stalls, on a TMS320DM642 DSP. At a clock rate of 600 MHz, this translates to more than 160 transforms per second. For our complete SVC, we achieve a frame rate of 12.5-15 fps, depending on scene activity. © 2009 IEEE
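
    The sketch below (in C) illustrates the ping-pong double-buffering pattern behind the background-DMA claim: while the CPU filters the strip held in one on-chip buffer, the next strip is transferred in the background, so the compute units do not wait on external memory. dma_start()/dma_wait(), filter_strip() and the strip size are placeholders for the platform's DMA driver (such as the EDMA controller on the TMS320DM642) and data layout, not real API names from the paper's implementation.

        #define STRIP_BYTES (720 * 16)   /* e.g. one 16-line luma strip at CCIR-601 */

        /* Placeholders for a platform DMA driver and the wavelet row filter. */
        extern void dma_start(void *dst, const void *src, int bytes);
        extern void dma_wait(void);
        extern void filter_strip(unsigned char *strip, int bytes);

        void process_frame(const unsigned char *frame, int n_strips)
        {
            static unsigned char ping[STRIP_BYTES];   /* on-chip buffers */
            static unsigned char pong[STRIP_BYTES];
            unsigned char *bufs[2] = { ping, pong };

            dma_start(bufs[0], frame, STRIP_BYTES);   /* prefetch strip 0 */
            for (int s = 0; s < n_strips; s++) {
                dma_wait();                           /* strip s is now on chip   */
                if (s + 1 < n_strips)                 /* start fetching strip s+1 */
                    dma_start(bufs[(s + 1) & 1],      /* in the background        */
                              frame + (s + 1) * STRIP_BYTES, STRIP_BYTES);
                filter_strip(bufs[s & 1], STRIP_BYTES);   /* overlaps with DMA    */
            }
        }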
