REGION-BASED ADAPTIVE DISTRIBUTED VIDEO CODING CODEC
The recently developed Distributed Video Coding (DVC) is well suited to
applications where conventional video coding is not feasible because of its
inherently high-complexity encoding. Examples include video surveillance using
wireless/wired video sensor networks and applications using mobile cameras. With
DVC, the complexity is shifted from the encoder to the decoder.
The practical realization of DVC is referred to as Wyner-Ziv (WZ) video coding,
in which an estimate of the original frame, called the "side information", is generated
using motion compensation at the decoder. Compression is achieved by sending only
the extra information needed to correct this estimate. An error-correcting
code is used under the assumption that the estimate is a noisy version of the original
frame, and the rate required is a certain number of parity bits. The side information is
assumed to have become available at the decoder through a virtual channel. Owing to
the limitations of the motion compensation method, the predicted frame (the side
information) is expected to have varying degrees of success. These limitations stem
from location-specific, non-stationary estimation noise. To cope with this, conventional
video coders such as MPEG use frame partitioning to allocate the optimum coder
to each partition and hence achieve better rate-distortion performance. The same
approach, however, has not been used in DVC because it increases the encoder complexity.
This work proposes partitioning each frame into a number of coding units
(regions), each of which is encoded differently. The partitioning is, however, done at
the decoder while generating the side information, and the region map is sent to the
encoder at very little rate penalty. The partitioning allows appropriate
DVC coding parameters (virtual channel, rate, and quantizer) to be allocated to each
region. The resulting region map is compressed using a quadtree algorithm and
communicated to the encoder via the feedback channel. Rate control in DVC is
performed by channel coding techniques (turbo codes, LDPC, etc.). The performance
of the channel code depends heavily on the accuracy of the virtual channel model that
describes the estimation error for each region. In this work, a turbo code is used and
an adaptive WZ DVC is designed in both the transform domain and the pixel domain.
Transform Domain Wyner-Ziv video coding (TDWZ) performs distinctly better than
the normal Pixel Domain Wyner-Ziv video coding (PDWZ), since it exploits
spatial redundancy during encoding. The performance evaluations show that the
proposed system is superior to existing distributed video coding solutions.
Although the proposed system requires extra bits representing the "region map" to be
transmitted, the rate gain is still noticeable, and the system outperforms state-of-the-art
frame-based DVC by 0.6-1.9 dB.
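The quadtree compression of the region map mentioned above can be illustrated with a minimal sketch. The abstract does not say which quadtree variant is used, so the code below assumes a square map whose side is a power of two, and the names `quadtree_encode` and `quadtree_decode` are hypothetical. A uniform block is emitted as a single leaf; a mixed block is split into four quadrants and encoded recursively.

```python
def quadtree_encode(region_map, x=0, y=0, size=None):
    """Encode a square label map as a flat symbol list.

    "S" marks a split into four quadrants; ("L", label) marks a uniform leaf.
    Assumes len(region_map) is a power of two (an assumption of this sketch).
    """
    if size is None:
        size = len(region_map)
    first = region_map[y][x]
    uniform = all(
        region_map[y + dy][x + dx] == first
        for dy in range(size)
        for dx in range(size)
    )
    if uniform:
        return [("L", first)]
    half = size // 2
    symbols = ["S"]
    # Quadrant order: top-left, top-right, bottom-left, bottom-right.
    for oy, ox in ((0, 0), (0, half), (half, 0), (half, half)):
        symbols += quadtree_encode(region_map, x + ox, y + oy, half)
    return symbols


def quadtree_decode(symbols, size):
    """Rebuild the label map from the symbol list produced by quadtree_encode."""
    out = [[None] * size for _ in range(size)]
    stream = iter(symbols)

    def fill(x, y, s):
        sym = next(stream)
        if sym == "S":
            h = s // 2
            for oy, ox in ((0, 0), (0, h), (h, 0), (h, h)):
                fill(x + ox, y + oy, h)
        else:
            label = sym[1]
            for dy in range(s):
                for dx in range(s):
                    out[y + dy][x + dx] = label

    fill(0, 0, size)
    return out


# Example: one region (label 1) in the top-left quadrant of a 4x4 map.
example_map = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
example_symbols = quadtree_encode(example_map)
```

Here the 16-entry map reduces to five symbols (one split and four leaves), which suggests why a map with large uniform regions can be sent over the feedback channel at a small rate cost.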
The role of the feedback channel (FC) is to adapt the bit rate to the changing
statistics between the side information and the frame to be encoded. In the
unidirectional scenario, the encoder must perform the rate control itself. To
estimate the rate correctly, the encoder must calculate typical side information. However,
the rate cannot be calculated exactly at the encoder; it can only be estimated. This
work also proposes a feedback-free region-based adaptive DVC solution in the pixel
domain that uses a machine learning approach to estimate the side information.
Although the performance evaluations show a rate penalty, it is acceptable
considering the simplicity of the proposed algorithm.
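The role of the feedback channel described above can be sketched as a decoder-driven request loop. This is only an illustration under stated assumptions: `try_decode` is a hypothetical callable standing in for the channel decoder (it returns True once the parity bits received so far are sufficient), and the chunk size is arbitrary. It is not the rate-control procedure of the thesis.

```python
def request_parity_until_decoded(try_decode, total_parity_bits, chunk_size):
    """Ask the encoder for parity bits in chunks until decoding succeeds.

    try_decode(n) is a hypothetical stand-in for the channel decoder: it
    reports whether n parity bits are enough to decode the current data.
    Returns (parity_bits_sent, success).
    """
    sent = 0
    while sent < total_parity_bits:
        # One feedback-channel request: the encoder releases one more chunk.
        sent = min(sent + chunk_size, total_parity_bits)
        if try_decode(sent):
            return sent, True
    # The buffer is exhausted without a successful decode.
    return sent, False


# Example with a toy decoder that needs at least 50 parity bits.
bits_used, ok = request_parity_until_decoded(lambda n: n >= 50, 128, 16)
```

In a feedback-free design this loop is unavailable, so the encoder has to guess the number of parity bits in advance, which is why the rate can only be estimated there.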
Improving the Rate-Distortion Performance in Distributed Video Coding
Distributed video coding is a coding paradigm that allows video frames to be encoded at a complexity substantially lower than that of conventional video coding schemes. This feature makes it suitable for some emerging applications such as wireless surveillance video and mobile camera phones. In distributed video coding, a subset of frames in the video sequence, known as the key frames, are encoded using a conventional intra-frame encoder, such as H.264/AVC in intra mode, and then transmitted to the decoder. The remaining frames, known as the Wyner-Ziv frames, are encoded based on the Wyner-Ziv principle using channel codes, such as LDPC codes. In transform-domain distributed video coding, each Wyner-Ziv frame undergoes a 4x4 block DCT and the resulting DCT coefficients are grouped into DCT bands. The bitplanes corresponding to each DCT band are encoded one after another by a channel encoder, for example an LDPCA encoder. The resulting error-correcting bits are retained in a buffer at the encoder and transmitted incrementally as needed by the decoder. At the decoder, the key frames are decoded first. The decoded key frames are then used to generate a side information frame as an initial estimate of the corresponding Wyner-Ziv frame, usually by employing an interpolation method. The difference between a DCT band in the side information frame and the corresponding band in the Wyner-Ziv frame, referred to as the correlation noise, is often modeled by a Laplacian distribution. Soft-input information for each bit in the bitplane is obtained using this correlation noise model and the corresponding DCT band of the side information frame. The channel decoder then uses this soft-input information, along with some error-correcting bits sent by the encoder, to decode the bitplanes of each DCT band in each of the Wyner-Ziv frames.
Hence, accurate estimation of the correlation noise model parameter(s) and generation of high-quality side information are required to obtain reliable soft-input information for the bitplanes in the decoder, which in turn leads to more efficient decoding. Consequently, fewer error-correcting bits need to be transmitted from the encoder to the decoder to decode the bitplanes, leading to better compression efficiency and rate-distortion performance.
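How a Laplacian correlation noise model turns side information into soft-input information can be sketched as follows. The sketch assumes a zero-mean Laplacian residual with density (alpha/2)·exp(-alpha·|x|), so that alpha = sqrt(2/variance), and it computes the log-likelihood ratio of a single bit that splits a value range at its midpoint. The function names and the sample-based estimate of alpha are assumptions for illustration; practical codecs estimate alpha at the decoder without access to the true residual.

```python
import math


def laplacian_alpha(residuals):
    """Estimate alpha from residual samples, assuming zero mean: var = 2 / alpha^2."""
    variance = sum(r * r for r in residuals) / len(residuals)
    if variance == 0:
        raise ValueError("residual variance is zero; alpha is undefined")
    return math.sqrt(2.0 / variance)


def laplacian_cdf(x, mean, alpha):
    """CDF of a Laplacian centered on `mean` with parameter `alpha`."""
    d = x - mean
    if d < 0:
        return 0.5 * math.exp(alpha * d)
    return 1.0 - 0.5 * math.exp(-alpha * d)


def bin_probability(lo, hi, side_info, alpha):
    """Probability that the original value lies in [lo, hi) given the side information."""
    return laplacian_cdf(hi, side_info, alpha) - laplacian_cdf(lo, side_info, alpha)


def msb_llr(side_info, alpha, lo, hi):
    """Log-likelihood ratio log(P(bit=0) / P(bit=1)) for the bit splitting [lo, hi) in half."""
    mid = (lo + hi) / 2.0
    p0 = bin_probability(lo, mid, side_info, alpha)
    p1 = bin_probability(mid, hi, side_info, alpha)
    eps = 1e-12  # guards against log(0) when one half is essentially impossible
    return math.log((p0 + eps) / (p1 + eps))
```

A larger alpha (a narrower noise distribution) yields larger LLR magnitudes, so the channel decoder starts from more confident inputs and typically needs fewer error-correcting bits. An alpha that is too optimistic, however, makes wrong side information look certain, which is why the accuracy of the parameter estimate matters.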
The correlation noise is not stationary: its statistics vary within each Wyner-Ziv frame and within its corresponding DCT bands. Hence, it is difficult to find an accurate model for the correlation noise and to estimate its parameters precisely at the decoder. Moreover, in existing schemes the parameters of the correlation noise for each DCT band are estimated before the decoder starts to decode the bitplanes of that band, and they are kept unchanged during the decoding of the bitplanes. Another concern is that, since the side information frame is generated in the decoder by temporal interpolation between previously decoded frames, the quality of the side information frames is generally poor when the motion between the frames is non-linear. Hence, generating high-quality side information is a challenging problem.
This thesis is concerned with the accurate estimation of the correlation noise model parameters and with improving the quality of the side information, with the aim of improving the rate-distortion performance of distributed video coding.
A new scheme is proposed for the estimation of the correlation noise parameters, wherein the decoder simultaneously decodes all the bitplanes of a DCT band in a Wyner-Ziv frame and then iteratively refines the parameters of the correlation noise model of the band. This process is carried out on an augmented factor graph using a new recursive message passing algorithm, with the side information generated once and kept unchanged during the decoding of the Wyner-Ziv frame. Extensive simulations show that the proposed decoder leads to improved rate-distortion performance in comparison to the original DISCOVER codec and to another DVC codec employing side information frame refinement, particularly for video sequences with high motion content.
In the second part of this work, a new algorithm for the generation of the side information is proposed, which refines the initial side information frame using the additional information obtained after decoding the previous DCT bands of a Wyner-Ziv frame. Simulations demonstrate that the proposed algorithm performs better than schemes employing other side information refinement mechanisms. Finally, it is shown that incorporating the proposed side information refinement algorithm into the decoder proposed in the first part of the thesis leads to a further improvement in the rate-distortion performance of the DVC codec.
Intelligent Side Information Generation in Distributed Video Coding
Distributed video coding (DVC) reverses the traditional coding paradigm of complex encoders allied with basic decoding to one where the computational cost is largely incurred by the decoder. This is attractive because the proven theoretical work of Wyner-Ziv (WZ) and Slepian-Wolf (SW) shows that the performance of such a system should be exactly the same as that of a conventional coder. Despite the solid theoretical foundations, current DVC qualitative and quantitative performance falls short of existing conventional coders, and crucial limitations remain. A key constraint governing DVC performance is the quality of the side information (SI), a coarse representation of the original video frames, which are not available at the decoder. Techniques to generate SI have usually been based on linear motion compensated temporal interpolation (LMCTI), though these do not always produce satisfactory SI quality, especially in sequences exhibiting non-linear motion.
This thesis presents an intelligent higher order piecewise trajectory temporal interpolation (HOPTTI) framework for SI generation, with original contributions that afford better SI quality in comparison to existing LMCTI-based approaches. The major elements in this framework are: (i) a cubic trajectory interpolation algorithm model that significantly improves the accuracy of motion vector estimations; (ii) an adaptive overlapped block motion compensation (AOBMC) model which reduces both blocking and overlapping artefacts in the SI emanating from the block matching algorithm; (iii) the development of an empirical mode switching algorithm; and (iv) an intelligent switching mechanism to construct SI by automatically selecting the best macroblock from the intermediate SI generated by the HOPTTI and AOBMC algorithms. Rigorous analysis and evaluation confirm that significant quantitative and perceptual improvements in SI quality are achieved with the new framework.
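Why a higher-order trajectory can help with non-linear motion can be seen in a small one-dimensional sketch. This is not the HOPTTI algorithm itself, whose details the abstract does not give. It is a generic comparison, assuming a block position is known at four equally spaced frame times (t = -1, 0, 1, 2) and must be estimated at the midpoint t = 0.5. The cubic estimate uses the standard four-point Lagrange weights (-1, 9, 9, -1)/16; the function names are illustrative.

```python
def linear_midpoint(p_prev, p_next):
    """Linear trajectory: the midpoint is the average of the two neighbouring positions."""
    return 0.5 * (p_prev + p_next)


def cubic_midpoint(p_m1, p_0, p_1, p_2):
    """Cubic trajectory through positions at t = -1, 0, 1, 2, evaluated at t = 0.5."""
    return (-p_m1 + 9.0 * p_0 + 9.0 * p_1 - p_2) / 16.0


# Accelerating motion p(t) = t^2 sampled at t = -1, 0, 1, 2.
positions = [t * t for t in (-1, 0, 1, 2)]
linear_estimate = linear_midpoint(positions[1], positions[2])
cubic_estimate = cubic_midpoint(*positions)
true_position = 0.5 ** 2
```

For this accelerating motion the linear estimate is 0.5 while the true position is 0.25, which the cubic estimate reproduces exactly. For constant-velocity motion both estimates coincide, so the higher-order model only changes the result when the motion is non-linear.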
Wireless multimedia sensor networks, security and key management
Wireless Multimedia Sensor Networks (WMSNs) have emerged and shifted the focus from typical scalar wireless sensor networks to networks with multimedia devices that are capable of retrieving video, audio, and images, as well as scalar sensor data. WMSNs are able to deliver multimedia content due to the availability of inexpensive CMOS cameras and microphones, coupled with significant progress in distributed signal processing and multimedia source coding techniques.
The characteristics, challenges, and requirements of designing WMSNs open many research issues and future research directions for developing protocols, algorithms, architectures, devices, and testbeds that maximize the network lifetime while satisfying the quality of service requirements of the various applications. In this dissertation, we outline the design challenges of WMSNs and give a comprehensive discussion of the proposed architectures and protocols for the different layers of the communication protocol stack for WMSNs, along with their open research issues. We also compare the existing WMSN hardware and testbeds based on their specifications and features, and classify them by functionality and capability. In addition, we introduce our classification of content security and contextual privacy in WSNs. Having surveyed WMSNs and event privacy in sensor networks, and having gained the necessary experience in programming sensor motes such as MICAz and Stargate and in running simulations using NS2, we focus on the following: designing suitable protocols that meet the challenging requirements of WMSNs, targeting especially the routing and MAC layers; securing the wireless exchange of data against external attacks using key management and secure routing; defending the network from internal attacks using a lightweight intrusion detection technique; protecting contextual information from being leaked to unauthorized parties by adapting an event unobservability scheme; and evaluating the performance efficiency and energy consumption of employing the security algorithms over WMSNs.
Side information enhancement using an adaptive hash-based genetic algorithm in a Wyner-Ziv context
Side information construction in Wyner-Ziv video coding is a delicate task that strongly influences the final rate-distortion performance of the scheme. The side information is usually generated by interpolating the previous and next images. Some zones of a scene, however, such as occlusions, cannot be estimated from other frames. In this paper we propose to avoid this problem by sending some hash information for these unpredictable zones of the image. The resulting algorithm is described and tested here. The results show the advantages of using localized hash information for the high-error zones in distributed video coding. ©2010 IEEE
Proceedings of the 7th Sound and Music Computing Conference
Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010