Multimedia communication over wireless networks has attracted considerable research interest in recent years. Limited network bandwidth and the requirement of real-time playback on one hand, and the severe impairments of wireless links on the other, represent the main challenge. A further issue is the time-varying nature of wireless links and network heterogeneity, which make the channels between the sender and the clients extremely diverse in their available bandwidths and packet loss ratios. These diverse transmission conditions and the scarcity of bandwidth call for efficient scalable multimedia compression; a robust scalable video coder is therefore needed. Although standard video coders (e.g., H.264) can offer high coding efficiency in scalable mode, they are very sensitive to packet loss, which results in error propagation. Motivated by potential applications in distributed sensor networks, video coding, and the compression of multi-spectral imagery, distributed source coding has recently seen a flurry of research activity.
Distributed video coding (DVC) has been proposed as a promising new technique because it adopts a coding concept completely different from that of conventional codecs: the complexity is shifted to the decoder, which has the task of exploiting, partly or wholly, the source statistics to achieve efficient compression. This change of paradigm moves the encoder-decoder complexity balance, allowing efficient compression solutions with simple encoders and complex decoders. The new paradigm is particularly suitable for emerging applications such as wireless video cameras, low-power wireless surveillance networks, disposable video cameras, certain medical applications, sensor networks, multi-view image acquisition, and networked camcorders, i.e., all those devices that require low-energy or low-power operation.
As mentioned above, Distributed Video Coding is a new video coding approach based on two major Information Theory results: the Slepian-Wolf and Wyner-Ziv theorems. The Slepian-Wolf theorem states that two correlated sources can be encoded separately and decoded jointly with vanishing error probability at a total rate equal to their joint entropy; the Wyner-Ziv theorem extends this result to lossy coding with side information available only at the decoder. The resulting compression efficiency is, in theory, comparable to that of conventional predictive coding systems.
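Stated formally (these are the standard formulations of the two theorems, recalled here for reference): two correlated sources X and Y, encoded separately at rates R_X and R_Y and decoded jointly, can be recovered with vanishing error probability if and only if

\[
R_X \ge H(X \mid Y), \qquad R_Y \ge H(Y \mid X), \qquad R_X + R_Y \ge H(X, Y),
\]

so the total rate can reach the joint entropy H(X, Y), exactly as if the two sources had been encoded jointly. For lossy coding of X when the side information Y is available only at the decoder, the Wyner-Ziv rate-distortion function is

\[
R_{\mathrm{WZ}}(D) \;=\; \min_{U,\, f}\; I(X; U \mid Y)
\quad \text{s.t.} \quad U \text{--} X \text{--} Y \text{ is a Markov chain and } \mathbb{E}\!\left[d\bigl(X, f(U, Y)\bigr)\right] \le D,
\]

which in general satisfies R_WZ(D) >= R_{X|Y}(D), with equality (no rate loss) for jointly Gaussian sources under mean-squared-error distortion.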
Although the theoretical foundations of distributed video coding were established in the 1970s, practical DVC schemes have been proposed only in recent years. A major reason behind these developments is the evolution of channel coding, in particular Turbo and Low-Density Parity-Check (LDPC) coding, which makes it possible to build the efficient channel codes on which DVC relies.
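The usual link between channel coding and compression is syndrome-based Slepian-Wolf coding: the encoder transmits only the syndrome of the source word, and the decoder recovers the word by treating the correlated side information as the output of a noisy "virtual channel". The following Python toy example is a minimal sketch of this idea; the 3x6 parity-check matrix H and the brute-force decoder are illustrative inventions, whereas practical DVC systems use long Turbo or LDPC codes with iterative decoding.

    import numpy as np

    # Toy syndrome-based Slepian-Wolf coder over GF(2). H is a hypothetical 3x6
    # parity-check matrix; practical DVC systems use long Turbo or LDPC codes.
    H = np.array([[1, 1, 0, 1, 0, 0],
                  [0, 1, 1, 0, 1, 0],
                  [1, 0, 1, 0, 0, 1]], dtype=np.uint8)

    def sw_encode(x):
        """Encoder: transmit only the 3-bit syndrome of the 6-bit source word x."""
        return H @ x % 2

    def sw_decode(syndrome, y):
        """Decoder: among all words with the received syndrome, pick the one
        closest (in Hamming distance) to the side information y. Brute force
        for clarity; real decoders run iterative message passing instead."""
        best, best_dist = None, 7
        for v in range(64):
            x = np.array([(v >> i) & 1 for i in range(6)], dtype=np.uint8)
            if np.array_equal(H @ x % 2, syndrome):
                dist = int(np.sum(x ^ y))
                if dist < best_dist:
                    best, best_dist = x, dist
        return best

    x = np.array([1, 0, 1, 1, 0, 0], dtype=np.uint8)  # source seen by the encoder
    y = np.array([1, 0, 1, 0, 0, 0], dtype=np.uint8)  # correlated side information
    assert np.array_equal(sw_decode(sw_encode(x), y), x)  # x recovered from 3 bits plus y

Here six source bits are conveyed with only three transmitted bits, because the decoder's side information y differs from x in at most one position and the underlying code corrects one error.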
The DVC approach can be very attractive when dealing with 3D video sources, both stereoscopic and multi-view sequences, because it allows the design of a simple encoder, shifting all the computational complexity to the decoder. In this way, the cameras do not need to communicate with one another: whereas in conventional codecs inter-view and intra-view prediction is performed at the encoder, here inter-view and intra-view data are exchanged at the decoder.
When dealing with stereoscopic sequences, it is important to take into account all the possible artifacts introduced during coding. To this end, this thesis investigates the stereoscopic artifacts and the video quality of a 3D distributed video coding system. DVC video quality is estimated by means of subjective and objective evaluations.
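As a concrete example of an objective evaluation, the sketch below computes PSNR between a reference and a decoded frame. The thesis does not name its objective metric at this point, so PSNR is an assumption; it is the metric most commonly paired with the rate-distortion evaluations mentioned later.

    import numpy as np

    def psnr(reference, decoded, peak=255.0):
        """Peak signal-to-noise ratio in dB between a reference and a decoded frame."""
        mse = np.mean((reference.astype(np.float64) - decoded.astype(np.float64)) ** 2)
        return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)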
Then, two different techniques for joint source-channel coding in distributed environments are introduced. The first is strictly related to distributed 3D video coding and is based on turbo codes. The second considers ad-hoc networks with mobile, distributed nodes that acquire multimedia content and exploit a joint source-channel coding system based on LT codes for channel protection and information relaying.
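For reference, an LT (Luby Transform) code produces each output symbol by sampling a degree d from a soliton distribution and XORing d randomly chosen source blocks, so any sufficiently large set of received symbols suffices for decoding. The following Python sketch covers only the encoder; the robust soliton parameters c and delta are illustrative defaults, and the decoder and the relaying logic of the proposed system are omitted.

    import math
    import random

    def robust_soliton(k, c=0.1, delta=0.5):
        """Robust soliton degree distribution over degrees 1..k (standard LT construction)."""
        s = c * math.log(k / delta) * math.sqrt(k)
        rho = [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
        tau = [0.0] * k
        pivot = max(1, min(k, int(round(k / s))))
        for d in range(1, pivot):
            tau[d - 1] = s / (k * d)
        tau[pivot - 1] = s * math.log(s / delta) / k
        z = sum(rho) + sum(tau)
        return [(r + t) / z for r, t in zip(rho, tau)]

    def lt_encode_symbol(blocks, dist, rng=random):
        """One LT-coded symbol: sample a degree, XOR that many distinct source blocks."""
        d = rng.choices(range(1, len(blocks) + 1), weights=dist)[0]
        idx = rng.sample(range(len(blocks)), d)
        sym = bytes(blocks[idx[0]])
        for i in idx[1:]:
            sym = bytes(a ^ b for a, b in zip(sym, blocks[i]))
        return idx, sym  # the decoder needs the neighbor indices (or the seed that generated them)

    # usage: k = 64 source blocks of 256 bytes each
    blocks = [bytes(random.randrange(256) for _ in range(256)) for _ in range(64)]
    dist = robust_soliton(len(blocks))
    neighbors, symbol = lt_encode_symbol(blocks, dist)

The rateless nature of the code is what makes it attractive for relaying: any node can keep generating fresh coded symbols without coordinating with the others.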
Then, a multi-view distributed video coding system based on Zernike moments is analyzed. Specifically, a new technique for fusing temporal and spatial side information in the Zernike-moments domain is proposed. The main goal of this work is to generate, at the decoder, side information that optimally blends temporal and inter-view data, since multi-view distributed coding performance strongly depends on the quality of the side information built at the decoder. To improve this quality, a spatial view compensation/prediction in the Zernike-moments domain is applied, and spatial and temporal motion activity are fused to obtain the overall side information. The proposed method is evaluated through its rate-distortion performance under different inter-view and temporal estimation quality conditions.
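To make the "Zernike-moments domain" concrete, the sketch below computes the complex Zernike moments of a square patch mapped onto the unit disk and blends two moment sets with a scalar weight. The fusion rule shown (a per-coefficient convex combination driven by a motion-activity weight w) is only a hypothetical stand-in for the fusion technique actually proposed in the thesis.

    import numpy as np
    from math import factorial, pi

    def radial_poly(rho, n, m):
        """Zernike radial polynomial R_{n,m}(rho) for m = |m| and n - m even."""
        R = np.zeros_like(rho)
        for s in range((n - m) // 2 + 1):
            c = ((-1) ** s * factorial(n - s)
                 / (factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s)))
            R += c * rho ** (n - 2 * s)
        return R

    def zernike_moments(patch, order):
        """Complex Zernike moments A_{n,m} of a square grayscale patch, up to a given order."""
        N = patch.shape[0]
        x = (2.0 * np.arange(N) - N + 1) / (N - 1)   # map pixel grid to [-1, 1]
        X, Y = np.meshgrid(x, x)
        rho, theta = np.hypot(X, Y), np.arctan2(Y, X)
        inside = rho <= 1.0                          # keep only the unit disk
        dA = (2.0 / (N - 1)) ** 2                    # area element of one pixel
        moments = {}
        for n in range(order + 1):
            for m in range(-n, n + 1):
                if (n - abs(m)) % 2:
                    continue                         # n - |m| must be even
                Vconj = radial_poly(rho, n, abs(m)) * np.exp(-1j * m * theta)
                moments[(n, m)] = (n + 1) / pi * np.sum(patch[inside] * Vconj[inside]) * dA
        return moments

    def fuse_side_information(mom_temporal, mom_spatial, w):
        """Hypothetical fusion: per-coefficient convex blend, w in [0, 1] from motion activity."""
        return {k: w * mom_temporal[k] + (1.0 - w) * mom_spatial[k] for k in mom_temporal}

Working on moment coefficients rather than raw pixels is what allows the two predictions to be combined in a rotation-robust feature space.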
Finally, image retrieval techniques for multimedia databases are reported. Two methods, based on Zernike moments and on the Laguerre-Gauss transform, are proposed and compared with the state of the art.