Robust and efficient video/image transmission
The Internet has become a primary medium for information transmission. The unreliability of channel conditions, limited channel bandwidth and explosive growth of information transmission requests, however, hinder its further development. Hence, research on robust and efficient delivery of video/image content is in high demand.
Three aspects of this task, namely error burst correction, efficient rate allocation and random error protection, are investigated in this dissertation. A novel technique, called successive packing, is proposed for combating multi-dimensional (M-D) bursts of errors. A new concept of basis interleaving array is introduced. By combining different basis arrays, effective M-D interleaving can be realized. It has been shown that this algorithm needs to be implemented only once and yet is optimal for a set of error bursts having different sizes for a given two-dimensional (2-D) array.
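As a rough illustration of why interleaving helps against error bursts, the sketch below uses an ordinary row/column block interleaver. It is not the successive packing technique named in the abstract, whose details are not given here; the function names and the 3 x 4 array size are assumptions chosen for the example.

```python
def interleave(symbols, rows, cols):
    """Write symbols row by row into a rows x cols array, read column by column."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Invert interleave(): put each received symbol back at its row-major position."""
    assert len(symbols) == rows * cols
    out = [None] * (rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = symbols[i]
            i += 1
    return out


def rows_hit_by_burst(start, length, rows):
    """Rows of the original array touched by a burst in the interleaved stream."""
    return [(start + k) % rows for k in range(length)]


# Hypothetical example: 3 rows (think: 3 codewords) of 4 symbols each.
data = list(range(12))
sent = interleave(data, rows=3, cols=4)
restored = deinterleave(sent, rows=3, cols=4)
burst_rows = rows_hit_by_burst(start=3, length=3, rows=3)
```

With this layout, any burst of at most `rows` consecutive symbols in the transmitted stream touches each original row at most once, so a per-row code only has to correct a single error.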
To adapt to variable channel conditions, a novel rate allocation technique is proposed for Fine Granular Scalability (FGS) coded video, in which rate-distortion modeling based on real data is developed, a constant quality constraint is adopted and a sliding window approach is proposed to track the changing channel conditions. By using the proposed technique, constant quality is realized among frames by solving a set of linear equations. Thus, significant computational simplification is achieved compared with state-of-the-art techniques. A reduction of the overall distortion is obtained at the same time. To combat random errors during transmission, an unequal error protection (UEP) method and a robust error-concealment strategy are proposed for scalable coded video bitstreams.
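The idea of reaching constant quality by solving linear equations can be sketched as follows. The per-frame model D_i = a_i - b_i * R_i is an assumption made purely for illustration (the abstract does not state the model), as are the function name and the sample numbers. Requiring the same distortion D in every frame of a window and a fixed total rate gives a closed-form solution.

```python
def allocate_constant_quality(a, b, total_rate):
    """Split total_rate over frames so every frame reaches the same distortion.

    Assumed (hypothetical) model per frame i: D_i = a[i] - b[i] * R_i, b[i] > 0.
    Setting D_i = D for all i and sum(R_i) = total_rate yields
        D = (sum(a_i / b_i) - total_rate) / sum(1 / b_i)
        R_i = (a_i - D) / b_i
    No clipping is applied, so a rate can come out negative for extreme inputs.
    """
    inv_slope_sum = sum(1.0 / bi for bi in b)
    d = (sum(ai / bi for ai, bi in zip(a, b)) - total_rate) / inv_slope_sum
    rates = [(ai - d) / bi for ai, bi in zip(a, b)]
    return d, rates


# Hypothetical two-frame window with a budget of 9 rate units.
a = [10.0, 20.0]
b = [1.0, 2.0]
distortion, rates = allocate_constant_quality(a, b, total_rate=9.0)
per_frame_distortion = [ai - bi * ri for ai, bi, ri in zip(a, b, rates)]
```

A sliding window would simply re-run this allocation on the next group of frames with the rate budget that the channel currently allows.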
Layered Wyner-Ziv video coding: a new approach to video compression and delivery
Following recent theoretical works on successive Wyner-Ziv coding, we propose a practical layered Wyner-Ziv video coder using the DCT, nested scalar quantization, and irregular LDPC code based Slepian-Wolf coding (or lossless source coding with side information at the decoder). Our main novelty is to use the base layer of a standard scalable video coder (e.g., MPEG-4/H.26L FGS or H.263+) as the decoder side information and perform layered Wyner-Ziv coding for quality enhancement. Similar to FGS coding, there is no performance difference between layered and monolithic Wyner-Ziv coding when the enhancement bitstream is generated in our proposed coder. Using an H.26L coded version as the base layer, experiments indicate that Wyner-Ziv coding gives slightly worse performance than FGS coding when the channel (for both the base and enhancement layers) is noiseless. However, when the channel is noisy, extensive simulations of video transmission over wireless networks conforming to the CDMA2000 1X standard show that H.26L base layer coding plus Wyner-Ziv enhancement layer coding is more robust against channel errors than H.26L FGS coding. These results demonstrate that layered Wyner-Ziv video coding is a promising new technique for video streaming over wireless networks.
For scalable video transmission over the Internet and 3G wireless networks, we propose a system for receiver-driven layered multicast based on layered Wyner-Ziv video coding and digital fountain coding. Digital fountain codes are near-capacity erasure codes that are ideally suited for multicast applications because of their rateless property. By combining an error-resilient Wyner-Ziv video coder and rateless fountain codes, our system allows reliable multicast of high-quality video to an arbitrary number of heterogeneous receivers without the requirement of feedback channels. Extending this work on separate source-channel coding, we consider distributed joint source-channel coding by using a single channel code for both video compression (via Slepian-Wolf coding) and packet loss protection. We choose Raptor codes, the best approximation to a digital fountain, and address in detail both encoder and decoder designs. Simulation results show that, compared to one separate design using Slepian-Wolf compression plus erasure protection and another based on FGS coding plus erasure protection, the proposed joint design provides better video quality at the same number of transmitted packets.
MP3D: Highly Scalable Video Coding Scheme Based on Matching Pursuit
This paper describes a novel video coding scheme based on a three-dimensional Matching Pursuit algorithm. In addition to good compression performance at low bit rate, the proposed coder allows for flexible spatial, temporal and rate scalability thanks to its progressive coding structure. The Matching Pursuit algorithm generates a sparse decomposition of a video sequence in a series of spatio-temporal atoms, taken from an overcomplete dictionary of three-dimensional basis functions. The dictionary is generated by shifting, scaling and rotating two different mother atoms in order to cover the whole frequency cube. An embedded stream is then produced from the series of atoms. The atoms are first distributed into sets through the set-partitioned position map algorithm (SPPM) to form the index-map, inspired by bit plane encoding. Scalar quantization is then applied to the coefficients, which are finally arithmetic coded. A complete MP3D codec has been implemented, and its performance is shown to compare favorably with other scalable coders like MPEG-4 FGS and SPIHT-3D. In addition, the MP3D streams offer incomparable flexibility for multiresolution streaming or adaptive decoding.
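The greedy decomposition at the heart of Matching Pursuit can be sketched in a few lines. The example below works on a short 1-D vector with a tiny hand-made dictionary rather than on 3-D spatio-temporal atoms, and it omits the SPPM, quantization and arithmetic coding stages; the function names and the toy dictionary are assumptions for illustration only.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse decomposition over a dictionary of unit-norm atoms.

    At each step, pick the atom with the largest absolute inner product
    with the current residual, record (atom index, coefficient), and
    subtract that atom's contribution from the residual.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_atoms):
        best_k, best_c = max(
            ((k, dot(residual, atom)) for k, atom in enumerate(dictionary)),
            key=lambda kc: abs(kc[1]),
        )
        picks.append((best_k, best_c))
        atom = dictionary[best_k]
        residual = [r - best_c * g for r, g in zip(residual, atom)]
    return picks, residual


# Toy overcomplete dictionary in R^2: the two axes plus one diagonal atom.
s = 1.0 / math.sqrt(2.0)
dictionary = [[1.0, 0.0], [0.0, 1.0], [s, s]]
picks, residual = matching_pursuit([3.0, 3.0], dictionary, n_atoms=1)
```

Because the atoms are emitted in order of decreasing contribution, truncating the list of picks gives a coarser reconstruction, which is what makes an embedded, rate-scalable stream possible.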
Efficient algorithms for scalable video coding
A scalable video bitstream specifically designed for the needs of various client terminals,
network conditions, and user demands is much desired in current and future video transmission
and storage systems. The scalable extension of the H.264/AVC standard (SVC) has
been developed to satisfy the new challenges posed by heterogeneous environments, as
it permits a single video stream to be decoded fully or partially with variable quality, resolution,
and frame rate in order to adapt to a specific application. This thesis presents
novel improved algorithms for SVC, including: 1) a fast inter-frame and inter-layer coding
mode selection algorithm based on motion activity; 2) a hierarchical fast mode selection
algorithm; 3) a two-part Rate Distortion (RD) model targeting the properties of different
prediction modes for the SVC rate control scheme; and 4) an optimised Mean Absolute
Difference (MAD) prediction model.
The proposed fast inter-frame and inter-layer mode selection algorithm is based on the
empirical observation that a macroblock (MB) with slow movement is more likely to be
best matched by one in the same resolution layer. However, for a macroblock with fast
movement, motion estimation between layers is required. Simulation results show that
the algorithm can reduce the encoding time by up to 40%, with negligible degradation in
RD performance.
The proposed hierarchical fast mode selection scheme comprises four levels and makes
full use of inter-layer, temporal and spatial correlation as well as the texture information of
each macroblock. Overall, the new technique demonstrates the same coding performance
in terms of picture quality and compression ratio as that of the SVC standard, yet produces
a saving in encoding time of up to 84%. Compared with state-of-the-art SVC fast mode
selection algorithms, the proposed algorithm achieves a superior computational time reduction
under very similar RD performance conditions.
The existing SVC rate distortion model cannot accurately represent the RD properties of
the prediction modes, because it is influenced by the use of inter-layer prediction. A separate
RD model for inter-layer prediction coding in the enhancement layer(s) is therefore
introduced. Overall, the proposed algorithms improve the average PSNR by up to 0.34dB
or produce an average saving in bit rate of up to 7.78%. Furthermore, the control accuracy
is maintained to within 0.07% on average.
As a MAD prediction error always exists and cannot be avoided, an optimised MAD prediction
model for the spatial enhancement layers is proposed that considers the MAD from
previous temporal frames and previous spatial frames together, to achieve a more accurate MAD prediction.
Simulation results indicate that the proposed MAD prediction model
reduces the MAD prediction error by up to 79% compared with the JVT-W043 implementation.
Layered Wyner-Ziv video coding for noisy channels
The growing popularity of video sensor networks and video cellular phones has generated the need for low-complexity and power-efficient multimedia systems that can handle multiple video input and output streams. While standard video coding techniques fail to satisfy these requirements, distributed source coding is a promising technique for "uplink" applications. Wyner-Ziv coding refers to lossy source coding with side information at the decoder. Based on recent theoretical results on successive Wyner-Ziv coding, we propose in this thesis a practical layered Wyner-Ziv video codec using the DCT, a nested scalar quantizer (NSQ), and irregular LDPC code based Slepian-Wolf coding (or lossless source coding with side information) for noiseless channels. The DCT is applied as an approximation to the conditional KLT, which makes the components of the transformed block conditionally independent given the side information. NSQ is a binning scheme that facilitates layered bit-plane coding of the bin indices while reducing the bit rate. LDPC code based Slepian-Wolf coding exploits the correlation between the quantized version of the source and the side information to achieve further compression. Unlike previous work, an attractive feature of our proposed system is that video encoding is done only once but decoding is allowed at many lower bit rates without quality loss. For Wyner-Ziv coding over discrete noisy channels, we present a Wyner-Ziv video codec using IRA codes for Slepian-Wolf coding based on the idea of two equivalent channels. For video streaming applications where the channel is packet based, we apply an unequal error protection scheme to the embedded Wyner-Ziv coded video stream to find the optimal source-channel coding trade-off for a target transmission rate over a packet erasure channel.
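The binning idea behind nested scalar quantization can be illustrated with a minimal scalar sketch. It covers only the coset (bin) index and the decoder's use of side information; the DCT, bit-plane layering and LDPC based Slepian-Wolf stages of the codec are left out. The function names, step size and number of cosets are assumptions for illustration.

```python
def nsq_encode(x, step, n_cosets):
    """Quantize x uniformly and keep only the coset (bin) index.

    Sending k mod n_cosets instead of the full index k saves rate, at the
    price of ambiguity that the decoder resolves with side information.
    """
    k = round(x / step)
    return k % n_cosets


def nsq_decode(coset, y, step, n_cosets):
    """Return the reconstruction level in the given coset closest to y.

    y is the decoder's side information, assumed to be correlated with x.
    Candidate indices are coset + n_cosets * m for integer m.
    """
    m = round((y / step - coset) / n_cosets)
    k = coset + n_cosets * m
    return k * step


# Hypothetical source sample and two side-information values.
coset = nsq_encode(7.3, step=1.0, n_cosets=4)
good = nsq_decode(coset, y=6.6, step=1.0, n_cosets=4)   # side info close to x
bad = nsq_decode(coset, y=4.4, step=1.0, n_cosets=4)    # side info too far from x
```

Decoding is correct as long as the side information lies within about half the coset spacing (`n_cosets * step / 2`) of the source; when it does not, the decoder picks the wrong member of the bin, as in the second call.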
DYNAMIC RESOURCE ALLOCATION FOR MULTIUSER VIDEO STREAMING
With the advancement of video compression technology and wide deployment of wired/wireless networks, there is an increasing demand for multiuser video communication services. A multiuser video transmission system should consider not only the reconstructed video quality at the individual-user level but also the service objectives among all users at the network level. There are many design challenges to support multiuser video communication services, such as fading channels, limited radio resources of wireless networks, heterogeneity of video content complexity, delay and decoding dependency constraints of video bitstreams, and mixed integer optimization. To overcome these challenges, a general strategy is to dynamically allocate resources according to the changing environments and requirements, so as to improve the overall system performance and ensure quality of service (QoS) for each user.
In this dissertation, we address the aforementioned design challenges from a resource-allocation point of view and from two aspects of system and algorithm design, namely, a cross-layer design that jointly optimizes resource utilization from the physical layer to the application layer, and multiuser diversity that explores the source and channel heterogeneity among different users. We also address the impact of a dynamic environment on the system over time and consider the time-heterogeneity of video sources and the time-varying characteristics of channel conditions. To achieve the desired service objectives, a general resource allocation framework is formulated in terms of constrained optimization problems to dynamically allocate resources and control the quality of multiple video bitstreams.
Based on the design methodology of multiuser cross-layer optimization, we propose several systems to efficiently transmit multiple video streams, encoded by current and emerging video codecs, over major types of wireless networks such as 3G cellular systems, Wireless Local Area Networks, 4G cellular systems, and future Wireless Metropolitan Area Networks. Owing to the integer nature of some system parameters, the formulated optimization problems are often integer or mixed integer programming problems and require heavy computation to find the optimal solutions. Fast algorithms are proposed to provide real-time services. We demonstrate the advantages of dynamic and joint resource allocation for multiple video sources compared to static strategies. We also show the improvement gained by exploring diversity in frequency, time, and transmission path, and the benefits of multiuser cross-layer optimization.