4,347 research outputs found
Motion Scalability for Video Coding with Flexible Spatio-Temporal Decompositions
PhDThe research presented in this thesis aims to extend the scalability range of the
wavelet-based video coding systems in order to achieve fully scalable coding with a
wide range of available decoding points. Since the temporal redundancy regularly
comprises the main portion of the global video sequence redundancy, the techniques
that can be generally termed motion decorrelation techniques have a central role in
the overall compression performance. For this reason the scalable motion modelling
and coding are of utmost importance, and specifically, in this thesis possible
solutions are identified and analysed.
The main contributions of the presented research are grouped into two
interrelated and complementary topics. Firstly a flexible motion model with rateoptimised
estimation technique is introduced. The proposed motion model is based
on tree structures and allows high adaptability needed for layered motion coding. The
flexible structure for motion compensation allows for optimisation at different stages
of the adaptive spatio-temporal decomposition, which is crucial for scalable coding
that targets decoding on different resolutions. By utilising an adaptive choice of
wavelet filterbank, the model enables high compression based on efficient mode
selection. Secondly, solutions for scalable motion modelling and coding are
developed. These solutions are based on precision limiting of motion vectors and
creation of a layered motion structure that describes hierarchically coded motion.
The solution based on precision limiting relies on layered bit-plane coding of motion
vector values. The second solution builds on recently established techniques that
impose scalability on a motion structure. The new approach is based on two major
improvements: the evaluation of distortion in temporal Subbands and motion search
in temporal subbands that finds the optimal motion vectors for layered motion
structure.
Exhaustive tests on the rate-distortion performance in demanding scalable video
coding scenarios show benefits of application of both developed flexible motion
model and various solutions for scalable motion coding
Online Reinforcement Learning for Dynamic Multimedia Systems
In our previous work, we proposed a systematic cross-layer framework for
dynamic multimedia systems, which allows each layer to make autonomous and
foresighted decisions that maximize the system's long-term performance, while
meeting the application's real-time delay constraints. The proposed solution
solved the cross-layer optimization offline, under the assumption that the
multimedia system's probabilistic dynamics were known a priori. In practice,
however, these dynamics are unknown a priori and therefore must be learned
online. In this paper, we address this problem by allowing the multimedia
system layers to learn, through repeated interactions with each other, to
autonomously optimize the system's long-term performance at run-time. We
propose two reinforcement learning algorithms for optimizing the system under
different design constraints: the first algorithm solves the cross-layer
optimization in a centralized manner, and the second solves it in a
decentralized manner. We analyze both algorithms in terms of their required
computation, memory, and inter-layer communication overheads. After noting that
the proposed reinforcement learning algorithms learn too slowly, we introduce a
complementary accelerated learning algorithm that exploits partial knowledge
about the system's dynamics in order to dramatically improve the system's
performance. In our experiments, we demonstrate that decentralized learning can
perform as well as centralized learning, while enabling the layers to act
autonomously. Additionally, we show that existing application-independent
reinforcement learning algorithms, and existing myopic learning algorithms
deployed in multimedia systems, perform significantly worse than our proposed
application-aware and foresighted learning methods.Comment: 35 pages, 11 figures, 10 table
Layer Selection in Progressive Transmission of Motion-Compensated JPEG2000 Video
MCJ2K (Motion-Compensated JPEG2000) is a video codec based on MCTF (Motion- Compensated Temporal Filtering) and J2K (JPEG2000). MCTF analyzes a sequence of images, generating a collection of temporal sub-bands, which are compressed with J2K. The R/D (Rate-Distortion) performance in MCJ2K is better than the MJ2K (Motion JPEG2000) extension, especially if there is a high level of temporal redundancy. MCJ2K codestreams can be served by standard JPIP (J2K Interactive Protocol) servers, thanks to the use of only J2K standard file formats. In bandwidth-constrained scenarios, an important issue in MCJ2K is determining the amount of data of each temporal sub-band that must be transmitted to maximize the quality of the reconstructions at the client side. To solve this problem, we have proposed two rate-allocation algorithms which provide reconstructions that are progressive in quality. The first, OSLA (Optimized Sub-band Layers Allocation), determines the best progression of quality layers, but is computationally expensive. The second, ESLA (Estimated-Slope sub-band Layers Allocation), is sub-optimal in most cases, but much faster and more convenient for real-time streaming scenarios. An experimental comparison shows that even when a straightforward motion compensation scheme is used, the R/D performance of MCJ2K competitive is compared not only to MJ2K, but also with respect to other standard scalable video codecs
Random Linear Network Coding for Wireless Layered Video Broadcast: General Design Methods for Adaptive Feedback-free Transmission
This paper studies the problem of broadcasting layered video streams over
heterogeneous single-hop wireless networks using feedback-free random linear
network coding (RLNC). We combine RLNC with unequal error protection (UEP) and
our main purpose is twofold. First, to systematically investigate the benefits
of UEP+RLNC layered approach in servicing users with different reception
capabilities. Second, to study the effect of not using feedback, by comparing
feedback-free schemes with idealistic full-feedback schemes. To these ends, we
study `expected percentage of decoded frames' as a key content-independent
performance metric and propose a general framework for calculation of this
metric, which can highlight the effect of key system, video and channel
parameters. We study the effect of number of layers and propose a scheme that
selects the optimum number of layers adaptively to achieve the highest
performance. Assessing the proposed schemes with real H.264 test streams, the
trade-offs among the users' performances are discussed and the gain of adaptive
selection of number of layers to improve the trade-offs is shown. Furthermore,
it is observed that the performance gap between the proposed feedback-free
scheme and the idealistic scheme is very small and the adaptive selection of
number of video layers further closes the gap.Comment: 15 pages, 12 figures, 3 tables, Under 2nd round of review, IEEE
Transactions on Communication
- …