4,009 research outputs found
Scalable video transcoding for mobile communications
Mobile multimedia contents have been introduced in the market and their demand is growing every day due to the increasing number of mobile devices and the possibility to watch them at any moment in any place. These multimedia contents are delivered over different networks that are visualized in mobile terminals with heterogeneous characteristics. To ensure a continuous high quality it is desirable that this multimedia content can be adapted on-the-fly to the transmission constraints and the characteristics of the mobile devices. In general, video contents are compressed to save storage capacity and to reduce the bandwidth required for its transmission. Therefore, if these compressed video streams were compressed using scalable video coding schemes, they would be able to adapt to those heterogeneous networks and a wide range of terminals. Since the majority of the multimedia contents are compressed using H.264/AVC, they cannot benefit from that scalability. This paper proposes a technique to convert an H.264/AVC bitstream without scalability to a scalable bitstream with temporal scalability as part of a scalable video transcoder for mobile communications. The results show that when our technique is applied, the complexity is reduced by 98 % while maintaining coding efficiency
Motion Scalability for Video Coding with Flexible Spatio-Temporal Decompositions
PhDThe research presented in this thesis aims to extend the scalability range of the
wavelet-based video coding systems in order to achieve fully scalable coding with a
wide range of available decoding points. Since the temporal redundancy regularly
comprises the main portion of the global video sequence redundancy, the techniques
that can be generally termed motion decorrelation techniques have a central role in
the overall compression performance. For this reason the scalable motion modelling
and coding are of utmost importance, and specifically, in this thesis possible
solutions are identified and analysed.
The main contributions of the presented research are grouped into two
interrelated and complementary topics. Firstly a flexible motion model with rateoptimised
estimation technique is introduced. The proposed motion model is based
on tree structures and allows high adaptability needed for layered motion coding. The
flexible structure for motion compensation allows for optimisation at different stages
of the adaptive spatio-temporal decomposition, which is crucial for scalable coding
that targets decoding on different resolutions. By utilising an adaptive choice of
wavelet filterbank, the model enables high compression based on efficient mode
selection. Secondly, solutions for scalable motion modelling and coding are
developed. These solutions are based on precision limiting of motion vectors and
creation of a layered motion structure that describes hierarchically coded motion.
The solution based on precision limiting relies on layered bit-plane coding of motion
vector values. The second solution builds on recently established techniques that
impose scalability on a motion structure. The new approach is based on two major
improvements: the evaluation of distortion in temporal Subbands and motion search
in temporal subbands that finds the optimal motion vectors for layered motion
structure.
Exhaustive tests on the rate-distortion performance in demanding scalable video
coding scenarios show benefits of application of both developed flexible motion
model and various solutions for scalable motion coding
Selected topics in video coding and computer vision
Video applications ranging from multimedia communication to computer vision have been extensively studied in the past decades. However, the emergence of new applications continues to raise questions that are only partially answered by existing techniques. This thesis studies three selected topics related to video: intra prediction in block-based video coding, pedestrian detection and tracking in infrared imagery, and multi-view video alignment.;In the state-of-art video coding standard H.264/AVC, intra prediction is defined on the hierarchical quad-tree based block partitioning structure which fails to exploit the geometric constraint of edges. We propose a geometry-adaptive block partitioning structure and a new intra prediction algorithm named geometry-adaptive intra prediction (GAIP). A new texture prediction algorithm named geometry-adaptive intra displacement prediction (GAIDP) is also developed by extending the original intra displacement prediction (IDP) algorithm with the geometry-adaptive block partitions. Simulations on various test sequences demonstrate that intra coding performance of H.264/AVC can be significantly improved by incorporating the proposed geometry adaptive algorithms.;In recent years, due to the decreasing cost of thermal sensors, pedestrian detection and tracking in infrared imagery has become a topic of interest for night vision and all weather surveillance applications. We propose a novel approach for detecting and tracking pedestrians in infrared imagery based on a layered representation of infrared images. Pedestrians are detected from the foreground layer by a Principle Component Analysis (PCA) based scheme using the appearance cue. To facilitate the task of pedestrian tracking, we formulate the problem of shot segmentation and present a graph matching-based tracking algorithm. Simulations with both OSU Infrared Image Database and WVU Infrared Video Database are reported to demonstrate the accuracy and robustness of our algorithms.;Multi-view video alignment is a process to facilitate the fusion of non-synchronized multi-view video sequences for various applications including automatic video based surveillance and video metrology. In this thesis, we propose an accurate multi-view video alignment algorithm that iteratively aligns two sequences in space and time. To achieve an accurate sub-frame temporal alignment, we generalize the existing phase-correlation algorithm to 3-D case. We also present a novel method to obtain the ground-truth of the temporal alignment by using supplementary audio signals sampled at a much higher rate. The accuracy of our algorithm is verified by simulations using real-world sequences
- …