2,622 research outputs found
Efficient MRF Energy Propagation for Video Segmentation via Bilateral Filters
Segmentation of an object from a video is a challenging task in multimedia
applications. Depending on the application, automatic or interactive methods
are desired; however, regardless of the application type, efficient computation
of video object segmentation is crucial for time-critical applications;
specifically, mobile and interactive applications require near real-time
efficiencies. In this paper, we address the problem of video segmentation from
the perspective of efficiency. We initially redefine the problem of video
object segmentation as the propagation of MRF energies along the temporal
domain. For this purpose, a novel and efficient method is proposed to propagate
MRF energies throughout the frames via bilateral filters without using any
global texture, color or shape model. Recently presented bi-exponential filter
is utilized for efficiency, whereas a novel technique is also developed to
dynamically solve graph-cuts for varying, non-lattice graphs in general linear
filtering scenario. These improvements are experimented for both automatic and
interactive video segmentation scenarios. Moreover, in addition to the
efficiency, segmentation quality is also tested both quantitatively and
qualitatively. Indeed, for some challenging examples, significant time
efficiency is observed without loss of segmentation quality.Comment: Multimedia, IEEE Transactions on (Volume:16, Issue: 5, Aug. 2014
Distributed video coding for wireless video sensor networks: a review of the state-of-the-art architectures
Distributed video coding (DVC) is a relatively new video coding architecture originated from two fundamental theorems namely, Slepian–Wolf and Wyner–Ziv. Recent research developments have made DVC attractive for applications in the emerging domain of wireless video sensor networks (WVSNs). This paper reviews the state-of-the-art DVC architectures with a focus on understanding their opportunities and gaps in addressing the operational requirements and application needs of WVSNs
A parallel H.264/SVC encoder for high definition video conferencing
In this paper we present a video encoder specially developed and configured for high definition (HD) video conferencing. This video encoder brings together the following three requirements: H.264/Scalable Video Coding (SVC), parallel encoding on multicore platforms, and parallel-friendly rate control. With the first requirement, a minimum quality of service to every end-user receiver over Internet Protocol networks is guaranteed. With the second one, real-time execution is accomplished and, for this purpose, slice-level parallelism, for the main encoding loop, and block-level parallelism, for the upsampling and interpolation filtering processes, are combined. With the third one, a proper HD video content delivery under certain bit rate and end-to-end delay constraints is ensured. The experimental results prove that the proposed H.264/SVC video encoder is able to operate in real time over a wide range of target bit rates at the expense of reasonable losses in rate-distortion efficiency due to the frame partitioning into slices
Motion Scalability for Video Coding with Flexible Spatio-Temporal Decompositions
PhDThe research presented in this thesis aims to extend the scalability range of the
wavelet-based video coding systems in order to achieve fully scalable coding with a
wide range of available decoding points. Since the temporal redundancy regularly
comprises the main portion of the global video sequence redundancy, the techniques
that can be generally termed motion decorrelation techniques have a central role in
the overall compression performance. For this reason the scalable motion modelling
and coding are of utmost importance, and specifically, in this thesis possible
solutions are identified and analysed.
The main contributions of the presented research are grouped into two
interrelated and complementary topics. Firstly a flexible motion model with rateoptimised
estimation technique is introduced. The proposed motion model is based
on tree structures and allows high adaptability needed for layered motion coding. The
flexible structure for motion compensation allows for optimisation at different stages
of the adaptive spatio-temporal decomposition, which is crucial for scalable coding
that targets decoding on different resolutions. By utilising an adaptive choice of
wavelet filterbank, the model enables high compression based on efficient mode
selection. Secondly, solutions for scalable motion modelling and coding are
developed. These solutions are based on precision limiting of motion vectors and
creation of a layered motion structure that describes hierarchically coded motion.
The solution based on precision limiting relies on layered bit-plane coding of motion
vector values. The second solution builds on recently established techniques that
impose scalability on a motion structure. The new approach is based on two major
improvements: the evaluation of distortion in temporal Subbands and motion search
in temporal subbands that finds the optimal motion vectors for layered motion
structure.
Exhaustive tests on the rate-distortion performance in demanding scalable video
coding scenarios show benefits of application of both developed flexible motion
model and various solutions for scalable motion coding
Region-based representations of image and video: segmentation tools for multimedia services
This paper discusses region-based representations of image and video that are useful for multimedia services such as those supported by the MPEG-4 and MPEG-7 standards. Classical tools related to the generation of the region-based representations are discussed. After a description of the main processing steps and the corresponding choices in terms of feature spaces, decision spaces, and decision algorithms, the state of the art in segmentation is reviewed. Mainly tools useful in the context of the MPEG-4 and MPEG-7 standards are discussed. The review is structured around the strategies used by the algorithms (transition based or homogeneity based) and the decision spaces (spatial, spatio-temporal, and temporal). The second part of this paper proposes a partition tree representation of images and introduces a processing strategy that involves a similarity estimation step followed by a partition creation step. This strategy tries to find a compromise between what can be done in a systematic and universal way and what has to be application dependent. It is shown in particular how a single partition tree created with an extremely simple similarity feature can support a large number of segmentation applications: spatial segmentation, motion estimation, region-based coding, semantic object extraction, and region-based retrieval.Peer ReviewedPostprint (published version
Motion estimation and signaling techniques for 2D+t scalable video coding
We describe a fully scalable wavelet-based 2D+t (in-band) video coding architecture. We propose new coding tools specifically designed for this framework aimed at two goals: reduce the computational complexity at the encoder without sacrificing compression; improve the coding efficiency, especially at low bitrates. To this end, we focus our attention on motion estimation and motion vector encoding. We propose a fast motion estimation algorithm that works in the wavelet domain and exploits the geometrical properties of the wavelet subbands. We show that the computational complexity grows linearly with the size of the search window, yet approaching the performance of a full search strategy. We extend the proposed motion estimation algorithm to work with blocks of variable sizes, in order to better capture local motion characteristics, thus improving in terms of rate-distortion behavior. Given this motion field representation, we propose a motion vector coding algorithm that allows to adaptively scale the motion bit budget according to the target bitrate, improving the coding efficiency at low bitrates. Finally, we show how to optimally scale the motion field when the sequence is decoded at reduced spatial resolution. Experimental results illustrate the advantages of each individual coding tool presented in this paper. Based on these simulations, we define the best configuration of coding parameters and we compare the proposed codec with MC-EZBC, a widely used reference codec implementing the t+2D framework
- …