Representation and coding of 3D video data
Deliverable D4.1 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170). More precisely, it corresponds to deliverable D4.1 of the project.
Shape representation and coding of visual objects in multimedia applications: An overview
Emerging multimedia applications have created the need for new functionalities in digital communications. Whereas existing compression standards only deal with the audio-visual scene at a frame level, it is now necessary to handle individual objects separately, thus allowing scalable transmission as well as interactive scene recomposition by the receiver. The future MPEG-4 standard aims at providing compression tools addressing these functionalities. Unlike existing frame-based standards, the corresponding coding schemes need to encode shape information explicitly. This paper reviews existing solutions to the problem of shape representation and coding. Region and contour coding techniques are presented and their performance is discussed, considering coding efficiency and rate-distortion control capability, as well as flexibility to application requirements such as progressive transmission, low-delay coding, and error robustness.
Energy efficient enabling technologies for semantic video processing on mobile devices
Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object-based paradigm has many undeniable benefits, numerous technical challenges remain before such applications become pervasive, particularly on computationally constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery-powered mobile computing devices, the additional algorithmic complexity of semantic object-based processing compared to conventional video processing is highly undesirable from both a real-time operation and a battery life perspective. This thesis attempts to tackle these issues by first constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to offloading from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry-save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object-based shape encoding, a novel energy-efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourably with the relevant prior art.
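As a rough illustration of why binary data makes motion estimation cheap, the sketch below runs a full-search block match on binary masks, where the matching cost reduces to counting XOR mismatches. It is a plain software sketch, not the hardware architecture proposed in the thesis; the function names, the 4x4 block size and the search range of 2 pixels are assumptions chosen for the example.

```python
def xor_cost(cur, ref, bx, by, dx, dy, n):
    """Count mismatching pixels between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy). Pixels are 0 or 1."""
    cost = 0
    for y in range(n):
        for x in range(n):
            cost += cur[by + y][bx + x] ^ ref[by + dy + y][bx + dx + x]
    return cost


def binary_block_search(cur, ref, bx, by, n=4, search=2):
    """Full search over displacements in [-search, search] on both axes.
    Returns (dx, dy, cost) for the first displacement with the lowest cost."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > w or by + dy + n > h:
                continue
            cost = xor_cost(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

In hardware or optimised software, each row of the block would typically be packed into a machine word so that one XOR and one population count replace the inner loop; the nested loops here are kept only for readability.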
Centralized and distributed semi-parametric compression of piecewise smooth functions
This thesis introduces novel wavelet-based semi-parametric centralized and distributed
compression methods for a class of piecewise smooth functions. Our proposed compression schemes are based on a non-conventional transform coding structure with simple
independent encoders and a complex joint decoder.
Current centralized state-of-the-art compression schemes are based on the conventional structure where the encoder is relatively complex and nonlinear. In addition, the
setting usually allows the encoder to observe the entire source. Recently, there has been
an increasing need for compression schemes where the encoder is lower in complexity
and, instead, the decoder has to handle more computationally intensive tasks. Furthermore, the setup may involve multiple encoders, where each one can only partially
observe the source. Such a scenario is often referred to as distributed source coding.
In the first part, we focus on the dual situation of the centralized compression where
the encoder is linear and the decoder is nonlinear. Our analysis is centered around a
class of 1-D piecewise smooth functions. We show that, by incorporating parametric
estimation into the decoding procedure, it is possible to achieve the same distortion-
rate performance as that of a conventional wavelet-based compression scheme. We also
present a new constructive approach to parametric estimation based on the sampling
results of signals with finite rate of innovation.
The second part of the thesis focuses on the distributed compression scenario, where
each independent encoder partially observes the 1-D piecewise smooth function. We
propose a new wavelet-based distributed compression scheme that uses parametric estimation to perform joint decoding. Our distortion-rate analysis shows that it is possible
for the proposed scheme to achieve the same compression performance as that of a
joint encoding scheme.
Lastly, we apply the proposed theoretical framework in the context of distributed
image and video compression. We start by considering a simplified model of the video
signal and show that we can achieve distortion-rate performance close to that of a joint
encoding scheme. We then present practical compression schemes for real world signals.
Our simulations confirm the improvement in performance over classical schemes, both
in terms of the PSNR and the visual quality.
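For readers unfamiliar with the PSNR figure mentioned above, here is a minimal sketch of how it is commonly computed from the mean squared error between an original and a reconstructed signal. The function name and the default peak value of 255 (8-bit samples) are assumptions for illustration; the thesis may use a different peak or evaluation setup.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences.
    `peak` is the maximum possible sample value (255 assumes 8-bit data)."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)
```

For an image, the same function can be applied to the flattened list of pixel values. A higher PSNR means lower mean squared error, which is why the abstract reports visual quality separately: the two do not always agree.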
- …