Motion Scalability for Video Coding with Flexible Spatio-Temporal Decompositions
PhD
The research presented in this thesis aims to extend the scalability range of the
wavelet-based video coding systems in order to achieve fully scalable coding with a
wide range of available decoding points. Since the temporal redundancy regularly
comprises the main portion of the global video sequence redundancy, the techniques
that can be generally termed motion decorrelation techniques have a central role in
the overall compression performance. For this reason the scalable motion modelling
and coding are of utmost importance, and specifically, in this thesis possible
solutions are identified and analysed.
The main contributions of the presented research are grouped into two
interrelated and complementary topics. Firstly, a flexible motion model with a rate-optimised
estimation technique is introduced. The proposed motion model is based
on tree structures and allows high adaptability needed for layered motion coding. The
flexible structure for motion compensation allows for optimisation at different stages
of the adaptive spatio-temporal decomposition, which is crucial for scalable coding
that targets decoding on different resolutions. By utilising an adaptive choice of
wavelet filterbank, the model enables high compression based on efficient mode
selection. Secondly, solutions for scalable motion modelling and coding are
developed. These solutions are based on precision limiting of motion vectors and
creation of a layered motion structure that describes hierarchically coded motion.
The solution based on precision limiting relies on layered bit-plane coding of motion
vector values. The second solution builds on recently established techniques that
impose scalability on a motion structure. The new approach is based on two major
improvements: the evaluation of distortion in temporal subbands and motion search
in temporal subbands that finds the optimal motion vectors for layered motion
structure.
Exhaustive tests on rate-distortion performance in demanding scalable video
coding scenarios show the benefits of both the developed flexible motion
model and the various solutions for scalable motion coding.
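The precision-limiting idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a motion vector component stored as a signed integer in sub-pixel units, a separately coded sign, and missing refinement planes filled with zeros. The function names and bit widths are hypothetical and are not taken from the thesis.

```python
def split_into_layers(mv_component, total_bits=8, base_bits=4):
    """Split one motion vector component into a base layer and refinement bit-planes.

    Assumption: the magnitude fits in `total_bits` bits and the sign is
    carried separately. Returns (sign, base, planes), with planes ordered
    from most significant to least significant refinement bit.
    """
    sign = -1 if mv_component < 0 else 1
    magnitude = abs(mv_component)
    if magnitude >= (1 << total_bits):
        raise ValueError("component does not fit in total_bits")
    refinement_bits = total_bits - base_bits
    base = magnitude >> refinement_bits
    planes = [(magnitude >> k) & 1 for k in range(refinement_bits - 1, -1, -1)]
    return sign, base, planes


def reconstruct(sign, base, planes, refinement_bits):
    """Rebuild the component from the base layer and the planes that were received.

    Planes that were not received are treated as zero, so fewer planes
    means a coarser (precision-limited) motion vector.
    """
    magnitude = base
    for bit in planes:
        magnitude = (magnitude << 1) | bit
    magnitude <<= refinement_bits - len(planes)
    return sign * magnitude


sign, base, planes = split_into_layers(-45)
full = reconstruct(sign, base, planes, 4)         # all refinement planes
coarse = reconstruct(sign, base, planes[:2], 4)   # only the first two planes
base_only = reconstruct(sign, base, [], 4)        # base layer alone
```

A decoder that receives only some of the planes still gets a usable, lower-precision vector, which is the property a layered motion bitstream relies on.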
Advanced heterogeneous video transcoding
PhD
Video transcoding is an essential tool to promote inter-operability
between different video communication systems. This thesis presents
two novel video transcoders, both operating on bitstreams of the current
H.264/AVC standard. The first transcoder converts H.264/AVC
bitstreams to a Wavelet Scalable Video Codec (W-SVC), while the second targets the emerging High Efficiency Video Coding (HEVC).
Scalable Video Coding (SVC) enables low complexity adaptation
of compressed video, providing an efficient solution for content delivery
through heterogeneous networks. The transcoder proposed here aims at
exploiting the advantages offered by SVC technology when dealing with
conventional coders and legacy video, efficiently reusing information
found in the H.264/AVC bitstream to achieve a high rate-distortion
performance at a low complexity cost. Its main features include new
mode mapping algorithms that exploit the larger macroblock sizes of W-SVC,
and a new state-of-the-art motion vector composition algorithm
that is able to tackle different coding configurations in the H.264/AVC
bitstream, including IPP or IBBP with multiple reference frames.
The emerging video coding standard, HEVC, is currently approaching the final stage of development prior to standardization. This thesis
proposes and evaluates several transcoding algorithms for the HEVC
codec. In particular, a transcoder based on a new method that is capable of complexity scalability, trading off rate-distortion performance
for complexity reduction, is proposed. Furthermore, other transcoding solutions are explored, based on a novel content-based modeling
approach, in which the transcoder adapts its parameters based on the
contents of the sequence being encoded.
Finally, the application of this research is not constrained to these
transcoders, as many of the techniques developed aim to advance
research in this field and have the potential to be
incorporated in different video transcoding architectures.
A DWT based perceptual video coding framework: concepts, issues and techniques
The work in this thesis explores DWT-based video coding through the introduction of a novel DWT (Discrete Wavelet Transform) / MC (Motion Compensation) / DPCM (Differential Pulse Code Modulation) video coding framework, which adopts EBCOT as the coding engine for both the intra- and the inter-frame coder. An adaptive switching mechanism between the frame and field coding modes is investigated for this coding framework. The Low-Band-Shift (LBS) method is employed for MC in the DWT domain. LBS-based MC is proven to provide a consistent improvement in the Peak Signal-to-Noise Ratio (PSNR) of the coded video over simple Wavelet Tree (WT) based MC. Adaptive Arithmetic Coding (AAC) is adopted to code the motion information. The context set of the Adaptive Binary Arithmetic Coding (ABAC) for the inter-frame data is redesigned based on statistical analysis. To further improve the perceived picture quality, a Perceptual Distortion Measure (PDM) based on a human vision model is used for the EBCOT of the intra-frame coder. A visibility assessment of the quantization error of various subbands in the DWT domain is performed through subjective tests. In summary, these findings have solved the issues arising from the proposed perceptual video coding framework. They include: a working DWT/MC/DPCM video coding framework with superior coding efficiency on sequences with translational or head-and-shoulders motion; an adaptive switching mechanism between frame and field coding modes; an effective LBS-based MC scheme in the DWT domain; a methodology for the context design for entropy coding of the inter-frame data; a PDM which replaces the MSE inside the EBCOT coding engine for the intra-frame coder and improves the perceived quality of intra-frames; and a visibility assessment of the quantization errors in the DWT domain.
Efficient compression of motion compensated residuals
EThOS - Electronic Theses Online Service, GB, United Kingdo
Algorithms and Architectures for Secure Embedded Multimedia Systems
Embedded multimedia systems provide real-time video support for applications in entertainment (mobile phones, internet video websites), defense (video surveillance and tracking) and the public domain (tele-medicine, remote and distance learning, traffic monitoring and management). With the widespread deployment of such real-time embedded systems, there has been increasing concern over the security and authentication of the multimedia data involved.
While several (software) algorithms and hardware architectures have been proposed in the research literature to support multimedia security, these fail to address embedded applications whose performance specifications have tighter constraints on computational power and available hardware resources.
The goals of this dissertation research are twofold:
1. To develop novel algorithms for joint video compression and encryption. The proposed algorithms reduce the computational requirements of multimedia encryption algorithms. We propose an approach that uses the compression parameters instead of the compressed bitstream for video encryption.
2. To accelerate the proposed algorithms in hardware, on reconfigurable computing platforms such as FPGAs and on VLSI circuits. We use signal processing knowledge to make the algorithms suitable for hardware optimizations and try to reduce the critical path of circuits using hardware-specific optimizations.
The proposed algorithms ensure a considerable level of security for low-power embedded systems such as portable video players and surveillance cameras. These schemes have zero or little compression loss and preserve the desired properties of the compressed bitstream in the encrypted bitstream, ensuring secure and scalable transmission of videos over heterogeneous networks.
They also support indexing, search and retrieval in secure multimedia digital libraries. This property is not only crucial for police and armed forces retrieving information about a suspect from a large video database of surveillance feeds, but also extremely helpful for data centers (such as those used by YouTube, AOL and Metacafe) in reducing the computation cost of searching for and retrieving desired videos.
Multimedia over wireless IP networks: distortion estimation and applications.
2006/2007
This thesis deals with multimedia communication over unreliable and resource
constrained IP-based packet-switched networks. The focus is on estimating, evaluating
and enhancing the quality of streaming media services with particular regard
to video services. The original contributions of this study involve mainly the
development of three video distortion estimation techniques and the successive
definition of some application scenarios used to demonstrate the benefits obtained
applying such algorithms. The material presented in this dissertation is the result
of the studies performed within the Telecommunication Group of the Department
of Electronic Engineering at the University of Trieste during the course of Doctorate
in Information Engineering.
In recent years multimedia communication over wired and wireless packet-based
networks has been growing explosively. Applications such as BitTorrent, music file sharing and multimedia
podcasting are the main source of traffic on the Internet. Internet radio,
for example, is now evolving into peer-to-peer television such as CoolStreaming.
Moreover, web sites such as YouTube have made publishing videos on demand
available to anyone owning a home video camera. Another challenge in the multimedia
evolution is inside the house, where videos are distributed over local WiFi
networks to many end devices around the house. More generally, we are witnessing
an all-media-over-IP revolution, with radio, television, telephony and stored media
all being delivered over wired and wireless IP networks. All these applications
require extremely high bandwidth and often low delay, especially
interactive applications. Unfortunately, the Internet and wireless networks provide
only limited support for multimedia applications. Variations in network conditions
can have considerable consequences for real-time multimedia applications
and can lead to an unsatisfactory user experience. In fact, multimedia applications
are usually delay-sensitive, bandwidth-intensive and loss-tolerant. To
overcome these limitations, efficient adaptation mechanisms must be devised
to bridge the application requirements and the characteristics of the transport medium.
Several approaches have been proposed for the robust transmission of multimedia
packets; they range from source coding solutions to the addition of redundancy with forward error correction and retransmissions. Additionally, other techniques
are based on developing efficient QoS architectures at the network layer or at the
data link layer where routers or specialized devices apply different forwarding
behaviors to packets depending on the value of some field in the packet header.
Using such network architecture, video packets are assigned to classes, in order
to obtain a different treatment by the network; in particular, packets assigned to
the most privileged class will be lost with a very small probability, while packets
belonging to the lowest priority class will experience the traditional best-effort
service. The key problem with this solution is how to optimally assign video
packets to the network classes. One way to perform the assignment is to proceed
on a packet-by-packet basis, to exploit the highly non-uniform distortion impact
of compressed video. Working on the distortion impact of each individual video
packet has been shown in recent years to deliver better performance than relying
on the average error sensitivity of each bitstream element. The distortion impact
of a video packet can be expressed as the distortion that would be introduced at
the receiver by its loss, taking into account the effects of both error concealment
and error propagation due to temporal prediction.
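The packet-by-packet assignment described above can be sketched as follows. This is only an illustration under stated assumptions: each packet already carries an estimated distortion impact (how that estimate is computed is the subject of the thesis and is not reproduced here), there are just two network classes, and a fixed share of packets may use the privileged class. The `Packet` type, the class names and the `premium_share` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int           # sequence number of the video packet
    distortion: float  # estimated distortion at the receiver if this packet is lost


def assign_classes(packets, premium_share=0.25):
    """Map each packet's sequence number to a network class.

    The packets with the largest estimated distortion impact go to the
    privileged class, up to `premium_share` of all packets; the rest
    receive best-effort service.
    """
    if not 0.0 <= premium_share <= 1.0:
        raise ValueError("premium_share must be between 0 and 1")
    n_premium = int(len(packets) * premium_share)
    ranked = sorted(packets, key=lambda p: p.distortion, reverse=True)
    premium = {p.seq for p in ranked[:n_premium]}
    return {
        p.seq: ("premium" if p.seq in premium else "best-effort")
        for p in packets
    }


stream = [Packet(0, 5.0), Packet(1, 120.0), Packet(2, 12.0), Packet(3, 60.0)]
classes = assign_classes(stream, premium_share=0.5)
```

Ranking by per-packet impact, rather than by the average sensitivity of the bitstream element a packet belongs to, is what lets the scheme exploit the non-uniform distortion impact of compressed video.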
The estimation algorithms proposed in this dissertation are able to accurately reproduce
the distortion envelope deriving from multiple losses on the network, and
the computational complexity required is negligible compared with that of the algorithms proposed in the
literature. Several tests are run to validate the distortion estimation algorithms and
to measure the influence of the main encoder-decoder settings. Different application scenarios are described and compared to demonstrate the benefits obtained
using the developed algorithms. The packet distortion impact is inserted in each
video packet and transmitted over the network where specialized agents manage
the video packets using the distortion information. In particular, the internal structure of the agents is modified to allow video packets prioritization using primarily
the distortion impact estimated by the transmitter. The results obtained will show
that, in each scenario, a significant improvement may be obtained with respect to
traditional transmission policies.
The thesis is organized in two parts. The first provides the background material
and lays the foundations for the arguments that follow, while the second is dedicated
to the original results obtained during the research activity.
In the first part, the first chapter gives an introduction to
the principles and challenges of multimedia transmission over packet networks.
The most recent advances in video compression technologies are detailed
in the second chapter, focusing in particular on aspects that involve resilience
to packet loss impairments. The third chapter deals with the main techniques
adopted to protect the multimedia flow and mitigate the packet loss corruption due to channel failures. The fourth chapter introduces the more recent advances in
network adaptive media transport, detailing the techniques that prioritize the video
packet flow. The fifth chapter reviews the literature on existing distortion
estimation techniques, focusing mainly on their limitations.
The second part of the thesis describes the original results obtained in the modelling
of the video distortion deriving from the transmission over an error prone
network. In particular, the sixth chapter presents three new distortion estimation
algorithms able to estimate the video quality and shows the results of some validation
tests performed to measure the accuracy of the employed algorithms. The
seventh chapter proposes different application scenarios where the developed algorithms may be used to quickly enhance the video quality at the end user side.
Finally, the eighth chapter summarizes the thesis contributions and highlights the
most important conclusions. It also outlines some directions for future improvements.
The intent of the entire work presented hereafter is to develop video distortion
estimation algorithms able to predict the quality experienced by the user as a result of losses on the network, and to present the results of some useful applications able to enhance the user experience during a video streaming session.
This doctoral thesis addresses the problem of efficiently transmitting multimedia content over unreliable packet networks with limited bandwidth resources.
The aim is to devise algorithms able to predict the quality of the video received by a user, and then to devise techniques able to improve the end user's experience of video services. In particular, the original contributions of this work concern the development of distortion estimation algorithms and the design of some very common application scenarios in which the benefits obtainable by applying the estimation algorithms can be evaluated.
The contributions presented in this doctoral thesis are the result of studies carried out with the Telecommunications group of the Dipartimento di Elettrotecnica Elettronica ed Informatica (DEEI) of the University of Trieste during the doctoral programme in Information Engineering.
In recent years multimedia, carried over wired and wireless networks, has been becoming an integral part of how the network is used, and is in fact its most imposing phenomenon. Applications such as BitTorrent, music and multimedia file sharing, and podcasting, for example, make up a significant share of current Internet traffic. What in recent years were the first radio stations broadcasting over the network are now evolving into more advanced peer-to-peer systems for distributing TV over the web, such as CoolStreaming.
Moreover, web sites such as YouTube have built their business on storing and distributing videos created by anyone with a simple video camera.
Another feature of the imposing multimedia revolution we are witnessing is the spread of video inside the home, where multimedia content is distributed among the various end devices over local wireless networks. A multimedia revolution is still under way on IP networks, with radio, television, telephony and all video having to be distributed over wired and wireless networks to heterogeneous users. In general, most multimedia applications require high bandwidth and very low delays, especially if the applications are interactive. Unfortunately, wireless networks, and the Internet more generally, can provide only limited support for multimedia applications. Variability in bandwidth, delay and loss can have serious consequences for the quality with which the video is received, and this can lead to partial dissatisfaction or even to the end user giving up on the service.
Multimedia applications are often delay-sensitive and have very stringent bandwidth requirements, but in practice remain tolerant of the losses that can occur during transmission. To overcome these limitations, adaptation mechanisms must be developed that can bridge the requirements of multimedia applications and the characteristics offered by the transport layer. Several approaches have been proposed in the literature to improve packet transmission by reducing losses; they range from efficient compression solutions to the addition of redundancy with forward error correction techniques and retransmissions. Other techniques are based on creating complex network architectures able to guarantee QoS at the network layer, where routers or other specialized agents apply different traffic management policies according to the values contained in the packet fields. With these architectures, video traffic is marked with priority classes in order to differentiate traffic at the network layer; in particular, the most privileged packets are assigned to the highest priority classes and will be lost with very low probability, while packets belonging to the lower priority classes will be treated like best-effort services. One of the main problems of this solution is how to optimally assign the individual video packets to the different priority classes. One way to perform this classification is to assign packets to the various classes on the basis of the importance each packet has for the final quality.
Numerous recent works have shown that using the impact on the final distortion as the adaptation mechanism brings significant improvements over techniques that use the average sensitivity of the stream to losses as the parameter. The impact each packet has on quality can be expressed as the distortion introduced at the receiver if the packet is lost, taking into account the effects of error concealment and of the error propagation characteristic of the most recent video coders.
The distortion estimation algorithms proposed in this thesis can accurately reproduce the distortion envelope deriving from both isolated losses and multiple losses in the network, with minimal computational complexity compared with the most recent estimation techniques. Numerous tests were carried out to validate the estimation algorithms and to measure the influence of the main encoding and decoding parameters. To emphasize the benefits obtained by applying the distortion estimation algorithms, the thesis presents some application scenarios in which applying the proposed algorithms appreciably improves the final quality perceived by users. These scenarios are described, implemented and carefully evaluated. In particular, the distortion estimated by the transmitter is encapsulated in the video packets and transmitted over the network, where specialized agents can easily extract it and use it as a rate-distortion mechanism to favour some packets at the expense of others. In particular, the internal structure of an agent (a router) is modified to allow traffic differentiation using the information on the impact each packet has on the final quality. The results obtained in each proposed application scenario, also in terms of reduced computational complexity, highlight the benefits deriving from implementing the estimation algorithms.
This doctoral thesis is structured in two main parts; the first provides the background and forms the basis for all the topics covered later, while the second is devoted to the original contributions and to the results obtained during the whole research activity.
With reference to the first part, the first chapter gives an introduction to the principles and opportunities offered by the spread of multimedia services over packet networks. The most recent advances in video compression techniques are set out in detail in the second chapter, which focuses only on the aspects concerning techniques for mitigating losses. The third chapter introduces the main techniques for protecting multimedia streams and reducing the losses caused by the characteristic phenomena of the channel. The fourth chapter describes recent advances in network adaptive media transport techniques, illustrating the main methods used to differentiate video traffic. The fifth chapter analyses the main contributions in the literature on distortion estimation techniques and focuses in particular on the limitations of current methods.
The second part of the thesis describes the original contributions obtained in modelling the video distortion deriving from transmission over lossy networks.
In particular, the sixth chapter presents three new algorithms able to faithfully reproduce the video distortion envelope. Numerous tests and results are presented in order to validate the algorithms and measure the accuracy of the estimation. The seventh chapter proposes various application scenarios in which the developed algorithms can be used to significantly improve the quality perceived by the end user. Finally, the eighth chapter summarizes the whole work carried out and the main results obtained. The same chapter also describes future developments of the research activity.
The aim of the whole work presented is to show the benefits deriving from the use of new distortion estimation algorithms and to provide some application scenarios for their use.
XIX Ciclo
197
Wavelet based image compression integrating error protection via arithmetic coding with forbidden symbol and map metric sequential decoding with ARQ retransmission
The phenomenal growth of digital multimedia applications has forced the communication
Low delay video coding
Analogue wireless cameras have been employed for decades; however, they have not become a universal solution because they are difficult to set up and use. The main problem is link robustness, which depends chiefly on the requirement of a line-of-sight view between transmitter and receiver, a working condition that is not always possible. Despite the use of tracking antenna systems such as the Portable Intelligent Tracking Antenna (PITA [1]), if strong multipath fading occurs (e.g. obstacles between transmitter and receiver) the picture rapidly falls apart. Digital wireless cameras based on Orthogonal Frequency Division Multiplexing (OFDM) modulation schemes offer a valid solution to this problem. OFDM offers strong multipath protection due to the insertion of the guard interval; in particular, the OFDM-based DVB-T standard has proven to offer excellent performance for the broadcasting of multimedia streams with bit rates over 10 Mbps in difficult terrestrial propagation channels, for fixed and portable applications. However, in typical conditions, the latency needed to compress/decompress a digital video signal at Standard Definition (SD) resolution is of the order of 15 frames, which corresponds to ≃ 0.5 sec. This delay introduces a serious problem when wireless and wired cameras have to be interfaced. Cabled cameras do not use compression, because the cable which directly links transmitter and receiver does not impose restrictive bandwidth constraints. Therefore, the only latency that affects a cabled camera link is the propagation delay on the cable, which is almost negligible. When switching between wired and wireless cameras, the residual latency makes it impossible to achieve audio-video synchronization, with consequent disagreeable effects. A way to solve this problem is to provide a low delay digital processing scheme based on a video coding algorithm which avoids massive intermediate data storage.
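The relation between a latency quoted in frames and one quoted in seconds can be checked with a short calculation. The abstract does not state the frame rate, so the two rates used below are assumptions chosen for illustration, and the function name is hypothetical.

```python
def codec_latency_seconds(frames_buffered, frame_rate_hz):
    """Convert a coding latency expressed in frames into seconds."""
    if frame_rate_hz <= 0:
        raise ValueError("frame_rate_hz must be positive")
    return frames_buffered / frame_rate_hz


# Assumed frame rates; the source only gives "15 frames" and "about 0.5 sec".
latency_at_30 = codec_latency_seconds(15, 30.0)  # 0.5 s
latency_at_25 = codec_latency_seconds(15, 25.0)  # 0.6 s
```

Under either assumption the delay is several hundred milliseconds, far larger than the cable propagation delay of a wired camera, which is why the two feeds cannot be switched without a visible and audible offset.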
The analysis of the latest MPEG-based coding standards reveals a series of problems which limit the real performance of a low delay MPEG coding system. The first effort of this work is to study the MPEG standard to understand its limits from both the coding delay and the implementation complexity points of view. This thesis also investigates an alternative solution based on the HERMES codec, a proprietary algorithm which is described, implemented and evaluated. HERMES achieves better results than MPEG in terms of latency and implementation complexity, at the price of lower compression efficiency, which means higher output bit rates. The use of the HERMES codec together with an enhanced OFDM system [2] leads to a competitive solution for wireless digital professional video applications.