271 research outputs found
Recommended from our members
Adaptive intra refresh for robust wireless multi-view video
This thesis was submitted for the award of PhD and was awarded by Brunel University LondonMobile wireless communication technology is a fast developing field and every day new mobile communication techniques and means are becoming available. In this thesis multi-view video (MVV) is also refers to as 3D video. Thus, the 3D video signals through wireless communication are shaping telecommunication industry and academia. However, wireless channels are prone to high level of bit and burst errors that largely deteriorate the quality of service (QoS). Noise along the wireless transmission path can introduce distortion or make a compressed bitstream lose vital information. The error caused by noise progressively spread to subsequent frames and among multiple views due to prediction. This error may compel the receiver to pause momentarily and wait for the subsequent INTRA picture to continue decoding. The pausing of video stream affects the user's Quality of Experience (QoE). Thus, an error resilience strategy is needed to protect the compressed bitstream against transmission errors. This thesis focuses on error resilience Adaptive Intra Refresh (AIR) technique. The AIR method is developed to make the compressed 3D video more robust to channel errors. The process involves periodic injection of Intra-coded macroblocks in a cyclic pattern using H.264/AVC standard. The algorithm takes into account individual features in each macroblock and the feedback information sent by the decoder about the channel condition in order to generate an MVV-AIR map. MVV-AIR map generation regulates the order of packets arrival and identifies the motion activities in each macroblock. Based on the level of motion activity contained in each macroblock, the MVV-AIR map classifies frames as high or low motion macroblocks. A proxy MVV-AIR transcoder is used to validate the efficiency of the generated MVV-AIR map. The MVV-AIR transcoding algorithm uses spatial and views downscaling scheme to convert from MVV to single view. Various experimental results indicate that the proposed error resilient MVV-AIR transcoder technique effectively improves the quality of reconstructed 3D video in wireless networks. A comparison of MVV-AIR transcoder algorithm with some traditional error resilience techniques demonstrates that MVV-AIR algorithm performs better in an error prone channel. Results of simulation revealed significant improvements in both objective and subjective qualities. No additional computational complexity emanates from the scheme while the QoS and QoE requirements are still fully met.Tertiary Institution Trust Fund (TETFund) of Nigeri
Error and Congestion Resilient Video Streaming over Broadband Wireless
In this paper, error resilience is achieved by adaptive, application-layer rateless channel coding, which is used to protect H.264/Advanced Video Coding (AVC) codec data-partitioned videos. A packetization strategy is an effective tool to control error rates and, in the paper, source-coded data partitioning serves to allocate smaller packets to more important compressed video data. The scheme for doing this is applied to real-time streaming across a broadband wireless link. The advantages of rateless code rate adaptivity are then demonstrated in the paper. Because the data partitions of a video slice are each assigned to different network packets, in congestion-prone wireless networks the increased number of packets per slice and their size disparity may increase the packet loss rate from buffer overflows. As a form of congestion resilience, this paper recommends packet-size dependent scheduling as a relatively simple way of alleviating the buffer-overflow problem arising from data-partitioned packets. The paper also contributes an analysis of data partitioning and packet sizes as a prelude to considering scheduling regimes. The combination of adaptive channel coding and prioritized packetization for error resilience with packet-size dependent packet scheduling results in a robust streaming scheme specialized for broadband wireless and real-time streaming applications such as video conferencing, video telephony, and telemedicine
Multimedia over wireless ip networks:distortion estimation and applications.
2006/2007This thesis deals with multimedia communication over unreliable and resource
constrained IP-based packet-switched networks. The focus is on estimating, evaluating
and enhancing the quality of streaming media services with particular regard
to video services. The original contributions of this study involve mainly the
development of three video distortion estimation techniques and the successive
definition of some application scenarios used to demonstrate the benefits obtained
applying such algorithms. The material presented in this dissertation is the result
of the studies performed within the Telecommunication Group of the Department
of Electronic Engineering at the University of Trieste during the course of Doctorate
in Information Engineering.
In recent years multimedia communication over wired and wireless packet based
networks is exploding. Applications such as BitTorrent, music file sharing, multimedia
podcasting are the main source of all traffic on the Internet. Internet radio
for example is now evolving into peer to peer television such as CoolStreaming.
Moreover, web sites such as YouTube have made publishing videos on demand
available to anyone owning a home video camera. Another challenge in the multimedia
evolution is inside the house where videos are distributed over local WiFi
networks to many end devices around the house. More in general we are assisting
an all media over IP revolution, with radio, television, telephony and stored media
all being delivered over IP wired and wireless networks. All the presented applications
require an extreme high bandwidth and often a low delay especially for
interactive applications. Unfortunately the Internet and the wireless networks provide
only limited support for multimedia applications. Variations in network conditions
can have considerable consequences for real-time multimedia applications
and can lead to unsatisfactory user experience. In fact, multimedia applications
are usually delay sensitive, bandwidth intense and loss tolerant applications. In order
to overcame this limitations, efficient adaptation mechanism must be derived
to bridge the application requirements with the transport medium characteristics.
Several approaches have been proposed for the robust transmission of multimedia
packets; they range from source coding solutions to the addition of redundancy with forward error correction and retransmissions. Additionally, other techniques
are based on developing efficient QoS architectures at the network layer or at the
data link layer where routers or specialized devices apply different forwarding
behaviors to packets depending on the value of some field in the packet header.
Using such network architecture, video packets are assigned to classes, in order
to obtain a different treatment by the network; in particular, packets assigned to
the most privileged class will be lost with a very small probability, while packets
belonging to the lowest priority class will experience the traditional best–effort
service. But the key problem in this solution is how to assign optimally video
packets to the network classes. One way to perform the assignment is to proceed
on a packet-by-packet basis, to exploit the highly non-uniform distortion impact
of compressed video. Working on the distortion impact of each individual video
packet has been shown in recent years to deliver better performance than relying
on the average error sensitivity of each bitstream element. The distortion impact
of a video packet can be expressed as the distortion that would be introduced at
the receiver by its loss, taking into account the effects of both error concealment
and error propagation due to temporal prediction.
The estimation algorithms proposed in this dissertation are able to reproduce accurately
the distortion envelope deriving from multiple losses on the network and
the computational complexity required is negligible in respect to those proposed in
literature. Several tests are run to validate the distortion estimation algorithms and
to measure the influence of the main encoder-decoder settings. Different application scenarios are described and compared to demonstrate the benefits obtained
using the developed algorithms. The packet distortion impact is inserted in each
video packet and transmitted over the network where specialized agents manage
the video packets using the distortion information. In particular, the internal structure of the agents is modified to allow video packets prioritization using primarily
the distortion impact estimated by the transmitter. The results obtained will show
that, in each scenario, a significant improvement may be obtained with respect to
traditional transmission policies.
The thesis is organized in two parts. The first provides the background material
and represents the basics of the following arguments, while the other is dedicated
to the original results obtained during the research activity.
Referring to the first part in the first chapter it summarized an introduction to
the principles and challenges for the multimedia transmission over packet networks.
The most recent advances in video compression technologies are detailed
in the second chapter, focusing in particular on aspects that involve the resilience
to packet loss impairments. The third chapter deals with the main techniques
adopted to protect the multimedia flow for mitigating the packet loss corruption due to channel failures. The fourth chapter introduces the more recent advances in
network adaptive media transport detailing the techniques that prioritize the video
packet flow. The fifth chapter makes a literature review of the existing distortion
estimation techniques focusing mainly on their limitation aspects.
The second part of the thesis describes the original results obtained in the modelling
of the video distortion deriving from the transmission over an error prone
network. In particular, the sixth chapter presents three new distortion estimation
algorithms able to estimate the video quality and shows the results of some validation
tests performed to measure the accuracy of the employed algorithms. The
seventh chapter proposes different application scenarios where the developed algorithms may be used to enhance quickly the video quality at the end user side.
Finally, the eight chapter summarizes the thesis contributions and remarks the
most important conclusions. It also derives some directions for future improvements.
The intent of the entire work presented hereafter is to develop some video distortion
estimation algorithms able to predict the user quality deriving from the loss on the network as well as providing the results of some useful applications able to enhance the user experience during a video streaming session.Questa tesi di dottorato affronta il problema della trasmissione efficiente di contenuti
multimediali su reti a pacchetto inaffidabili e con limitate risorse di banda.
L’obiettivo è quello di ideare alcuni algoritmi in grado di predire l’andamento
della qualità del video ricevuto da un utente e successivamente ideare alcune tecniche in grado di migliorare l’esperienza dell’utente finale nella fruizione dei servizi video. In particolare i contributi originali del presente lavoro riguardano lo sviluppo di algoritmi per la stima della distorsione e l’ideazione di alcuni scenari applicativi in molto frequenti dove poter valutare i benefici ottenibili applicando gli algoritmi di stima.
I contributi presentati in questa tesi di dottorato sono il risultato degli studi compiuti con il gruppo di Telecomunicazioni del Dipartimento di Elettrotecnica Elettronica ed Informatica (DEEI) dell’Università degli Studi di Trieste durante il corso di dottorato in Ingegneria dell’Informazione.
Negli ultimi anni la multimedialità, diffusa sulle reti cablate e wireless, sta diventando
parte integrante del modo di utilizzare la rete diventando di fatto il fenomeno più imponente. Applicazioni come BitTorrent, la condivisione di file musicali e multimediali e il podcasting ad esempio costituiscono una parte significativa del traffico attuale su Internet. Quelle che negli ultimi anni erano le prime radio che trsmettevano sulla rete oggi si stanno evolvendo nei sistemi peer
to peer per più avanzati per la diffusione della TV via web come CoolStreaming.
Inoltre siti web come YouTube hanno costruito il loro business sulla memorizzazione/
distribuzione di video creati da chiunque abbia una semplice video camera.
Un’altra caratteristica dell’imponente rivoluzione multimediale a cui stiamo
assistendo è la diffusione dei video anche all’interno delle case dove i contenuti
multimediali vengono distribuiti mediante delle reti wireless locali tra i vari dispositivi finali. Tutt’oggi è in corso una rivoluzione della multimedialità sulle reti
IP con le radio, i televisioni, la telefonia e tutti i video che devono essere distribuiti
sulle reti cablate e wireless verso utenti eterogenei. In generale la gran parte delle
applicazioni multimediali richiedono una banda elevata e dei ritardi molto contenuti specialmente se le applicazioni sono di tipo interattivo. Sfortunatamente le reti wireless e Internet più in generale sono in grado di fornire un supporto limitato alle applicazioni multimediali. La variabilità di banda, di ritardo e nella perdita possono avere conseguenze gravi sulla qualità con cui viene ricevuto il video e questo può portare a una parziale insoddisfazione o addirittura alla rinuncia della fruizione da parte dell’utente finale.
Le applicazioni multimediali sono spesso sensibili al ritardo e con requisiti di
banda molto stringenti ma di fatto rimango tolleranti nei confronti delle perdite
che possono avvenire durante la trasmissione. Al fine di superare le limitazioni è necessario sviluppare dei meccanismi di adattamento in grado di fare da ponte fra i requisiti delle applicazioni multimediali e le caratteristiche offerte dal livello di trasporto. Diversi approcci sono stati proposti in passato in letteratura per
migliorare la trasmissione dei pacchetti riducendo le perdite; gli approcci variano
dalle soluzioni di compressione efficiente all’aggiunta di ridondanza con tecniche
di forward error correction e ritrasmissioni. Altre tecniche si basano sulla creazione di architetture di rete complesse in grado di garantire la QoS a livello rete dove router oppure altri agenti specializzati applicano diverse politiche di gestione del traffico in base ai valori contenuti nei campi dei pacchetti. Mediante queste architetture il traffico video viene marcato con delle classi di priorità al fine di creare una differenziazione nel traffico a livello rete; in particolare i pacchetti con i privilegi maggiori vengono assegnati alle classi di priorità più elevate e verranno persi con probabilità molto bassa mentre i pacchetti appartenenti alle classi di priorità inferiori saranno trattati alla stregua dei servizi di tipo best-effort. Uno dei principali problemi di questa soluzione riguarda come assegnare in maniera ottimale i singoli pacchetti video alle diverse classi di priorità. Un modo per effettuare questa classificazione è quello di procedere assegnando i pacchetti alle varie classi sulla base dell’importanza che ogni pacchetto ha sulla qualità finale.
E’ stato dimostrato in numerosi lavori recenti che utilizzando come meccanismo
per l’adattamento l’impatto sulla distorsione finale, porta significativi miglioramenti
rispetto alle tecniche che utilizzano come parametro la sensibilità media del flusso nei confronti delle perdite. L’impatto che ogni pacchetto ha sulla qualità può essere espresso come la distorsione che viene introdotta al ricevitore se il pacchetto viene perso tenendo in considerazione gli effetti del recupero (error concealment) e la propagazione dell’errore (error propagation) caratteristica dei più recenti codificatori video.
Gli algoritmi di stima della distorsione proposti in questa tesi sono in grado di riprodurre in maniera accurata l’inviluppo della distorsione derivante sia da perdite isolate che da perdite multiple nella rete con una complessità computazionale minima se confrontata con le più recenti tecniche di stima. Numerose prove sono stati effettuate al fine di validare gli algoritmi di stima e misurare l’influenza dei principali parametri di codifica e di decodifica. Al fine di enfatizzare i benefici ottenuti applicando gli algoritmi di stima della distorsione, durante la tesi verranno presentati alcuni scenari applicativi dove l’applicazione degli algoritmi proposti migliora sensibilmente la qualità finale percepita dagli utenti. Tali scenari verranno descritti, implementati e accuratamente valutati. In particolare, la distorsione stimata dal trasmettitore verrà incapsulata nei pacchetti video e, trasmessa
nella rete dove agenti specializzati potranno agevolmente estrarla e utilizzarla come meccanismo rate-distortion per privilegiare alcuni pacchetti a discapito di altri. In particolare la struttura interna di un agente (un router) verrà modificata al fine di consentire la differenziazione del traffico utilizzando l’informazione dell’impatto che ogni pacchetto ha sulla qualità finale. I risultati ottenuti anche in termini di ridotta complessità computazionale in ogni scenario applicativo proposto mettono in luce i benefici derivanti dall’implementazione degli algoritmi di stima.
La presenti tesi di dottorato è strutturata in due parti principali; la prima fornisce
il background e rappresenta la base per tutti gli argomenti trattati nel seguito mentre
la seconda parte è dedicata ai contributi originali e ai risultati ottenuti durante
l’intera attività di ricerca.
In riferimento alla prima parte in particolare un’introduzione ai principi e alle opportunità offerte dalla diffusione dei servizi multimediali sulle reti a pacchetto
viene esposta nel primo capitolo. I progressi più recenti nelle tecniche di compressione
video vengono esposti dettagliatamente nel secondo capitolo che si focalizza in particolare solo sugli aspetti che riguardano le tecniche per la mitigazione delle perdite. Il terzo capitolo introduce le principali tecniche per proteggere i flussi multimediali e ridurre le perdite causate dai fenomeni caratteristici del canale. Il quarto capitolo descrive i recenti avanzamenti nelle tecniche di network adaptive media transport illustrando i principali metodi utilizzati per differenziare il traffico video. Il quinto capitolo analizza i principali contributi nella letteratura sulle
tecniche di stima della distorsione e si focalizza in particolare sulle limitazioni dei metodi attuali.
La seconda parte della tesi descrive i contributi originali ottenuti nella modellizzazione della distorsione video derivante dalla trasmissione sulle reti con perdite.
In particolare il sesto capitolo presenta tre nuovi algoritmi in grado di riprodurre
fedelmente l’inviluppo della distorsione video. I numerosi test e risultati verranno
proposti al fine di validare gli algoritmi e misurare l’accuratezza nella stima. Il settimo capitolo propone diversi scenari applicativi dove gli algoritmi sviluppati
possono essere utilizzati per migliorare in maniera significativa la qualità percepita
dall’utente finale. Infine l’ottavo capitolo sintetizza l’intero lavoro svolto e i principali risultati ottenuti. Nello stesso capitolo vengono inoltre descritti gli
sviluppi futuri dell’attività di ricerca.
L’obiettivo dell’intero lavoro presentato è quello di mostrare i benefici derivanti
dall’utilizzo di nuovi algoritmi per la stima della distorsione e di fornire alcuni
scenari applicativi di utilizzo.XIX Ciclo197
Low delay MPEG2 video encoding
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997.Includes bibliographical references (leaves 118-120).by Tri D. Tran.M.Eng
Error Resilient Video Coding Using Bitstream Syntax And Iterative Microscopy Image Segmentation
There has been a dramatic increase in the amount of video traffic over the Internet in past several years. For applications like real-time video streaming and video conferencing, retransmission of lost packets is often not permitted. Popular video coding standards such as H.26x and VPx make use of spatial-temporal correlations for compression, typically making compressed bitstreams vulnerable to errors. We propose several adaptive spatial-temporal error concealment approaches for subsampling-based multiple description video coding. These adaptive methods are based on motion and mode information extracted from the H.26x video bitstreams. We also present an error resilience method using data duplication in VPx video bitstreams.
A recent challenge in image processing is the analysis of biomedical images acquired using optical microscopy. Due to the size and complexity of the images, automated segmentation methods are required to obtain quantitative, objective and reproducible measurements of biological entities. In this thesis, we present two techniques for microscopy image analysis. Our first method, “Jelly Filling” is intended to provide 3D segmentation of biological images that contain incompleteness in dye labeling. Intuitively, this method is based on filling disjoint regions of an image with jelly-like fluids to iteratively refine segments that represent separable biological entities. Our second method selectively uses a shape-based function optimization approach and a 2D marked point process simulation, to quantify nuclei by their locations and sizes. Experimental results exhibit that our proposed methods are effective in addressing the aforementioned challenges
Cross-layer Optimized Wireless Video Surveillance
A wireless video surveillance system contains three major components, the video capture and preprocessing, the video compression and transmission over wireless sensor networks (WSNs), and the video analysis at the receiving end. The coordination of different components is important for improving the end-to-end video quality, especially under the communication resource constraint. Cross-layer control proves to be an efficient measure for optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in the wireless video surveillance system.
The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission process. The cross-layer controller determines the optimal coding and transmission parameters, according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed utilizing the special property of the PTU camera motion.
In the second project, the binocular PTU camera is adopted for video object tracking. The presented work studied the fast disparity estimation algorithm and the 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality.
The third project is applied for multi-camera motion capture in remote healthcare monitoring. The challenge is the resource allocation for multiple video sequences. The presented cross-layer design incorporates the delay sensitive, content-aware video coding and transmission, and the adaptive video coding and transmission to ensure the optimal and balanced quality for the multi-view videos.
In these projects, interdisciplinary study is conducted to synergize the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to enlighten the future work.
Adviser: Song C
Cross-layer Optimized Wireless Video Surveillance
A wireless video surveillance system contains three major components, the video capture and preprocessing, the video compression and transmission over wireless sensor networks (WSNs), and the video analysis at the receiving end. The coordination of different components is important for improving the end-to-end video quality, especially under the communication resource constraint. Cross-layer control proves to be an efficient measure for optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in the wireless video surveillance system.
The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission process. The cross-layer controller determines the optimal coding and transmission parameters, according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed utilizing the special property of the PTU camera motion.
In the second project, the binocular PTU camera is adopted for video object tracking. The presented work studied the fast disparity estimation algorithm and the 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality.
The third project is applied for multi-camera motion capture in remote healthcare monitoring. The challenge is the resource allocation for multiple video sequences. The presented cross-layer design incorporates the delay sensitive, content-aware video coding and transmission, and the adaptive video coding and transmission to ensure the optimal and balanced quality for the multi-view videos.
In these projects, interdisciplinary study is conducted to synergize the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to enlighten the future work.
Adviser: Song C
Adapting Computer Vision Models To Limitations On Input Dimensionality And Model Complexity
When considering instances of distributed systems where visual sensors communicate with remote predictive models, data traffic is limited to the capacity of communication channels, and hardware limits the processing of collected data prior to transmission. We study novel methods of adapting visual inference to limitations on complexity and data availability at test time, wherever the aforementioned limitations exist. Our contributions detailed in this thesis consider both task-specific and task-generic approaches to reducing the data requirement for inference, and evaluate our proposed methods on a wide range of computer vision tasks. This thesis makes four distinct contributions: (i) We investigate multi-class action classification via two-stream convolutional neural networks that directly ingest information extracted from compressed video bitstreams. We show that selective access to macroblock motion vector information provides a good low-dimensional approximation of the underlying optical flow in visual sequences. (ii) We devise a bitstream cropping method by which AVC/H.264 and H.265 bitstreams are reduced to the minimum amount of necessary elements for optical flow extraction, while maintaining compliance with codec standards. We additionally study the effect of codec rate-quality control on the sparsity and noise incurred on optical flow derived from resulting bitstreams, and do so for multiple coding standards. (iii) We demonstrate degrees of variability in the amount of data required for action classification, and leverage this to reduce the dimensionality of input volumes by inferring the required temporal extent for accurate classification prior to processing via learnable machines. (iv) We extend the Mixtures-of-Experts (MoE) paradigm to adapt the data cost of inference for any set of constituent experts. We postulate that the minimum acceptable data cost of inference varies for different input space partitions, and consider mixtures where each expert is designed to meet a different set of constraints on input dimensionality. To take advantage of the flexibility of such mixtures in processing different input representations and modalities, we train biased gating functions such that experts requiring less information to make their inferences are favoured to others. We finally note that, our proposed data utility optimization solutions include a learnable component which considers specified priorities on the amount of information to be used prior to inference, and can be realized for any combination of tasks, modalities, and constraints on available data
- …