121 research outputs found

    QoS framework for video streaming in home networks

    In this thesis we present a new SNR scalable video coding scheme. An important advantage of the proposed scheme is that it requires just a standard video decoder for processing each layer. The quality of the delivered video depends on the allocation of bit rates to the base and enhancement layers. For a given total bit rate, the combination with a bigger base layer delivers higher quality. The absence of dependencies between frames in enhancement layers makes the system resilient to losses of arbitrary frames from an enhancement layer. Furthermore, that property can be used in a more controlled fashion. An important characteristic of any video streaming scheme is the ability to handle network bandwidth fluctuations. We developed a streaming technique that observes the network conditions and, based on these observations, adjusts the layer configuration to achieve the best possible quality. A change in the network conditions forces a change in the number of layers or in the bit rate of these layers. Knowledge of the network conditions allows delivery of higher-quality video by choosing an optimal layer configuration. When the network degrades, the amount of data transmitted per second is decreased by skipping frames from an enhancement layer on the sender side. The presented video coding scheme allows skipping any frame from an enhancement layer, thus enabling efficient real-time control over transmission at the network level and fine-grained control over the decoding of video data. The proposed methodology is not MPEG-2 specific and can be applied to other coding standards. We developed a terminal resource manager that enables trade-offs between quality and resource consumption through the use of scalable video coding in combination with scalable video algorithms. The controller developed for the decoding process optimizes the perceived quality with respect to the available CPU power and the amount of input data. The controller does not depend on the type of scalability technique and can therefore be used with any scalable video. The controller uses a strategy that is created offline by means of a Markov Decision Process (MDP). During the evaluation it was found that the correctness of the controller behavior depends on the correctness of the MDP parameter settings, so user tests should be employed to find the optimal settings.
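The sender-side skipping described in this abstract can be illustrated with a minimal sketch. It assumes a greedy policy, which is not taken from the thesis: every base-layer frame is always sent, and enhancement-layer frames are added in stream order while a per-interval bit budget allows. The names `Frame`, `select_frames` and `budget_bits` are hypothetical. The sketch relies only on the property stated above, that any enhancement-layer frame can be dropped independently.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    layer: int      # 0 = base layer, 1 and above = enhancement layers
    size_bits: int  # encoded size of the frame in bits


def select_frames(frames, budget_bits):
    """Return (frames to send, bits used) for one transmission interval.

    Assumed greedy policy: base-layer frames are always kept; an
    enhancement-layer frame is kept only if it still fits in the budget.
    """
    # Reserve room for the base layer first, since it is never skipped here.
    used = sum(f.size_bits for f in frames if f.layer == 0)
    keep = []
    for f in frames:
        if f.layer == 0:
            keep.append(f)
        elif used + f.size_bits <= budget_bits:
            keep.append(f)
            used += f.size_bits
        # Otherwise the enhancement frame is skipped; no other frame depends on it.
    return keep, used


if __name__ == "__main__":
    stream = [Frame(0, 100), Frame(1, 50), Frame(0, 100), Frame(1, 50)]
    kept, used = select_frames(stream, budget_bits=260)
    print(len(kept), used)
```

A real controller would also choose the number of layers and their bit rates from the observed network conditions; the sketch covers only the per-frame skip decision.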

    Optimisation techniques for low bit rate speech coding

    This thesis extends the background theory of speech and major speech coding schemes used in existing networks to an implementation of GSM full-rate speech compression on a RISC DSP and a multirate application for speech coding. Speech coding is the field concerned with obtaining compact digital representations of speech signals for the purpose of efficient transmission. In this thesis, the background of speech compression, the characteristics of speech signals and the DSP algorithms used have been examined. The current speech coding schemes and requirements have been studied. The Global System for Mobile communication (GSM) is a digital mobile radio system which is extensively used throughout Europe, and also in many other parts of the world. The algorithm is standardised by the European Telecommunications Standards Institute (ETSI). The full-rate and half-rate speech compression of GSM have been analysed. A real-time implementation of the full-rate algorithm has been carried out on a RISC processor, GEPARD, by Austria Mikro Systeme International (AMS). The GEPARD code has been tested with all of the test sequences provided by ETSI and the results are bit-exact. The transcoding delay is lower than the ETSI requirement. A comparison of the half-rate and full-rate compression algorithms is discussed. Both algorithms offer near-toll speech quality comparable to or better than that of analogue cellular networks. The half-rate compression requires more computationally intensive operations, and therefore a more powerful processor will be needed due to the complexity of the code. Hence the cost of implementing the half-rate codec will be considerably higher than that of the full-rate codec. A description of multirate signal processing and its application to speech (SBC) and speech/audio (MPEG) has been given. The possibility of combining multirate filtering with the GSM full-rate speech algorithm has been investigated. The results showed that multirate signal processing cannot be directly applied to GSM full-rate speech compression, since this method requires more processing power and causes a longer coding delay without appreciably improving the bit rate. In order to achieve a lower bit rate, the GSM full-rate mathematical algorithm can be used instead of the standardised ETSI recommendation. Some changes, including to the number of quantisation bits, have to be made before the application of multirate signal processing, and a new standard will be required.

    Adaptive video delivery using semantics

    The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedbacks between an object partition and a region partition are used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. 
The metadata-based representation of objects' shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.
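The abstract does not give the SPSNR formula, so the following is only an illustration of the general idea of a relevance-weighted PSNR, not the metric from the dissertation. It assumes a per-pixel weight map (for example, larger weights on foreground objects) and computes a weighted mean squared error before the usual PSNR conversion. The function name `weighted_psnr` and the weighting scheme are hypothetical.

```python
import math


def weighted_psnr(ref, test, weights, peak=255.0):
    """PSNR in dB computed from a weighted mean squared error.

    ref, test and weights are equally sized 2-D lists; weights holds
    non-negative relevance values (an assumption of this sketch).
    """
    num = 0.0  # weighted sum of squared errors
    den = 0.0  # sum of weights
    for r_row, t_row, w_row in zip(ref, test, weights):
        for r, t, w in zip(r_row, t_row, w_row):
            num += w * (r - t) ** 2
            den += w
    if den == 0:
        raise ValueError("weights must not all be zero")
    mse = num / den
    if mse == 0:
        return math.inf  # identical images over the weighted area
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    ref = [[100, 100], [100, 100]]
    test = [[100, 110], [100, 100]]
    uniform = [[1, 1], [1, 1]]
    focus = [[1, 4], [1, 1]]  # the distorted pixel is marked as more relevant
    print(weighted_psnr(ref, test, uniform))
    print(weighted_psnr(ref, test, focus))
```

With the same distortion, marking the affected pixel as more relevant lowers the score, which is the behavior one would expect from a metric meant to follow the observer's focus of attention.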

    A parallel H.264/SVC encoder for high definition video conferencing

    In this paper we present a video encoder specially developed and configured for high definition (HD) video conferencing. This video encoder brings together the following three requirements: H.264/Scalable Video Coding (SVC), parallel encoding on multicore platforms, and parallel-friendly rate control. The first requirement guarantees a minimum quality of service to every end-user receiver over Internet Protocol networks. The second achieves real-time execution by combining slice-level parallelism for the main encoding loop with block-level parallelism for the upsampling and interpolation filtering processes. The third ensures proper HD video content delivery under given bit rate and end-to-end delay constraints. The experimental results show that the proposed H.264/SVC video encoder is able to operate in real time over a wide range of target bit rates, at the expense of reasonable losses in rate-distortion efficiency due to the partitioning of frames into slices.

    A scalable approach to video summarization and adaptation

    Unpublished doctoral thesis. Universidad Autónoma de Madrid, Escuela Politécnica Superior, October 201

    Tatouage du flux compressé MPEG-4 AVC

    The present thesis addresses MPEG-4 AVC stream watermarking from both a theoretical and an applicative standpoint, and considers two application domains, namely ownership protection and content integrity verification. From the theoretical point of view, the main challenge is to develop a unitary watermarking framework (insertion/detection) able to serve the two above-mentioned applications in the compressed domain. From the methodological point of view, the challenge is to instantiate this theoretical framework to serve the targeted applications. The first main contribution consists in building the theoretical framework for multi-symbol watermarking based on quantization index modulation (m-QIM). The insertion rule is analytically designed by extending the binary QIM rule to the multi-symbol case. The detection rule is optimized so as to ensure minimal probability of error under additive white Gaussian noise attacks. It is thus demonstrated that the data payload can be increased by a factor of log2(m) for prescribed transparency and additive Gaussian noise power. A data payload of 150 bits per minute, i.e. about 20 times larger than the limit imposed by the DCI standard, is obtained. The second main theoretical contribution consists in specifying a preprocessing MPEG-4 AVC shaping operation which can eliminate the intra-frame drift effect. The drift represents the distortion spread in the compressed stream related to the MPEG encoding paradigm. In this respect, the drift distortion propagation problem in MPEG-4 AVC is algebraically expressed, based on the analytical expressions of the encoding operations, and the corresponding system of equations is solved under drift-free constraints. The drift-free shaping results in a gain in transparency of 2 dB in PSNR.
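As an illustration of multi-symbol QIM, here is a minimal scalar sketch. It uses the common textbook construction in which symbol d out of m is embedded by quantizing the host value onto a lattice of step `delta` shifted by d * delta / m, and is detected by picking the nearest shifted lattice (minimum-distance decoding). This construction, the function names and the parameter values are assumptions for illustration; the thesis's exact insertion and detection rules are not given in the abstract.

```python
import math


def _quantize(value, offset, delta):
    """Nearest point to value on the lattice {k * delta + offset}."""
    return delta * math.floor((value - offset) / delta + 0.5) + offset


def qim_embed(x, symbol, m, delta):
    """Embed a symbol in {0, ..., m-1} into host sample x."""
    return _quantize(x, symbol * delta / m, delta)


def qim_detect(y, m, delta):
    """Return the symbol whose shifted lattice lies closest to y."""
    return min(range(m), key=lambda d: abs(y - _quantize(y, d * delta / m, delta)))


if __name__ == "__main__":
    m, delta = 4, 8.0
    marked = qim_embed(13.7, symbol=3, m=m, delta=delta)
    noisy = marked + 0.9  # perturbation below delta / (2 * m) = 1.0
    print(marked, qim_detect(noisy, m, delta))
    print(math.log2(m), "bits carried per marked sample")
```

Each marked sample carries log2(m) bits instead of the single bit of binary QIM, which is where the payload factor quoted in the abstract comes from. In this scalar sketch, detection is guaranteed only while the perturbation stays below delta / (2 * m).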

    Joint source and channel coding


    Temporal Video Transcoding in Mobile Systems

    This thesis analyses the problem of temporal transcoding for real-time video transmission over mobile networks. A temporal transcoding architecture and a new motion vector recomputation algorithm for the H.264 temporal transcoder are proposed. To cope with the constant reduction of wireless channel bandwidth in infrastructure networks, several frame skipping policies based on the sizing of the transcoder buffer are proposed in order to guarantee real-time communication. The motion of a frame and the number of consecutive skipped frames are also taken into account to improve the quality of the transcoded video. In addition, a video transmission system for vehicular networks using the IEEE 802.11 protocol, based on temporal transcoding, has been proposed and studied. This system discards those frames whose transmission time exceeds a maximum admissible delay, beyond which such frames would not be displayed anyway. The proposed system yields considerable bandwidth savings and improves video quality by preventing many consecutive frames from being discarded because of congestion.
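The delay-based discarding rule in this abstract can be sketched as a small simulation. It assumes a single FIFO channel with constant bandwidth, frames that become available at a fixed frame rate, and a decision made before transmission starts; these modelling choices and the names `frames_to_send`, `bandwidth_bps` and `max_delay_s` are hypothetical. The sketch covers only the skip decision, not the motion vector recomputation that a temporal transcoder performs after skipping.

```python
def frames_to_send(frame_bits, fps, bandwidth_bps, max_delay_s):
    """Return indices of frames that can be delivered within max_delay_s.

    A frame is skipped when waiting for the channel plus its own
    transmission time would exceed the maximum admissible delay.
    """
    channel_free = 0.0  # time at which the channel finishes its current frame
    sent = []
    for i, bits in enumerate(frame_bits):
        arrival = i / fps                  # instant the frame becomes available
        start = max(arrival, channel_free)
        finish = start + bits / bandwidth_bps
        if finish - arrival > max_delay_s:
            continue  # would arrive too late to be displayed: skip, channel stays free
        channel_free = finish
        sent.append(i)
    return sent


if __name__ == "__main__":
    # Four 40 kbit frames at 10 fps over a 200 kbit/s channel, 250 ms deadline.
    print(frames_to_send([40_000] * 4, fps=10, bandwidth_bps=200_000, max_delay_s=0.25))
```

Skipping a late frame frees the channel for the next one, which is how such a rule avoids the long runs of consecutive losses that congestion would otherwise cause.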