    QoS framework for video streaming in home networks

    In this thesis we present a new SNR scalable video coding scheme. An important advantage of the proposed scheme is that it requires only a standard video decoder to process each layer. The quality of the delivered video depends on how the bit rate is allocated between the base and enhancement layers: for a given total bit rate, the combination with the larger base layer delivers higher quality. The absence of dependencies between frames in the enhancement layers makes the system resilient to the loss of arbitrary frames from an enhancement layer. This property can also be exploited in a more controlled fashion. An important characteristic of any video streaming scheme is its ability to handle fluctuations in network bandwidth. We developed a streaming technique that observes the network conditions and, based on these observations, adapts the layer configuration to achieve the best possible quality. A change in network conditions forces a change in the number of layers or in the bit rate of these layers. Knowledge of the network conditions allows a higher-quality video to be delivered by choosing an optimal layer configuration. When the network degrades, the amount of data transmitted per second is reduced by skipping frames from an enhancement layer on the sender side. The presented video coding scheme allows any frame from an enhancement layer to be skipped, thus enabling efficient real-time control over transmission at the network level and fine-grained control over the decoding of video data. The proposed methodology is not MPEG-2 specific and can be applied to other coding standards. We also developed a terminal resource manager that enables trade-offs between quality and resource consumption by combining scalable video coding with scalable video algorithms. The controller developed for the decoding process optimizes the perceived quality with respect to the available CPU power and the amount of input data. 
The controller does not depend on the type of scalability technique and can therefore be used with any scalable video. The controller uses a strategy that is computed offline by means of a Markov Decision Process (MDP). The evaluation showed that the correctness of the controller's behavior depends on the correctness of the MDP parameter settings, so user tests should be employed to find the optimal settings.
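
The sender-side frame skipping described above can be illustrated with a minimal sketch. It assumes one base layer and a single enhancement layer whose frames are mutually independent, as the abstract states. The `Frame` type, the `select_frames` function, and the per-interval byte budget are hypothetical names introduced here for illustration; they are not taken from the thesis, and the greedy rule is only one possible skipping policy.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    index: int       # display order within the interval
    is_base: bool    # True for base layer, False for enhancement layer
    size_bytes: int


def select_frames(frames: List[Frame], budget_bytes: int) -> List[Frame]:
    """Choose the frames to transmit in one interval.

    Base-layer frames are always sent. Enhancement frames are added in
    display order while they still fit in the remaining budget; the rest
    are skipped. Skipping arbitrary enhancement frames is only safe under
    the assumption that they do not depend on each other.
    """
    chosen = [f for f in frames if f.is_base]
    remaining = budget_bytes - sum(f.size_bytes for f in chosen)
    enhancement = sorted((f for f in frames if not f.is_base),
                         key=lambda f: f.index)
    for f in enhancement:
        if f.size_bytes <= remaining:
            chosen.append(f)
            remaining -= f.size_bytes
    # Restore display order, base frame before its enhancement frame.
    return sorted(chosen, key=lambda f: (f.index, not f.is_base))


frames = [
    Frame(0, True, 1000), Frame(0, False, 600),
    Frame(1, True, 1000), Frame(1, False, 500),
    Frame(2, False, 400),
]
sent = select_frames(frames, budget_bytes=3000)
```

With a 3000-byte budget the two base frames consume 2000 bytes, so the 600-byte and 400-byte enhancement frames are sent and the 500-byte one is skipped. A real sender would derive the budget from the observed network conditions.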

    Transport Layer Optimizations for Heterogeneous Wireless Multimedia Networks

    The explosive growth of the Internet during the last few years has been propelled by the TCP/IP protocol suite and the best-effort packet forwarding service. However, quality of service (QoS) is far from being a reality, especially for multimedia services like video streaming and video conferencing. In wireless and mobile networks the problem becomes even worse due to the physics of the medium, which further degrades system performance. The goal of this dissertation is the systematic development of comprehensive models that jointly characterize the performance of transport protocols and media delivery in heterogeneous wireless networks. At the core of our methodology is the use of analytical models to drive the design of media transport algorithms, so that the delivery of conversational and non-interactive multimedia data is enhanced in terms of throughput, delay, and jitter. More specifically, we develop analytical models that characterize the throughput and goodput of the transmission control protocol (TCP) and the TCP-friendly rate control (TFRC) protocol when constant bit rate (CBR) and variable bit rate (VBR) multimedia workloads are considered. Subsequently, we enhance the transport protocol models with new parameters that capture the playback buffer performance and the expected video distortion at the receiver. In this way a complete end-to-end model for media streaming is obtained. This model is used as the basis for a new algorithm for rate-distortion optimized mode selection in video streaming applications. As a next step, we extend the models developed for the aforementioned protocols so that heterogeneous wireless networks can be accommodated. New algorithms are then proposed to enhance the developed media streaming algorithms when heterogeneous wireless networks are also included. 
Finally, the aforementioned models and algorithms are extended for the case of concurrent multipath media transport over several hybrid wired/wireless links.

Ph.D. Committee Chair: Vijay Madisetti; Committee Member: Raghupathy Sivakumar; Committee Member: Sudhakar Yalamanchili; Committee Member: Umakishore Ramachandran; Committee Member: Yucel Altunbasa
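
For readers unfamiliar with analytical throughput models of the kind mentioned in this abstract, the sketch below evaluates two well-known published formulas: the square-root TCP model of Mathis et al. and the TFRC throughput equation from RFC 5348. These are not the models developed in the dissertation, which the abstract does not specify; the function names and the example parameters are illustrative assumptions.

```python
import math
from typing import Optional


def tcp_throughput_simple(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Square-root model: rate ~ (MSS / RTT) * sqrt(3/2) / sqrt(p), in bytes/s."""
    return (mss_bytes / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)


def tfrc_throughput(s_bytes: float, rtt_s: float, loss_rate: float,
                    t_rto_s: Optional[float] = None, b: int = 1) -> float:
    """TFRC throughput equation from RFC 5348, in bytes/s.

    s_bytes is the segment size, rtt_s the round-trip time, loss_rate the
    loss event rate p, and b the number of packets acknowledged per ACK.
    If no retransmission timeout is given, t_RTO = 4 * RTT is used, which
    is the simplification the RFC recommends.
    """
    if t_rto_s is None:
        t_rto_s = 4.0 * rtt_s
    p = loss_rate
    denominator = (rtt_s * math.sqrt(2.0 * b * p / 3.0)
                   + t_rto_s * (3.0 * math.sqrt(3.0 * b * p / 8.0))
                   * p * (1.0 + 32.0 * p * p))
    return s_bytes / denominator


# Example parameters (assumed): 1460-byte segments, 100 ms RTT, 1% loss.
simple_rate = tcp_throughput_simple(1460, 0.1, 0.01)
tfrc_rate = tfrc_throughput(1460, 0.1, 0.01)
```

The TFRC equation adds a timeout term to the denominator, so for the same inputs it never exceeds the square-root estimate, and the two converge as the loss rate approaches zero.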

    Protocole de routage à chemins multiples pour des réseaux ad hoc

    Ad hoc networks consist of a collection of wireless mobile nodes that dynamically exchange data without relying on any fixed base station or wired backbone network. They are by definition self-organized. The frequent topological changes make multi-hop routing a crucial issue for these networks. In this PhD thesis, we propose a multipath routing protocol named Multipath Optimized Link State Routing (MP-OLSR). It is a multipath extension of OLSR and can be regarded as a hybrid routing scheme because it combines the proactive nature of topology sensing with the reactive nature of multipath computation. Auxiliary functions such as route recovery and loop detection are introduced to improve the performance of the network. The use of a queue length metric as a link quality criterion is studied, and the compatibility between single-path and multipath routing is discussed to facilitate deployment of the protocol. Simulations based on the NS2 and Qualnet simulators are performed in different scenarios. A testbed is also set up on the campus of Polytech’Nantes. The results from the simulators and the testbed reveal that MP-OLSR is particularly suitable for mobile, large, and dense networks with heavy network load, thanks to its ability to distribute traffic over different paths and its effective auxiliary functions. The H.264/SVC video service is applied to ad hoc networks with MP-OLSR. By exploiting the scalable nature of H.264/SVC, we propose to use Priority Forward Error Correction (PFEC) coding based on the Finite Radon Transform (FRT) to improve the received video quality. An evaluation framework called SVCEval is built to simulate SVC video transmission over different kinds of networks in Qualnet. 
This second study highlights the benefit of multipath routing for improving quality of experience over self-organized networks.

Ad hoc networks consist of a set of mobile nodes that exchange data without infrastructure such as access points or a wired backbone. They are by definition self-organized. The frequent topology changes in ad hoc networks make multi-hop routing very challenging. In this thesis, we propose a multipath routing protocol called Multipath Optimized Link State Routing (MP-OLSR). It is a multipath extension of OLSR that can be regarded as a hybrid routing method: MP-OLSR combines the proactive nature of topology sensing with the reactive nature of multipath computation, which is performed on demand. Auxiliary functions such as route recovery and loop detection are introduced to improve network performance. The use of the queue length of intermediate nodes as a link quality criterion is studied, and the compatibility between multipath and single-path routing is discussed to facilitate deployment of the protocol. Simulations based on NS2 and Qualnet are carried out to test MP-OLSR routing in various scenarios. An implementation was also produced during this thesis, with an experiment on the campus of Polytech’Nantes. The simulation and experimental results reveal that MP-OLSR is particularly suited to mobile, dense networks with heavy traffic, thanks to its ability to distribute traffic over different paths and to its effective auxiliary functions. At the application level, the H.264/SVC video service is applied to MP-OLSR ad hoc networks. 
By exploiting the natural hierarchy provided by the H.264/SVC format, we propose to use unequal protection coding (PFEC) based on the Finite Radon Transform (FRT) to improve the received video quality. A tool called SVCEval is developed to simulate SVC video transmission over different types of networks in Qualnet. This second study demonstrates the value of unequal protection coding combined with multipath routing for improving quality of experience over self-organized networks.
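
The on-demand multipath computation mentioned in this abstract can be illustrated with a generic sketch: run Dijkstra's algorithm repeatedly on the known topology and raise the cost of links already used, so that later paths tend to avoid them. The abstract does not specify the cost functions used by MP-OLSR, so the `penalty` factor, the function names, and the example topology are assumptions made purely for illustration.

```python
import heapq
from typing import Dict, List, Optional

Graph = Dict[str, Dict[str, float]]


def shortest_path(graph: Graph, src: str, dst: str) -> Optional[List[str]]:
    """Dijkstra on a dict-of-dicts graph {u: {v: cost}}; returns a node list or None."""
    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def multipath(graph: Graph, src: str, dst: str,
              k: int = 2, penalty: float = 3.0) -> List[List[str]]:
    """Return up to k distinct paths, penalising links used by earlier paths."""
    g = {u: dict(nbrs) for u, nbrs in graph.items()}  # work on a copy
    paths: List[List[str]] = []
    for _ in range(k):
        p = shortest_path(g, src, dst)
        if p is None:
            break
        if p not in paths:
            paths.append(p)
        for u, v in zip(p, p[1:]):
            g[u][v] *= penalty
            if u in g.get(v, {}):   # penalise the reverse direction too
                g[v][u] *= penalty
    return paths


# Hypothetical four-node topology with symmetric link costs.
topology = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 2},
    "D": {"B": 1, "C": 2},
}
routes = multipath(topology, "A", "D", k=2)
```

On this topology the first path is A, B, D; after its links are penalised the second run prefers A, C, D, giving two link-disjoint routes over which traffic could be distributed.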

    Characterisation of noisy speech channels in 2G and 3G mobile networks

    As the wireless cellular market reaches unprecedented levels of competition, network operators need to keep Quality of Service (QoS) a main priority if they wish to attract new subscribers while keeping existing customers satisfied. Speech quality as perceived by the end user is one major example of a characteristic in constant need of maintenance and improvement, and it is the topic of this Master's thesis project. The project uses an intrusive method of speech quality evaluation to further study and characterize the performance of speech codecs in second-generation (2G) and third-generation (3G) technologies. It also looks for further correlation between codecs with similar bit rates and explores certain transmission parameters that may aid in the assessment of speech quality. Due to some limitations of the audio analyzer equipment that was to be employed, a different system for recording the test samples was sought. Although the newly designed system is not standard, after extensive testing and optimization of the system's parameters the final results were found to be reliable and satisfactory. The tests cover a set of high and low bit rate codecs for both 2G and 3G; the values were compared and analysed, leading to the conclusion that 3G speech codecs perform better than 2G codecs under approximately the same conditions. This reinforces the idea that 3G is, without doubt, the best choice if the customer is looking for the best possible listening speech quality. The transmission parameters chosen for the experiment, Receiver Quality (RxQual) and Received Energy per Chip to Power Density Ratio (Ec/N0), were subject to speech quality correlation tests. The final RxQual results were compared with those of prior studies by other researchers and are considered highly relevant, confirming RxQual as a reliable indicator of speech quality. 
As for Ec/N0, it cannot be declared a speech quality indicator; however, it shows clear thresholds at which the MOS values decrease significantly. The studied transmission parameters can be used not only for network management purposes but also to give the communications engineer (or technician) an idea of the expected end-to-end speech quality consequences. The conclusion of this work suggests new ideas for future studies. Considering that fourth-generation (4G) cellular technologies are now beginning to take an important place in the global market, as the first all-IP network structure, it seems highly relevant that 4G speech quality should be evaluated and compared with 3G, not only in narrowband but also in wideband scenarios, using the most recent standard objective method of speech quality assessment, POLQA. In addition, the new data found in the Ec/N0 tests justify further research to validate the assumptions made in this work.

With the mobile network market reaching levels of competitiveness never seen before, network operators increasingly need to focus on Quality of Service (QoS) as their main priority, in order to attract new customers while ensuring the satisfaction of their current subscribers. The user's perception of speech quality is just one example of a QoS characteristic in constant need of maintenance and improvement, and it is the subject of this Master's thesis. An intrusive method of speech quality evaluation is applied as a means for a more in-depth study that also characterizes the performance of speech codecs for second-generation (2G) and third-generation (3G) technologies. 
The work investigates what new information can be drawn from the correlation between codecs with similar bit rates, and explores certain transmission parameters that may help in assessing speech quality. Due to some limitations of the audio analyzer (a requirement in this type of application), a different system had to be found for recording the test samples. Although the chosen system is not standardized for this type of test, after several tests and the resulting optimization of the system parameters, the final results are considered credible and satisfactory. The tests performed include a set of high and low bit rate codecs, and the comparison and analysis of the results lead to the conclusion that 3G speech codecs perform better than 2G codecs under approximately the same conditions. This reinforces the widespread idea that 3G is, without doubt, the best choice if the user is looking for a superior solution in terms of speech quality. The transmission parameters chosen for the experiment, RxQual (quality of the signal received by the mobile station) and Ec/N0 (ratio of energy per chip to power spectral density), were subjected to correlation tests against speech quality. The RxQual results were compared with previous studies by other researchers, confirming this parameter as a highly reliable indicator of speech quality. As for Ec/N0, it cannot be declared a speech quality indicator; however, it shows clear thresholds at which the Mean Opinion Score (MOS) values decrease significantly. The transmission parameters studied can be used not only for network management purposes but can also provide the engineer (or technician) with information on the possible impact on speech quality. 
With the completion of this work, it is clear that further studies should be carried out. Considering that fourth-generation (4G) technology is now taking its first steps in the mobile network market, as the first with a fully IP-oriented network architecture, it seems highly important that this technology be evaluated and compared with 3G, not only for narrowband (300 to 3400 Hz) but also for wideband (50 to 7000 Hz) scenarios, applying the most recent standardized method of speech quality assessment, POLQA. Finally, a continuation of the study of Ec/N0 is also considered pertinent in order to validate the conclusions drawn in this work.

    SSIM-Inspired Quality Assessment, Compression, and Processing for Visual Communications

    Objective Image and Video Quality Assessment (I/VQA) measures predict image/video quality as perceived by human beings - the ultimate consumers of visual data. Existing research in the area is mainly limited to benchmarking and monitoring of visual data. The use of I/VQA measures in the design and optimization of image/video processing algorithms and systems is more desirable, challenging and fruitful but has not been well explored. Among the recently proposed objective I/VQA approaches, the structural similarity (SSIM) index and its variants have emerged as promising measures that show superior performance as compared to the widely used mean squared error (MSE) and are computationally simple compared with other state-of-the-art perceptual quality measures. In addition, SSIM has a number of desirable mathematical properties for optimization tasks. The goal of this research is to break the tradition of using MSE as the optimization criterion for image and video processing algorithms. We tackle several important problems in visual communication applications by exploiting SSIM-inspired design and optimization to achieve significantly better performance. Firstly, the original SSIM is a Full-Reference IQA (FR-IQA) measure that requires access to the original reference image, making it impractical in many visual communication applications. We propose a general purpose Reduced-Reference IQA (RR-IQA) method that can estimate SSIM with high accuracy with the help of a small number of RR features extracted from the original image. Furthermore, we introduce and demonstrate the novel idea of partially repairing an image using RR features. Secondly, image processing algorithms such as image de-noising and image super-resolution are required at various stages of visual communication systems, starting from image acquisition to image display at the receiver. 
We incorporate SSIM into the framework of sparse signal representation and non-local means methods and demonstrate improved performance in image de-noising and super-resolution. Thirdly, we incorporate SSIM into the framework of perceptual video compression. We propose an SSIM-based rate-distortion optimization scheme and an SSIM-inspired divisive optimization method that transforms the DCT domain frame residuals to a perceptually uniform space. Both approaches demonstrate the potential to substantially improve the rate-distortion performance of state-of-the-art video codecs. Finally, in real-world visual communications, it is a common experience that end-users receive video with significantly time-varying quality due to variations in video content/complexity, codec configuration, and network conditions. How human visual quality of experience (QoE) changes with such time-varying video quality is not yet well understood. We propose a quality adaptation model that is asymmetrically tuned to increasing and decreasing quality. The model improves upon the direct SSIM approach in predicting the subjective perceptual experience of time-varying video quality.
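
As background for the SSIM index on which this work builds, the sketch below computes the standard SSIM formula from global statistics of two equal-length pixel sequences. This is a simplification: the usual SSIM index is computed over local windows and then averaged, and none of the thesis's reduced-reference or rate-distortion methods are reproduced here. The function name `ssim_global` is hypothetical; the constants K1 = 0.01 and K2 = 0.03 are the commonly used defaults.

```python
from typing import Sequence


def ssim_global(x: Sequence[float], y: Sequence[float],
                dynamic_range: float = 255.0,
                k1: float = 0.01, k2: float = 0.03) -> float:
    """SSIM from global means, variances, and covariance of two signals.

    SSIM = ((2*mu_x*mu_y + C1) * (2*cov_xy + C2))
           / ((mu_x^2 + mu_y^2 + C1) * (var_x + var_y + C2))
    with C1 = (k1 * L)^2 and C2 = (k2 * L)^2 for dynamic range L.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [10.0, 50.0, 90.0, 130.0]
identical_score = ssim_global(reference, reference)
reversed_score = ssim_global(reference, reference[::-1])
```

An image compared with itself scores 1, while reversing the pixel order keeps the mean and variance but makes the covariance negative, so the score drops below zero. This sensitivity to structure is what distinguishes SSIM from MSE, which depends only on pointwise differences.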

    IberSPEECH 2020: XI Jornadas en Tecnología del Habla and VII Iberian SLTech

    IberSPEECH2020 is a two-day event that brings together the best researchers and practitioners in speech and language technologies for Iberian languages to promote interaction and discussion. The organizing committee has planned a wide variety of scientific and social activities, including technical paper presentations, keynote lectures, project presentations, laboratory activities, recent PhD theses, discussion panels, a round table, and awards for the best thesis and papers. The program of IberSPEECH2020 includes a total of 32 contributions, distributed among 5 oral sessions, a PhD session, and a projects session. To ensure the quality of all contributions, each submitted paper was reviewed by three members of the scientific review committee. All the papers in the conference will be accessible through the International Speech Communication Association (ISCA) Online Archive. Paper selection was based on the scores and comments provided by the scientific review committee, which includes 73 researchers from different institutions (mainly from Spain and Portugal, but also from France, Germany, Brazil, Iran, Greece, Hungary, the Czech Republic, Ukraine, and Slovenia). Furthermore, an extension of selected papers will be published as a special issue of the Journal of Applied Sciences, “IberSPEECH 2020: Speech and Language Technologies for Iberian Languages”, published by MDPI with fully open access. In addition to regular paper sessions, the IberSPEECH2020 scientific program features the following activities: the ALBAYZIN evaluation challenge session.

Red Española de Tecnologías del Habla. Universidad de Valladoli