
35 research outputs found

Applications of analysis and synthesis techniques for complex sounds

Author: ZHU XINGLEI
Publication date: 23/12/2004

Master's, MASTER OF SCIENCE

ScholarBank@NUS

Error resilience and concealment techniques for high-efficiency video coding

Author: Joao F.M. Carreira (7185014)
Publication date: 01/01/2018

This thesis investigates the problem of robust coding and error concealment in High Efficiency Video Coding (HEVC). After a review of the current state of the art, a simulation study of error robustness revealed that HEVC has weak protection against network losses, which significantly degrade video quality. Based on this evidence, the first contribution of this work is a new method to reduce the temporal dependencies between motion vectors, improving the decoded video quality without compromising compression efficiency. The second contribution of this thesis is a two-stage approach for reducing the mismatch of temporal predictions when video streams are received with errors or lost data. At the encoding stage, the reference pictures are dynamically distributed based on a constrained Lagrangian rate-distortion optimization to reduce the number of predictions from a single reference. At the streaming stage, a prioritization algorithm based on spatial dependencies selects a reduced set of motion vectors to be transmitted as side information, to reduce mismatched motion predictions at the decoder. The problem of error concealment-aware video coding is also investigated to enhance the overall error robustness. A new approach based on scalable coding and optimal error concealment selection is proposed, where the optimal error concealment modes are found by simulating transmission losses, followed by a saliency-weighted optimisation. Moreover, recovery residual information is encoded using a rate-controlled enhancement layer. Both are transmitted to the decoder to be used in case of data loss. Finally, an adaptive error resilience scheme is proposed to dynamically predict the video stream that achieves the highest decoded quality for a particular loss case.
A neural network selects among the various video streams, encoded with different levels of compression efficiency and error protection, based on information from the video signal, the coded stream and the transmission network. Overall, the new robust video coding methods investigated in this thesis yield consistent quality gains in comparison with other existing methods and with those implemented in the HEVC reference software. Furthermore, the trade-off between coding efficiency and error robustness is also better in the proposed methods.

Loughborough University Institutional Repository

Review of Research on Speech Technology: Main Contributions From Spanish Research Groups

Author: Martínez Hinarejos Carlos D.
Ortega Alfonso
San Segundo Hernández Rubén
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/01/2011

In the last two decades, research on speech technology in Spain has increased substantially, mainly due to a higher level of funding from European, Spanish and local institutions and also due to a growing interest in these technologies for developing new services and applications. This paper provides a review of the main areas of speech technology addressed by research groups in Spain, their main contributions in recent years and their current focus of interest. The description is organized into five main areas: audio processing including speech, speaker characterization, speech and language processing, text-to-speech conversion and spoken language applications. This paper also introduces the Spanish Network of Speech Technologies (RTTH, Red Temática en Tecnologías del Habla) as the research network that includes almost all the researchers working in this area, presenting some figures, its objectives and the main activities it has carried out in recent years.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

RiuNet

Archivo Digital UPM

Quality aspects of Internet telephony

Author: Marsh Ian R.
Publication date: 01/01/2009

Internet telephony has had a tremendous impact on how people communicate. Many now maintain contact using some form of Internet telephony. The motivation for this work has therefore been to address the quality aspects of real-world Internet telephony for both fixed and wireless telecommunication. The focus has been on the quality aspects of voice communication, since poor quality often leads to user dissatisfaction. The scope of the work has been broad in order to address the main factors within IP-based voice communication. The first four chapters of this dissertation constitute the background material. The first chapter outlines where Internet telephony is deployed today. It also motivates the topics and techniques used in this research. The second chapter provides the background on Internet telephony including signalling, speech coding and voice Internetworking. The third chapter focuses solely on quality measures for packetised voice systems and finally the fourth chapter is devoted to the history of voice research. The appendix of this dissertation constitutes the research contributions. It includes an examination of the access network, focusing on how calls are multiplexed in wired and wireless systems. Subsequently, in the wireless case, we consider how to hand over calls from 802.11 networks to the cellular infrastructure. We then consider the Internet backbone, where most of our work is devoted to measurements specifically for Internet telephony. These measurements have been applied to estimating telephony arrival processes, measuring call quality, and quantifying the trend in Internet telephony quality over several years. We also consider the end systems, since they are responsible for reconstructing a voice stream given loss and delay constraints. Finally, we estimate voice quality using the ITU proposal PESQ and the packet loss process. The main contribution of this work is a systematic examination of Internet telephony.
We describe several methods to enable adaptable solutions for maintaining consistent voice quality. We have also found that relatively small technical changes can lead to substantial user quality improvements. A second contribution of this work is a suite of software tools designed to ascertain voice quality in IP networks. Some of these tools are in use within commercial systems today.

Publikationer från KTH

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Packet prioritizing and delivering for multimedia streaming

Author: NGUYEN VU THANH
Publication date: 29/08/2008

Ph.D, DOCTOR OF PHILOSOPHY

ScholarBank@NUS

Quality of Service Controlled Multimedia Transport Protocol

Author: Carvalho Antonio J.
Publication venue: Queen Mary University of London
Publication date: 01/01/2000

PhD

This research looks at the design of an open transport protocol that supports a range of services including multimedia over low data-rate networks. Low data-rate multimedia applications require a system that provides quality of service (QoS) assurance and flexibility. One promising field is the area of content-based coding. Content-based systems use an array of protocols to select the optimum set of coding algorithms. A content-based transport protocol integrates a content-based application to a transmission network. General transport protocols form a bottleneck in low data-rate multimedia communications by limiting throughput or by not maintaining timing requirements. This work presents an original model of a transport protocol that eliminates the bottleneck by introducing a flexible yet efficient algorithm that uses an open approach to flexibility and holistic architecture to promote QoS. The flexibility and transparency comes in the form of a fixed syntax that provides a set of transport protocol semantics. The media QoS is maintained by defining a generic descriptor. Overall, the structure of the protocol is based on a single adaptable algorithm that supports application independence, network independence and quality of service. The transport protocol was evaluated through a set of assessments: off-line; off-line for a specific application; and on-line for a specific application. Application contexts used MPEG-4 test material where the on-line assessments used a modified MPEG-4 player. The performance of the QoS controlled transport protocol is often better than other schemes when appropriate QoS controlled management algorithms are selected. This is shown first for an off-line assessment where the performance is compared between the QoS controlled multiplexer, an emulated MPEG-4 FlexMux multiplexer scheme, and the target requirements.
The performance is also shown to be better in a real environment when the QoS controlled multiplexer is compared with the real MPEG-4 FlexMux scheme.

Queen Mary Research Online

Uncertainty in Signal Estimation and Stochastic Weighted Viterbi Algorithm: A Unified Framework to Address Robustness in Speech Recognition and Speaker Verification

Author: C. Garreton
C. Molina
F. Huenupan
N. Becerra Yoma
Publication venue: IntechOpen
Publication date: 01/06/2007

IntechOpen