729 research outputs found
Independent formant and pitch control applied to singing voice
Thesis (MScIng)--University of Stellenbosch, 2004.ENGLISH ABSTRACT: A singing voice can be manipulated artificially by means of a digital computer for the
purposes of creating new melodies or to correct existing ones. When the fundamental frequency
of an audio signal that represents a human voice is changed by simple algorithms,
the formants of the voice tend to move to new frequency locations, making it sound unnatural.
The main purpose is to design a technique by which the pitch and formants of a
singing voice can be controlled independently.AFRIKAANSE OPSOMMING: Onafhanklike formant- en toonhoogte beheer toegepas op ’n sangstem: ’n Sangstem kan
deur ’n digitale rekenaar gemanipuleer word om nuwe melodie¨e te skep, of om bestaandes
te verbeter. Wanneer die fundamentele frekwensie van ’n klanksein (wat ’n menslike stem
voorstel) deur ’n eenvoudige algoritme verander word, skuif die oorspronklike formante
na nuwe frekwensie gebiede. Dit veroorsaak dat die resultaat onnatuurlik klink. Die hoof
oogmerk is om ’n tegniek te ontwerp wat die toonhoogte en die formante van ’n sangstem
apart kan beheer
Speech enhancement by perceptual adaptive wavelet de-noising
This thesis work summarizes and compares the existing wavelet de-noising methods. Most popular methods of wavelet transform, adaptive thresholding, and musical noise suppression have been analyzed theoretically and evaluated through Matlab simulation. Based on the above work, a new speech enhancement system using adaptive wavelet de-noising is proposed. Each step of the standard wavelet thresholding is improved by optimized adaptive algorithms. The Quantile based adaptive noise estimate and the posteriori SNR based threshold adjuster are compensatory to each other. The combination of them integrates the advantages of these two approaches and balances the effects of noise removal and speech preservation. In order to improve the final perceptual quality, an innovative musical noise analysis and smoothing algorithm and a Teager Energy Operator based silent segment smoothing module are also introduced into the system. The experimental results have demonstrated the capability of the proposed system in both stationary and non-stationary noise environments
Quality aspects of Internet telephony
Internet telephony has had a tremendous impact on how people communicate.
Many now maintain contact using some form of Internet telephony.
Therefore the motivation for this work has been to address the quality aspects
of real-world Internet telephony for both fixed and wireless telecommunication.
The focus has been on the quality aspects of voice communication,
since poor quality leads often to user dissatisfaction. The scope of the work
has been broad in order to address the main factors within IP-based voice
communication.
The first four chapters of this dissertation constitute the background
material. The first chapter outlines where Internet telephony is deployed
today. It also motivates the topics and techniques used in this research.
The second chapter provides the background on Internet telephony including
signalling, speech coding and voice Internetworking. The third chapter
focuses solely on quality measures for packetised voice systems and finally
the fourth chapter is devoted to the history of voice research.
The appendix of this dissertation constitutes the research contributions.
It includes an examination of the access network, focusing on how calls are
multiplexed in wired and wireless systems. Subsequently in the wireless
case, we consider how to handover calls from 802.11 networks to the cellular
infrastructure. We then consider the Internet backbone where most of our
work is devoted to measurements specifically for Internet telephony. The
applications of these measurements have been estimating telephony arrival
processes, measuring call quality, and quantifying the trend in Internet telephony
quality over several years. We also consider the end systems, since
they are responsible for reconstructing a voice stream given loss and delay
constraints. Finally we estimate voice quality using the ITU proposal PESQ
and the packet loss process.
The main contribution of this work is a systematic examination of Internet
telephony. We describe several methods to enable adaptable solutions
for maintaining consistent voice quality. We have also found that relatively
small technical changes can lead to substantial user quality improvements.
A second contribution of this work is a suite of software tools designed to
ascertain voice quality in IP networks. Some of these tools are in use within
commercial systems today
Virtual PCF: Improving VoIP over WLAN performance with legacy clients
Abstract
Voice over IP (VoIP) is one of the fastest growing applications on the Internet.
Concurrently, 802.11 Wireless LANs (WLANs) have become ubiquitous in residential, enterprise, campus and public networks. Currently the majority of traffic on
WLANs is data traffic but as more people use wireless networks as their primary
access medium, a greater portion of traffic will be real-time traffic such as VoIP traffic. Unfortunately 802.11 networks are designed to handle delay-insensitive, bursty
traffic and perform poorly for VoIP streams. Experimental and analytical results
have shown that a single 802.11b access point operating at the maximum 11 Mbps
rate can support only 5 to 10 VoIP connections simultaneously. Intuitively, an 11 Mbps link should support approximately 85 bi-directional 64Kbps (G.711) streams.
The reason for this under-utilization lies primarily in the Distributed Coordination
Function (DCF) used by 802.11 MAC layer. The problem can be addressed by using the optional Point Coordination Function (PCF). However PCF is not widely
implemented in commodity hardware nor likely to be. There is a similar problem
with the proposed 802.11e standard for quality of service. To solve these problems
we propose Virtual PCF, a legacy-client compatible solution to increase the number
of simultaneous VoIP calls. We implement Virtual PCF, a scheme which employs
a variety of techniques to improve both uplink and downlink VoIP QoS. This alleviates delays and packet loss due to DCF contention and doubles the number of
supported VoIP sessions
Speech quality prediction for voice over Internet protocol networks
Merged with duplicate record 10026.1/878 on 03.01.2017 by CS (TIS). Merged with duplicate record 10026.1/1657 on 15.03.2017 by CS (TIS)This is a digitised version of a thesis that was deposited in the University Library. If you are the author please contact PEARL Admin ([email protected]) to discuss options.IP networks are on a steep slope of innovation that will make them the long-term carrier
of all types of traffic, including voice. However, such networks are not designed to support
real-time voice communication because their variable characteristics (e.g. due to delay, delay
variation and packet loss) lead to a deterioration in voice quality. A major challenge in such networks
is how to measure or predict voice quality accurately and efficiently for QoS monitoring
and/or control purposes to ensure that technical and commercial requirements are met.
Voice quality can be measured using either subjective or objective methods. Subjective
measurement (e.g. MOS) is the benchmark for objective methods, but it is slow, time consuming
and expensive. Objective measurement can be intrusive or non-intrusive. Intrusive methods
(e.g. ITU PESQ) are more accurate, but normally are unsuitable for monitoring live traffic
because of the need for a reference data and to utilise the network. This makes non-intrusive
methods(e.g. ITU E-model) more attractive for monitoring voice quality from IP network impairments.
However, current non-intrusive methods rely on subjective tests to derive model
parameters and as a result are limited and do not meet new and emerging applications.
The main goal of the project is to develop novel and efficient models for non-intrusive
speech quality prediction to overcome the disadvantages of current subjective-based methods
and to demonstrate their usefulness in new and emerging VoIP applications. The main contributions
of the thesis are fourfold:
(1) a detailed understanding of the relationships between voice quality, IP network impairments
(e.g. packet loss, jitter and delay) and relevant parameters associated with speech (e.g.
codec type, gender and language) is provided. An understanding of the perceptual effects of
these key parameters on voice quality is important as it provides a basis for the development
of non-intrusive voice quality prediction models. A fundamental investigation of the impact of
the parameters on perceived voice quality was carried out using the latest ITU algorithm for
perceptual evaluation of speech quality, PESQ, and by exploiting the ITU E-model to obtain an
objective measure of voice quality.
(2) a new methodology to predict voice quality non-intrusively was developed. The method
exploits the intrusive algorithm, PESQ, and a combined PESQ/E-model structure to provide a
perceptually accurate prediction of both listening and conversational voice quality non-intrusively.
This avoids time-consuming subjective tests and so removes one of the major obstacles in the
development of models for voice quality prediction. The method is generic and as such has
wide applicability in multimedia applications. Efficient regression-based models and robust
artificial neural network-based learning models were developed for predicting voice quality
non-intrusively for VoIP applications.
(3) three applications of the new models were investigated: voice quality monitoring/prediction
for real Internet VoIP traces, perceived quality driven playout buffer optimization and
perceived quality driven QoS control. The neural network and regression models were both
used to predict voice quality for real Internet VoIP traces based on international links. A new
adaptive playout buffer and a perceptual optimization playout buffer algorithms are presented.
A QoS control scheme that combines the strengths of rate-adaptive and priority marking control
schemes to provide a superior QoS control in terms of measured perceived voice quality is
also provided.
(4) a new methodology for Internet-based subjective speech quality measurement which
allows rapid assessment of voice quality for VoIP applications is proposed and assessed using
both objective and traditional MOS test methods
Towards an interactive framework for robot dancing applications
Estágio realizado no INESC-Porto e orientado pelo Prof. Doutor Fabien GouyonTese de mestrado integrado. Engenharia Electrotécnica e de Computadores - Major Telecomunicações. Faculdade de Engenharia. Universidade do Porto. 200
User-Centric Quality of Service Provisioning in IP Networks
The Internet has become the preferred transport medium for almost every type of communication, continuing to grow, both in terms of the number of users and delivered services. Efforts have been made to ensure that time sensitive applications receive sufficient resources and subsequently receive an acceptable Quality of Service (QoS). However, typical Internet users no longer use a single service at a given point in time, as they are instead engaged in a multimedia-rich experience, comprising of many different concurrent services. Given the scalability problems raised by the diversity of the users and traffic, in conjunction with their increasing expectations, the task of QoS provisioning can no longer be approached from the perspective of providing priority to specific traffic types over coexisting services; either through explicit resource reservation, or traffic classification using static policies, as is the case with the current approach to QoS provisioning, Differentiated Services (Diffserv). This current use of static resource allocation and traffic shaping methods reveals a distinct lack of synergy between current QoS practices and user activities, thus highlighting a need for a QoS solution reflecting the user services.
The aim of this thesis is to investigate and propose a novel QoS architecture, which considers the activities of the user and manages resources from a user-centric perspective. The research begins with a comprehensive examination of existing QoS technologies and mechanisms, arguing that current QoS practises are too static in their configuration and typically give priority to specific individual services rather than considering the user experience. The analysis also reveals the potential threat that unresponsive application traffic presents to coexisting Internet services and QoS efforts, and introduces the requirement for a balance between application QoS and fairness.
This thesis proposes a novel architecture, the Congestion Aware Packet Scheduler (CAPS), which manages and controls traffic at the point of service aggregation, in order to optimise the overall QoS of the user experience. The CAPS architecture, in contrast to traditional QoS alternatives, places no predetermined precedence on a specific traffic; instead, it adapts QoS policies to each individual’s Internet traffic profile and dynamically controls the ratio of user services to maintain an optimised QoS experience. The rationale behind this approach was to enable a QoS optimised experience to each Internet user and not just those using preferred services. Furthermore, unresponsive bandwidth intensive applications, such as Peer-to-Peer, are managed fairly while minimising their impact on coexisting services.
The CAPS architecture has been validated through extensive simulations with the topologies used replicating the complexity and scale of real-network ISP infrastructures. The results show that for a number of different user-traffic profiles, the proposed approach achieves an improved aggregate QoS for each user when compared with Best effort Internet, Traditional Diffserv and Weighted-RED configurations. Furthermore, the results demonstrate that the proposed architecture not only provides an optimised QoS to the user, irrespective of their traffic profile, but through the avoidance of static resource allocation, can adapt with the Internet user as their use of services change.France Teleco
Models and Analysis of Vocal Emissions for Biomedical Applications
The MAVEBA Workshop proceedings, held on a biannual basis, collect the scientific papers presented both as oral and poster contributions, during the conference. The main subjects are: development of theoretical and mechanical models as an aid to the study of main phonatory dysfunctions, as well as the biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and classification of vocal pathologies
- …