Modelling and extraction of fundamental frequency in speech signals
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. One of the most important parameters of speech is the fundamental frequency of vibration of voiced sounds. The audio sensation of the fundamental frequency is known as the pitch. Depending on the tonal/non-tonal category of language, the fundamental frequency conveys intonation, pragmatics and meaning. In addition the fundamental frequency and intonation carry speaker gender, age, identity, speaking style and emotional state. Accurate estimation of the fundamental frequency is critically important for the functioning of speech processing applications such as speech coding, speech recognition, speech synthesis and voice morphing. This thesis makes contributions to the development of accurate pitch estimation research in three distinct ways: (1) an investigation of the impact of the window length on pitch estimation error, (2) an investigation of the use of the higher order moments and (3) an investigation of an analysis-synthesis method for selection of the best pitch value among N proposed candidates. Experimental evaluations show that the length of the speech window has a major impact on the accuracy of pitch estimation. Depending on the similarity criteria and the order of the statistical moment a window length of 37 to 80 ms gives the least error. In order to avoid excessive delay as a consequence of using a longer window, a method is proposed
where the current short window is concatenated with the previous frames to form a longer signal window for pitch extraction. The use of second order and higher order moments, and the magnitude difference function, as the similarity criteria were explored and compared. A novel method of calculation of moments is introduced where the signal is split, i.e. rectified, into positive and negative valued samples. The moments for the positive and negative parts of the signal are computed separately and combined. The new method of calculation of moments from positive and negative parts and the higher order criteria provide competitive results. A challenging issue in pitch estimation is the determination of the best candidate from N extrema of the similarity criteria. The analysis-synthesis method proposed in this thesis selects the pitch candidate that provides the best reproduction (synthesis) of the harmonic spectrum of the original speech. The synthesis method must be such that the distortion increases with the increasing error in the estimate of the fundamental frequency. To this end a new method of spectral synthesis is proposed using an estimate of the spectral envelope and harmonically spaced asymmetric Gaussian pulses as excitation. The N-best method provides consistent reduction in pitch estimation error. The methods described in this thesis result in a significant improvement in the pitch accuracy and outperform the benchmark YIN method
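As a rough illustration of the magnitude difference function mentioned in the abstract above, the following sketch estimates the fundamental frequency of a synthetic frame by picking the lag that minimises the average magnitude difference. It is not the thesis's method: the function names, the 60 to 400 Hz search range, the 8 kHz sample rate and the 50 ms window are all assumptions chosen for illustration, and the test signal is built to be exactly periodic so that the result is unambiguous.

```python
import math


def amdf(frame, lag):
    """Average magnitude difference between the frame and a copy shifted by `lag` samples."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def estimate_f0(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Return the frequency (Hz) whose lag minimises the AMDF within [f_min, f_max].

    The search range is a hypothetical choice for illustration. Ties between
    equally good lags resolve to the shortest lag, because min() returns the
    first minimum it meets and lags are scanned in ascending order.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag = min(range(lag_min, lag_max + 1), key=lambda lag: amdf(frame, lag))
    return sample_rate / best_lag


# Synthetic "voiced" frame: one 40-sample period (200 Hz at 8 kHz) containing a
# fundamental and its second harmonic, repeated to fill a 50 ms window.
sample_rate = 8000
period_len = 40
one_period = [
    math.sin(2 * math.pi * k / period_len) + 0.5 * math.sin(4 * math.pi * k / period_len)
    for k in range(period_len)
]
window_ms = 50
n_samples = sample_rate * window_ms // 1000
frame = [one_period[i % period_len] for i in range(n_samples)]

f0 = estimate_f0(frame, sample_rate)
print(f0)
```

On real speech the AMDF usually has several comparably deep minima (for example at multiples of the true period), which is the candidate-selection problem the abstract describes; this sketch simply takes the first global minimum.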
Data Acquisition Applications
Data acquisition systems have numerous applications. This book has a total of 13 chapters and is divided into three sections: Industrial applications, Medical applications and Scientific experiments. The chapters are written by experts from around the world, while the targeted audience for this book includes professionals who are designers or researchers in the field of data acquisition systems. Faculty members and graduate students could also benefit from the book
Proceedings of the Second International Mobile Satellite Conference (IMSC 1990)
Presented here are the proceedings of the Second International Mobile Satellite Conference (IMSC), held June 17-20, 1990 in Ottawa, Canada. Topics covered include future mobile satellite communications concepts, aeronautical applications, modulation and coding, propagation and experimental systems, mobile terminal equipment, network architecture and control, regulatory and policy considerations, vehicle antennas, and speech compression
Modeling and rendering for development of a virtual bone surgery system
A virtual bone surgery system is developed to provide the potential of a realistic, safe, and controllable environment for surgical education. It can be used for training in orthopedic surgery, as well as for planning and rehearsal of bone surgery procedures...Using the developed system, the user can perform virtual bone surgery by simultaneously seeing bone material removal through a graphic display device, feeling the force via a haptic device, and hearing the sound of tool-bone interaction --Abstract, page iii
Models and Analysis of Vocal Emissions for Biomedical Applications
The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the particularly felt need of sharing know-how, objectives and results between areas that until then seemed quite distinct such as bioengineering, medicine and singing. MAVEBA deals with all aspects concerning the study of the human voice with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread also in other aspects of research such as occupational voice disorders, neurology, rehabilitation, image and video analysis. MAVEBA takes place every two years always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis
On the integration of deformation and relief measurement using ESPI
The combination of relief and deformation measurement is investigated for improving the accuracy of Electronic Speckle-Pattern Interferometry (ESPI) data. The nature of sensitivity variations within different types of interferometers and with different shapes of objects is analysed, revealing significant variations for some common interferometers. Novel techniques are developed for real-time measurement of dynamic events by means of carrier fringes. This allows quantification of deformation and relief, where the latter is used in the correction of the sensitivity variations of the former
Recent Advances in Signal Processing
The signal processing task is a very critical issue in the majority of new technological inventions and challenges in a variety of applications in both science and engineering fields. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity
Models and analysis of vocal emissions for biomedical applications
This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies
A study of brass instrument acoustics using an artificial lip reed mechanism, laser Doppler anemometry and other techniques
The self-sustained oscillation of a brass wind musical instrument involves a complex aerodynamic coupling between a multimode mechanical vibratory system (the lips of the player) and a multimode acoustical vibratory system (the air column of the instrument). In this thesis the behaviour of the coupled system near the threshold of oscillation is investigated using a simplified model in which a single mechanical lip mode is coupled to a single mode of the acoustical resonator by air flow through the lips. The theoretical threshold behaviour is compared with the measured threshold behaviour of a trombone sounded by an artificial lip reed mechanism. Comparability between theory and experiment is ensured by using model parameter values
derived from mechanical response measurements on the artificial lips and input impedance measurements on the trombone. The mechanical response measurements can be used to classify mechanical modes of the artificial lips unambiguously as either "inward striking" or "outward striking". Each of the
embouchures considered is found to have at least one mechanical mode of each category. The experimentally observed threshold frequencies of the coupled system suggest a behaviour which passes smoothly from "inward striking" to "outward striking" character as the trombone slide is extended or the embouchure parameters changed. It seems unlikely that this type of behaviour can be explained using a lip model with only a single degree of freedom. After a discussion of the theory of laser Doppler anemometry (LDA), the technique is applied to the problem of measuring the instantaneous acoustic particle velocity within a standing wave pipe driven by a loudspeaker. The resulting Doppler signals display quasi-periodic amplitude
modulation with a fundamental frequency equal to the frequency of the acoustic field. The phenomenon of amplitude modulation is investigated in some detail. Two different methods of analysing Doppler signals are compared: the digital Hilbert transform and the Disa analogue frequency tracker; the analogue tracker is found to offer the greater signal-to-noise ratio and dynamic range. Experiments are carried out to establish how phase
errors introduced by the analogue tracker can be minimised. Velocity measurements extracted from Doppler signals using the analogue tracker are compared with the velocity deduced by applying basic theory to probe microphone pressure measurements. It is found that the acoustic particle velocity amplitude can be measured accurately
over the entire frequency range considered, and the phase of the acoustic particle velocity also agrees well with theory, but not at low frequencies. LDA is successfully applied to the measurement of multi-harmonic sound fields. The technique of ensemble averaging velocity signals is shown to be particularly useful. LDA is used to measure the velocity in the backbore of a specially designed transparent mouthpiece, driven by the artificial lip reed. Although significant levels of turbulence are encountered, it is shown that acoustic components can still be clearly distinguished in frequency domain representations of the measured velocity. However, LDA measurements in the mouthpiece are restricted to conditions where the acoustic particle velocity amplitude and the turbulent intensity are sufficiently low to ensure that the bandwidth of the Doppler signal is less than the bandwidth of the apparatus used to capture or process the Doppler signal. LDA measurements in brass instrument mouthpieces should provide a better understanding of the air flow into the mouthpiece and may lead to an improved model for self-sustained oscillation of the coupled system which more accurately describes the air flow
IberSPEECH 2020: XI Jornadas en Tecnología del Habla and VII Iberian SLTech
IberSPEECH2020 is a two-day event, bringing together the best researchers and practitioners in speech and language technologies in Iberian languages to promote interaction and discussion. The organizing committee has planned a wide variety of scientific and social activities, including technical paper presentations, keynote lectures, presentation of projects, laboratory activities, recent PhD theses, discussion panels, a round table, and awards for the best thesis and papers. The program of IberSPEECH2020 includes a total of 32 contributions, distributed among 5 oral sessions, a PhD session, and a projects session. To ensure the quality of all the contributions, each submitted paper was reviewed by three members of the scientific review committee. All the papers in the conference will be accessible through the International Speech Communication Association (ISCA) Online Archive. Paper selection was based on the scores and comments provided by the scientific review committee, which includes 73 researchers from different institutions (mainly from Spain and Portugal, but also from France, Germany, Brazil, Iran, Greece, Hungary, Czech Republic, Ukraine, Slovenia). Furthermore, extended versions of selected papers will be published as a special issue of the Journal of Applied Sciences, “IberSPEECH 2020: Speech and Language Technologies for Iberian Languages”, published by MDPI with fully open access. In addition to regular paper sessions, the IberSPEECH2020 scientific program features the following activities: the ALBAYZIN evaluation challenge session. Red Española de Tecnologías del Habla. Universidad de Valladoli