363 research outputs found

Tone classification of syllable-segmented Thai speech based on multilayer perceptron

Author: Satravaha Nuttavudh
Publication venue: The Research Repository @ WVU
Publication date: 01/05/2002

Thai is a monosyllabic and tonal language that uses tone to convey lexical information about the meaning of a syllable. Thai has five distinctive tones, and each tone is well represented by a single F0 contour pattern. In general, a Thai syllable with a different tone has a different lexical meaning. Thus, to completely recognize a spoken Thai syllable, a speech recognition system must not only recognize the base syllable but also correctly identify its tone. Hence, tone classification of Thai speech is an essential part of a Thai speech recognition system. In this study, a tone classifier for syllable-segmented Thai speech that incorporates the effects of tonal coarticulation, stress, and intonation was developed. Automatic syllable segmentation, which segments the training and test utterances into syllable units, was also developed. Acoustic features including fundamental frequency (F0), duration, and energy, extracted from the syllable being processed and its neighboring syllables, were used as the main discriminating features. A multilayer perceptron (MLP) trained by the backpropagation method was employed to classify these features. The proposed system was evaluated on 920 test utterances spoken by five male and three female Thai speakers who also uttered the training speech. The proposed system achieved an average accuracy rate of 91.36%.

The Research Repository @ WVU (West Virginia University)

Automatic prosodic analysis for computer aided pronunciation teaching

Author: Bagshaw Paul Christopher
Publication venue: The University of Edinburgh
Publication date: 01/01/1994

Correct pronunciation of spoken language requires the appropriate modulation of acoustic characteristics of speech to convey linguistic information at a suprasegmental level. Such prosodic modulation is a key aspect of spoken language and is an important component of foreign language learning, for purposes of both comprehension and intelligibility. Computer aided pronunciation teaching involves automatic analysis of the speech of a non-native talker in order to provide a diagnosis of the learner's performance in comparison with the speech of a native talker. This thesis describes research undertaken to automatically analyse the prosodic aspects of speech for computer aided pronunciation teaching. It is necessary to describe the suprasegmental composition of a learner's speech in order to characterise significant deviations from a native-like prosody, and to offer some kind of corrective diagnosis. Phonological theories of prosody aim to describe the suprasegmental composition of speech.

CiteSeerX

Edinburgh Research Archive

SPEECH RECOGNITION FOR CONNECTED WORD USING CEPSTRAL AND DYNAMIC TIME WARPING ALGORITHMS

Author: MUDA LINDASALWA
Publication date: 01/09/2014

Speech Recognition, or a Speech Recognizer (SR), has become an important tool for people with physical disabilities when handling Home Automation (HA) appliances. This technology is expected to improve the daily life of the elderly and the disabled so that they remain in control of their lives and continue to live independently, to learn, and to stay involved in social life. The goal of the research is to address the constraints of current Malay SR, which is still in its infancy, with limited research on Malay words, especially for HA applications. Since most previous work was confined to wired microphones, the limited use of wireless microphones makes this an important area of research. Research was carried out to develop an SR word model for five (5) Malay words and five (5) English words as commands to activate and deactivate home appliances.

UTPedia

Kysyvän funktion vaikutus spontaanin ja luetun suomen intonaatioon [The effect of interrogative function on the intonation of spontaneous and read Finnish]

Author: Anttila Hanna
Publication venue: Helsingin yliopisto
Publication date: 01/01/2008

Goals: This study aims to map the effect of interrogative function on the intonation of spontaneous and read Finnish. Earlier research shows that the most prominent feature in Finnish question intonation is an appeal to the listener. Question word questions typically start with a high peak which is followed by falling intonation. In yes/no questions, F0 remains at a high level until the word carrying sentence stress and then falls. Final rises are mainly found in intonation clichés such as "Ai mitä?" ("What?"). These earlier results are based on read speech and enacted dialogues. In this study, questions and statements found in spontaneous dialogues were compared. These utterances were also compared with read versions of the same utterances. Fundamental frequency values were compared using a mixed model. Contours were also grouped using auditory and visual inspection. Thus it was possible to compare frequencies of contour types according to utterance type and speech style. The position of questions in the F0 distribution of the whole material was also investigated in this study.
Method: The material consisted of four spontaneous dialogues and their read versions. The speakers were young adults from the Helsinki metropolitan area, four females and four males. The whole material was first divided into broad dialogue function categories arising from the material, and F0 curves were calculated for each category. After this, 277 questions and 244 statements were selected for closer inspection. Values reflecting F0 distribution and contour shape were measured from the F0 contours of these utterances. A mixed model was used to analyse the differences. Utterance type, question type, speech style and speaker gender were used as fixed effects. The frequencies of F0 contour types were compared using a chi-square test. Additional material in this study came from eight young female speakers in central Finland.
Results and conclusions: In the mixed model analysis, significant differences were found both between questions and statements and between spontaneous and read speech. Generally, utterance type affected the variables reflecting contour type while speech style affected the variables reflecting F0 distribution. The effect of question type was not clearly visible. In read speech the contours resembled earlier results more closely. Speakers had different strategies in differentiating between questions and statements. In the whole material, F0 was slightly higher in questions than in statements. The effect of dialectal background could be seen in the contour types. The results show that interrogative function affects intonation in both spontaneous and read Finnish.

Helsingin yliopiston digitaalinen arkisto

Optimization-based modeling of suprasegmental speech timing

Author: Windmann Andreas
Publication venue: Universität Bielefeld
Publication date: 01/01/2016

Windmann A. Optimization-based modeling of suprasegmental speech timing. Bielefeld: Universität Bielefeld; 2016

Publications at Bielefeld University

Methods in prosody

This book presents a collection of pioneering papers reflecting current methods in prosody research with a focus on Romance languages. The rapid expansion of the field of prosody research in the last decades has given rise to a proliferation of methods that has left little room for the critical assessment of these methods. The aim of this volume is to bridge this gap by embracing original contributions in which experts in the field assess, reflect on, and discuss different methods of data gathering and analysis. The book might thus be of interest to scholars and established researchers as well as to students and young academics who wish to explore the topic of prosody, an expanding and promising area of study.

OAPEN Library

Statistical morphological disambiguation with application to disambiguation of pronunciations in Turkish

Author: Külekci Oğuzhan M.
Publication date: 01/01/2006

The statistical morphological disambiguation of agglutinative languages suffers from data sparseness. In this study, we introduce the notion of distinguishing tag sets (DTS) to overcome the problem. The morphological analyses of words are modeled with DTS and the root major part-of-speech tags. The disambiguator based on the introduced representations performs the statistical morphological disambiguation of Turkish with a recall as high as 95.69 percent. In text-to-speech systems and in developing transcriptions for acoustic speech data, the problem occurs in disambiguating the pronunciation of a token in context, so that the correct pronunciation can be produced or the transcription uses the correct set of phonemes. We apply the morphological disambiguator to this problem of pronunciation disambiguation and achieve 99.54 percent recall with 97.95 percent precision. Most text-to-speech systems perform phrase-level accentuation based on the content word/function word distinction. This approach seems easy and adequate for some right-headed languages such as English but is not suitable for languages such as Turkish. We then use a heuristic approach to mark up the phrase boundaries based on dependency parsing, as a basis for phrase-level accentuation in Turkish TTS synthesizers.

Sabanci University Research Database

Methods in Contemporary Linguistics

Publication venue: Walter de Gruyter GmbH
Publication date: 21/11/2022

The present volume is a broad overview of methods and methodologies in linguistics, illustrated with examples from concrete research. It collects insights gained from a broad range of linguistic sub-disciplines, ranging from core disciplines to topics in cross-linguistic and language-internal diversity or to contributions towards language, space and society. Given its critical and innovative nature, the volume is a valuable source for students and researchers with a broad range of linguistic interests.

Directory of Open Access Books (DOAB)

Acta Cybernetica : Volume 19. Number 4.

Publication date: 01/01/2010

University of Szeged