
5 research outputs found

Saudi Accented Arabic Voice Bank

Author: Alenazi Ammar
Alghamdi Mansour
Alhargan Fayez
Alkanhal Mohammed
Alkhairy Ashraf
Eldesouki Munir
Publication venue: King Saud University. Production and hosting by Elsevier B.V.
Publication date: 31/12/2008

Abstract: The aim of this paper is to present an Arabic speech database that represents native Arabic speakers from all the cities of Saudi Arabia. The database is called the Saudi Accented Arabic Voice Bank (SAAVB). Preparing the prompt sheets, selecting the right speakers and transcribing their speech were some of the challenges that faced the project team. The procedures used to meet these challenges are highlighted. SAAVB consists of 1033 speakers who speak Modern Standard Arabic with a Saudi accent. The SAAVB content is analyzed and the results are illustrated. The content was verified internally and externally by IBM Cairo and can be used to train speech engines such as automatic speech recognition and speaker verification systems.

Elsevier - Publisher Connector

A computational memory and processing model for prosody

Author: Cahn Janet E. (Janet Elizabeth)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1999

Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts & Sciences, 1999. Includes bibliographical references (p. 209-226). This thesis links processing in working memory to prosody in speech, and links different working memory capacities to different prosodic styles. It provides a causal account of prosodic differences and an architecture for reproducing them in synthesized speech. The implemented system mediates text-based information through a model of attention and working memory. The main simulation parameter of the memory model quantifies recall. Changing its value changes what counts as given and new information in a text, and therefore determines the intonation with which the text is uttered. Other aspects of search and storage in the memory model are mapped to the remainder of the continuous and categorical features of pitch and timing, producing prosody in three different styles: for small recall values, the exaggerated and sing-song melodies of children's speech; for mid-range values, an adult expressive style; for the largest values, the prosody of a speaker who is familiar with the text, and at times sounds bored or irritated. In addition, because the storage procedure is stochastic, the prosody varies from simulation to simulation, even for identical control parameters. As with human speech, no two renditions are alike. Informal feedback indicates that the stylistic differences are recognizable and that the prosody is improved over current offerings. A comparison with natural data shows clear and predictable trends, although not at statistical significance. However, a comparison within the natural data also did not produce results at significance. One practical contribution of this work is a text mark-up schema consisting of relational annotations to grammatical structures. Another is the product: varied and plausible prosody in synthesized speech.
The main theoretical contribution is to show that resource-bound cognitive activity has prosodic correlates, thus providing a rationale for the individual and stylistic differences in melody and rhythm that are ubiquitous in human speech. By Janet Elizabeth Cahn. Ph.D.

CiteSeerX

DSpace@MIT

Análisis acústico de las vibrantes del español en habla espontánea

Author: Ortiz de Pinedo Sánchez Núria
Publication venue: Edicions de la Universitat de Barcelona
Publication date: 01/01/2017

This PhD thesis describes the behavior of Spanish rhotics in spontaneous speech. We start from the results obtained in the pilot study "The teaching of pronunciation through spontaneous speech: acoustic analysis of rhotics", carried out as the final project of the master's program Formación de Profesores de Español como Lengua Extranjera at the University of Barcelona. The corpus used this time is much broader and more complete because it covers both Peninsular dialectal varieties of Spanish: the northern variety and the southern variety. Approximately 200 rhotic sounds have been extracted from a total of 10 autonomous communities (11 corpora): Andalucía (western and eastern), Asturias, Canarias, Castilla la Mancha, Castilla y León, Extremadura, Madrid, Murcia, Navarra and País Vasco, resulting in the analysis of 2238 rhotics in uncontrolled speech. The acoustic analysis was carried out with the Praat software.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Tesis Doctorals en Xarxa

Diposit Digital de la Universitat de Barcelona

Esprit '90. Proceedings of the annual Esprit conference. Brussels, 12-15 November 1990. EUR 13148 EN


The Collection and Preliminary Analysis of a Spontaneous Speech Database

Author: David Goodine
Hong Leung
James Glass
Joseph Polifroni
Michael Phillips
Michal Soclof
Nancy Daly
Stephanie Seneff
Victor Zue
Publication date: 01/01/1989

As part of our effort to develop a spoken language system for interactive problem solving, we recently collected a sizeable amount of speech data. This database is composed of spontaneous sentences collected during a simulated human/machine dialogue. Since a computer log of the spoken dialogue was maintained, we were able to ask the subjects to provide read versions of the sentences as well. This paper documents the data collection process and provides some preliminary analyses of the collected data.

CiteSeerX

Crossref