Investigating sources of phonological rarity and instability: a study of the palatal lateral approximant in Brazilian Portuguese

Author: Wong Nicole Wannie
Publication venue
Publication date: 01/05/2017
Field of study

The palatal lateral is a rare sound in the world's languages; a review of the literature reveals just 23 languages that currently possess the palatal lateral. Similarly, only 15 (or 3.33%) of the languages in the UCLA Phonological Segment Inventory Database (UPSID) (Maddieson and Precoda, 1991) can claim to currently possess the palatal lateral. While UPSID reports that an additional five languages (Basque, Guarani, Iate, Spanish, Turkish) possess the palatal lateral, these languages have either lost the palatal lateral or were included erroneously. Understanding the production and perception of rare speech sounds is important for understanding the distribution of speech sounds cross-linguistically, especially with regards to the establishment of a single phonetic alphabet (i.e. the International Phonetic Alphabet (IPA)) that can be used to describe and transcribe the languages of the world (Ladefoged and Everett, 1996). An investigation of rare speech sounds can also reveal important findings regarding the physical limitations of the vocal tract and human auditory system. Given that the palatal lateral is a rare speech sound, a complete description of the articulation, acoustics, and perception of this sound does not currently exist. Accounts of the palatal lateral vary with regards to terminology; the palatal lateral has also been referred to as a so-called "phonemically" palatalized lateral (Zilyns'kyj, 1979), a laminal post-alveolar lateral (Ladefoged and Maddieson, 1996), and an alveolopalatal lateral (Recasens, 2013). Furthermore, current literature also does not distinguish between the palatal lateral and a palatalized lateral. The lack of agreement in literature regarding terminology can present problems when attempting to assess whether a palatal lateral in one language is similar to a palatal lateral in another language. 
This dissertation provides a comprehensive description of the palatal lateral, as a means of initiating cross-linguistic comparisons of the palatal lateral as well as understanding the difference between a palatal and a palatalized lateral. A two-part study of the articulation and acoustics of the palatal lateral in Brazilian Portuguese (BP) was undertaken. Articulatory data were collected using electromagnetic articulography (EMA) from 10 female native speakers of BP from São Paulo state in Brazil, which permitted the simultaneous collection of acoustic information. Study 1 investigates the articulation of the palatal lateral through a battery of measures and compares the palatal lateral against the palatalized lateral approximant, alveolar lateral approximant, palatal approximant, palatal nasal, palatalized nasal, and alveolar nasal. Study 2 analyzes the acoustics of the palatal lateral in comparison to the palatalized lateral approximant, alveolar lateral approximant, and palatal approximant. A third study, included in the appendix, uses a phone identification task to understand the role of acoustic saliency in the rareness of the palatal lateral: compared to other palatal sounds, is the palatal lateral more likely to be misidentified and, if so, as which sounds? This task also investigates whether there is a perceived difference between the palatal and palatalized lateral that may not be captured by Studies 1 and 2, and whether native speakers of BP are better at distinguishing the two sounds than non-native speakers (here, native speakers of American English). The palatal lateral was compared to the palatalized lateral, palatal approximant, alveolar lateral approximant, palatal nasal, palatalized nasal, alveolar nasal, voiced alveolar stop, and voiced palatalized alveolar stop. 
Twenty-five (11 male, 14 female) native speakers of BP and 20 (11 male, 9 female) native speakers of American English with no extensive exposure to BP participated in this study. Results from Study 1 show that the palatal lateral is articulated laminally with a high front tongue body and concave anterior tongue shape that gradually becomes straighter as the phone progresses. Acoustic results in Study 2 indicate a median F1, F2, and F3 of 367 Hz, 1954 Hz, and 3035 Hz respectively for female speakers of BP. Statistical analysis reveals little or no evidence of a significant difference between the palatal lateral and palatalized lateral with regard to the shape of the tongue body, duration of the phone, or formant frequencies. The perception study included in the appendix finds that while both native and non-native speakers of BP distinguish between the palatal lateral and palatalized lateral at chance level, native speakers of BP perform better than the non-native speakers at correctly identifying the palatal and palatalized nasal. This study also finds that of all the sounds included in this task, the palatal and palatalized lateral are the most likely to be misidentified as the palatal approximant for both participant groups, with the addition of -3 dB of speech-shaped noise greatly increasing the rate of confusion. However, the palatalized lateral is misidentified as a palatal approximant at a rate nearly double, or more than double, that of the palatal lateral. This dissertation reveals that the palatal and palatalized lateral are essentially the same sound in BP. Furthermore, there is no evidence that the palatal or palatalized lateral is composed of two separate phones, i.e. an alveolar lateral approximant followed by a palatal approximant. Findings from the perception study support the proposal that yeísmo (i.e. the merger of the palatal lateral in favor of the palatal approximant (Colantoni, 2001; Hualde et al., 2005)) occurs because lateral sounds are less robust against added noise than nasal sounds. I argue here that this contributes directly to the rareness of the palatal lateral.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Estudo acústico das consoantes líquidas do Português Europeu: evidências temporais e espectrais

Author: Rodrigues Susana
Publication venue: 'Associacao Portuguesa de Linguistica'
Publication date: 01/01/2013
Field of study

In this paper, we present data resulting from a temporal and spectral analysis of European Portuguese liquid consonants (/l, ʎ, ɾ, ʀ/) produced in onset position by two native speakers. The results obtained for the alveolar lateral indicate a velarized /l/ in onset position, as other studies of European Portuguese had already shown. It was also observed that /ɾ/ is the liquid with the highest F1 values and the consonant with the shortest duration. The highest F2 frequency was obtained for the palatal lateral. In this study, the F3 and F4 values suggest no major differences between the liquid consonants. There was some inter-speaker variability.

Universidade de Lisboa: Repositório.UL

L2 speech learning of European Portuguese /l/ and /ɾ/ by L1-Mandarin learners: experimental evidence and theoretical modelling

Author: Chao Zhou
Publication venue
Publication date: 02/03/2021
Field of study

It has been long recognized that the poor distinction between /l/ and /ɾ/ is one of the most perceptible characteristics in Chinese-accented Portuguese. Recent empirical research revealed that this notorious L2 speech learning difficulty goes beyond the confusion between two L2 categories, as L1-Mandarin learners’ acquisition of Portuguese /l/ and /ɾ/ seems to be subject to the interaction among different prosodic positions, speech modalities and representational levels. This thesis aims to deepen our current understanding of this L2 speech learning process, by exploring what constrains the development of L2 phonological categories across syllable positions and how different modalities interact during this process. To achieve this goal, both experimental tasks and theoretical modelling were employed. The first study of this thesis explores the role of cross-linguistic influence and orthography on L2 category formation. In order to elicit cross-linguistic influence directly, a delayed-imitation task was performed with L1-Mandarin naïve listeners. This task examined how the Mandarin phonology parses the Portuguese input ([l], [ɾ]) in intervocalic onset and in word-internal coda position. Moreover, whether orthography plays a role during the construction of L2 phonological representation was tested by manipulating the input types that were given in the experiment (auditory input alone vs. auditory + written input). Our study shows that naïve Mandarin listeners’ responses corroborated with that of L1-Mandarin learners, suggesting that cross-linguistic influence is responsible for the observed L2 prosodic effects. Moreover, the Mandarin [ɻ] (a repair strategy for /ɾ/) occurred almost exclusively when the written form was given, providing evidence for the cross-linguistic interaction between phonological categorization and orthography during the construction of L2 categories. 
In the second study, we first investigate the interaction between speech perception and production in L2 speech learning, by examining whether the L2 deviant productions stem from misperception and whether the order of acquisition in L2 speech perception mirrors that in production. Secondly, we test whether L2 phonological categories remain malleable at a mid-late stage of L2 speech learning. Two perceptual experiments were performed to test L1-Mandarin learners on their discrimination ability between the target Portuguese form and the deviant form employed in L2 production. Expanding on prior research, in this study, the perceptual motivation for L2 speech difficulties was assessed in different syllable constituents (onset and coda) and at both segmental and suprasegmental levels (structural modification). The results demonstrate that some deviant forms observed in L2 production indeed have a perceptual motivation ([w] for the velarised lateral; [l] and [ɾə] for the tap), while some others cannot be attributed to misperception (deletion of syllable-final tap). Furthermore, learners confused the intervocalic /l/ and /ɾ/ bidirectionally in perception, while in production they never misproduced the lateral (/ɾ/ → [l], */l/ → [ɾ]), revealing a mismatch between two speech modalities. By contrast, the order of acquisition (/ɾ/coda > /ɾ/onset) was shown to be consistent in L2 perception and production. The correspondence and discrepancy between the two speech modalities signal a complex relationship between L2 speech perception and production. To assess the plasticity of L2 categories /l/ and /ɾ/, two groups of L1-Mandarin learners who differ substantially in terms of L2 experience were recruited in the perceptual tasks. Our study shows that both groups behaved similarly in terms of the discrimination performance. No evidence for a role of L2 experience was found. The implication of this null result on L2 phonological development is discussed. 
The third study of the thesis aims to contribute to bridging the gap between the L2 experimental evidence and formal theories. Adopting the Bidirectional Phonology and Phonetics Model, we formalise some of the experimental findings that cannot be elucidated by current L2 speech theories, namely, the between and within-subject variation in L2 phonological categorization; the interaction between phonological categorization and orthography during L2 category construction; and the asymmetry between L2 perception and production. Overall, this thesis sheds light on the complex nature of L2 phonological acquisition and provides a formal account of how different modalities interact in shaping L2 speech learning. Moreover, it puts forward testable predictions for future research and suggestions for improving foreign language teaching/training methodologies.

Universidade de Lisboa: Repositório.UL

European Portuguese MRI based speech production studies

Author: Alda Pinto
António Teixeira
Augusto Silva
Inês Carbone
Paula Martins
Publication venue: 'Elsevier BV'
Publication date
Field of study

Crossref

Learning and teaching trends

Author
Publication venue
Publication date: 01/01/2021
Field of study

The purpose of this book is to present recent studies in the field of multilingualism and L3, bringing together contributions from an international group of specialists from Austria, Canada, Germany, Portugal, Spain, Switzerland, Turkey, and the United States. The articles have three main focuses: language acquisition, language learning, and language teaching. This collection of theoretical and empirical articles from scholars of multilingualism and language acquisition makes the book a significant resource, as the papers present a wide perspective from main theories to current issues, reflecting new trends in the field. The authors focus on the heterogeneity and complexity that characterize third language acquisition, multilingual learning and teaching. As the issues addressed in this book intersect, it represents an asset and therefore the texts will be of great relevance for the scientific community. Part I presents different topics of L3 acquisition, such as syntax, phonology, working memory and selective attention, and lexicon. Part II comprises texts that show how the research on language acquisition informs pedagogical issues, for instance the role of the knowledge of previous languages in the teaching of L3, the attitudes of multilingual teachers to plurilingual approaches, and the benefits of crosslinguistic pedagogy versus classroom monolingual bias. Part III consists of texts on individual learning strategies, such as motivation and attitudes, crosslinguistic awareness, and students’ perceptions about teachers’ “plurilingual nonnativism”.

Institutional Repository of the Freie Universität Berlin

Segmentation and 3D reconstruction of the vocal tract from MR images - a comparative study

Author: D. R. Freitas
I. M. Ramos
João Manuel R. S. Tavares
S. R. Ventura
Publication venue
Publication date: 01/01/2010
Field of study

Speech production is an important human function involving a set of organs with specific morphological and dynamic aspects. Inter-speaker variability, coarticulation, and nasality are some of the aspects of interest for improving realistic 3D modeling of the vocal tract. For this, the understanding of the mechanism of speech production is crucial, as the current image data is not sufficient to reproduce truthfully the speaker's anatomy and articulation. Hence, the goal of 3D modeling is to generate the complete geometrical and dynamical information concerning the vocal tract from medical images, such as from magnetic resonance imaging (MRI). This work aims to describe and compare two different segmentation techniques to attain the 3D shape of the vocal tract during speech production from MR images: the first based on manual tracing of the vocal tract contours and the second based on image thresholding. The segmented cross-sectional areas were measured, and 3D models were built from the sagittal data by blending the contours obtained from the two segmentation techniques. The mean error of the computed measures was low for both segmentation techniques, which lets us conclude that the techniques are useful to evaluate the vocal tract geometry accurately. Additionally, the 3D models built using both segmentation techniques were also very similar and truthful. However, when the coronal data was used, various difficulties occurred.

Repositório Aberto da Universidade do Porto

Multilingualism and third language acquisition : Learning and teaching trends

Author: Alexandre Nélia
Pinto Jorge
Publication venue: Language Science Press
Publication date: 01/01/2021
Field of study

ZENODO

Universidade de Lisboa: Repositório.UL

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Analyzing speech in both time and space : generalized additive mixed models can uncover systematic patterns of variation in vocal tract shape in real-time MRI

Author: Carignan Christopher
Frahm Jens
Harrington Jonathan
Hoole Phil
Joseph Arun
Kunay Esther
Pouplier Marianne
Voit Dirk
Publication venue: 'Ubiquity Press, Ltd.'
Publication date: 01/01/2020
Field of study

We present a method of using generalized additive mixed models (GAMMs) to analyze midsagittal vocal tract data obtained from real-time magnetic resonance imaging (rt-MRI) video of speech production. Applied to rt-MRI data, GAMMs allow for observation of factor effects on vocal tract shape throughout two key dimensions: time (vocal tract change over the temporal course of a speech segment) and space (location of change within the vocal tract). Examples of this method are provided for rt-MRI data collected at a temporal resolution of 20 ms and a spatial resolution of 1.41 mm, for 36 native speakers of German. The rt-MRI data were quantified as 28-point semi-polar-grid aperture functions. Three test cases are provided as a way of observing vocal tract differences between: (1) /aː/ and /iː/, (2) /aː/ and /aɪ/, and (3) accentuated and unstressed /aː/. The results for each GAMM are independently validated using functional linear mixed models (FLMMs) constructed from data obtained at 20% and 80% of the vowel interval. In each case, the two methods yield similar results. In light of the method similarities, we propose that GAMMs are a robust, powerful, and interpretable method of simultaneously analyzing both temporal and spatial effects in rt-MRI video of speech

Directory of Open Access Journals

UCL Discovery

Western Sydney ResearchDirect

MPG.PuRe

Ressonância magnética em estudos de produção de fala

Author: Martins Paula Maria Vaz
Publication venue: Universidade de Aveiro
Publication date: 25/05/2016
Field of study

Doctorate in Health Sciences and Technologies. The study of the mechanisms underlying speech production is a complex and demanding task that requires data gathered using different techniques, including image acquisition. 
Among the different imaging modalities used, Magnetic Resonance Imaging (MRI) has assumed an important role in recent years, positioning itself as one of the most promising techniques and providing a wealth of information concerning speech production. An important contribution of this research is the optimization and implementation of MRI protocols and the proposal of adequate image processing techniques that can meet the requirements imposed by speech production and the specificities of different sounds. Additionally, motivated by the scarcity of data for European Portuguese (EP), image acquisitions were performed to gather articulatory data to complement and clarify previous information relating to the production of EP sounds (namely, lateral consonants and nasal vowels). For lateral consonants, MR images encompassing the entire vocal tract (VT), both in the midsagittal plane and in 3D, were acquired during sustained productions, using a spoiled Gradient Echo (GE) sequence (3D VIBE). The corpus, obtained for seven EP speakers, considered the lateral consonants in different vowel contexts and syllable positions. For nasal vowels, a corpus considering different syllabic positions and contexts was acquired for three speakers, using real-time MRI (RT-MRI) images by means of a spoiled GE (TurboFLASH) sequence, obtained in the sagittal and coronal planes, with a temporal resolution of 72 ms (14 frames/s). A synchronized audio signal was acquired inside the MR scanner using a fiberoptic microphone. Data processing and analysis were carried out using several semi-automatic algorithms. Analysis of the acquired data allowed a detailed articulatory description of the lateral consonants anchored in both qualitative data (e.g., 3D visualization, contour comparison) and quantitative data such as vocal tract area functions, extension and area of lateral channels, and evaluation of positional and contextual effects. 
Specifically, for the alveolar lateral /l/, as regards velarization, the gathered data point to a velarized variety regardless of its syllabic position. For the /L/, for which the available information is very scarce, evidence shows that the articulation is far more fronted than traditionally described and more extensive than that observed for the alveolar lateral. The temporal resolution of 72 ms, achieved with RT-MRI acquisitions, proved to be suitable for studying the dynamic characteristics of nasal vowels, namely velar and oral gestures, temporal coordination between gestures and durational aspects, complementing existing data for EP obtained using other instrumental techniques. In addition, new relevant data were obtained, contributing to a deeper understanding of nasality (e.g., nasal/oral areas over time, nasal/oral proportion). The work presented demonstrates the versatility and potential of MRI when applied to speech production studies and provides important contributions to a better understanding of the articulation of EP, to the development of models supporting the improvement of articulatory-based speech synthesis, and to future applications in clinical areas (e.g., speech disorders).

Repositório Institucional da Universidade de Aveiro

Phonetic events from the labeling the european Portuguese database for speech synthesis, FEUP/IPB-DB

Author: Barros Maria João
Braga Daniela
Freitas Diamantino Silva
Latsch Vagner
Teixeira João Paulo
Publication venue: 'The International Fiscal Association of Korea'
Publication date: 01/01/2001
Field of study

In this paper, a new labeled speech signal database (FEUP/IPB-DB) in Standard European Portuguese (hereafter SEP) is presented. The objective of this work is, on the one hand, to provide phonetic material for the construction of Text-to-Speech (TTS) systems, either from scratch or to improve the quality of existing ones, and, on the other hand, to place at the service of the SEP scientific community a phonetically and prosodically valuable speech corpus, essential for Speech Synthesis or Phonetics research. Our purpose is to make it available to the scientific community, since there is no other database of its kind for EP. The main features of the database are described, as well as some basic statistical aspects. Some methodological problems and some phenomena in experimental phonetics observed during the speech signal labeling are also discussed. The approach in our work is to produce a resource that can be further improved in subsequent steps with minimal rework. Phonetic, linguistic and technical consistency is guaranteed through the involvement of a multidisciplinary team.

CiteSeerX

Biblioteca Digital do IPB