378 research outputs found
Investigating sources of phonological rarity and instability: a study of the palatal lateral approximant in Brazilian Portuguese
The palatal lateral is a rare sound in the world's languages; a review of the literature reveals just 23 languages that currently possess the palatal lateral. Similarly, only 15 (or 3.33%) of the languages in the UCLA Phonological Segment Inventory Database (UPSID) (Maddieson and Precoda, 1991) can claim to currently possess the palatal lateral. While UPSID reports that an additional five languages (Basque, Guarani, Iate, Spanish, Turkish) possess the palatal lateral, these languages have either lost the palatal lateral or were included erroneously. Understanding the production and perception of rare speech sounds is important for understanding the distribution of speech sounds cross-linguistically, especially with regard to the establishment of a single phonetic alphabet (i.e. the International Phonetic Alphabet (IPA)) that can be used to describe and transcribe the languages of the world (Ladefoged and Everett, 1996). An investigation of rare speech sounds can also reveal important findings regarding the physical limitations of the vocal tract and human auditory system.
Given that the palatal lateral is a rare speech sound, a complete description of the articulation, acoustics, and perception of this sound does not currently exist. Accounts of the palatal lateral vary with regard to terminology; the palatal lateral has also been referred to as a so-called "phonemically" palatalized lateral (Zilyns'kyj, 1979), a laminal post-alveolar lateral (Ladefoged and Maddieson, 1996), and an alveolopalatal lateral (Recasens, 2013). Furthermore, the current literature does not distinguish between the palatal lateral and a palatalized lateral. The lack of agreement in the literature regarding terminology can present problems when attempting to assess whether a palatal lateral in one language is similar to a palatal lateral in another language. This dissertation provides a comprehensive description of the palatal lateral, as a means of initiating cross-linguistic comparisons of the palatal lateral as well as understanding the difference between a palatal and palatalized lateral.
A two-part study of the articulation and acoustics of the palatal lateral in Brazilian Portuguese (BP) was undertaken in this dissertation. Articulatory data were collected using electromagnetic articulography (EMA) from 10 female native speakers of BP from São Paulo state in Brazil, a method that permitted the simultaneous collection of acoustic information. Study 1 investigates the articulation of the palatal lateral through a battery of measures and compares the palatal lateral against the palatalized lateral approximant, alveolar lateral approximant, palatal approximant, palatal nasal, palatalized nasal, and alveolar nasal. Study 2 analyzes the acoustics of the palatal lateral in comparison to the palatalized lateral approximant, alveolar lateral approximant, and palatal approximant.
A third study is included in the appendix. This study incorporates a phone identification task to understand the role of acoustic saliency in the rareness of the palatal lateral, i.e. compared to other palatal sounds, is the palatal lateral more likely to be misidentified and if so, as which sounds? This task also investigates whether there is a perceived difference between the palatal and palatalized lateral that may not be captured by Studies 1 and 2, in addition to whether native speakers of BP are better at distinguishing the two sounds than non-native speakers (here, native speakers of American English). The palatal lateral was compared to the palatalized lateral, palatal approximant, alveolar lateral approximant, palatal nasal, palatalized nasal, alveolar nasal, voiced alveolar stop, and voiced palatalized alveolar stop. Twenty-five (11 male, 14 female) native speakers of BP and 20 (11 male, 9 female) native speakers of American English with no extensive exposure to BP participated in this study.
Results from Study 1 show that the palatal lateral is articulated laminally with a high front tongue body and concave anterior tongue shape that gradually becomes straighter as the phone progresses. Acoustic results in Study 2 indicate a median F1, F2, and F3 of 367 Hz, 1954 Hz, and 3035 Hz respectively for female speakers of BP. Statistical analysis reveals little or no evidence of a significant difference between the palatal lateral and palatalized lateral with regard to the shape of the tongue body, duration of the phone, or formant frequencies.
The perception study included in the appendix finds that while both native and non-native speakers of BP distinguish between the palatal lateral and palatalized lateral at chance level, native speakers of BP perform better than the non-native speakers at correctly identifying the palatal and palatalized nasal. This study also finds that of all the sounds included in this task, the palatal and palatalized lateral are the most likely to be misidentified as the palatal approximant for both participant groups, with the addition of -3 dB of speech-shaped noise greatly increasing the rate of confusion. However, the palatalized lateral is misidentified as a palatal approximant at a rate nearly double or more that of the palatal lateral.
This dissertation reveals that the palatal and palatalized lateral are essentially the same sound in BP. Furthermore, there is no evidence that either the palatal or the palatalized lateral is composed of two separate phones, i.e. an alveolar lateral approximant followed by a palatal approximant. Findings from the perception study support the proposal that yeísmo (i.e. the merger of the palatal lateral in favor of the palatal approximant (Colantoni, 2001; Hualde et al., 2005)) occurs because lateral sounds are less robust against added noise than nasal sounds. I argue here that this contributes directly to the rareness of the palatal lateral.
Estudo acústico das consoantes líquidas do Português Europeu: evidências temporais e espectrais
In this paper, we present data resulting from a temporal and spectral analysis of European Portuguese liquid consonants (/l, ʎ, ɾ, ʀ/) produced in onset position by two native speakers. The results obtained for the alveolar lateral indicate a velarized /l/ in onset position, as other studies for European Portuguese had already shown. It was also observed that /ɾ/ is the liquid with the highest F1 frequency values and the consonant with the shortest duration. The highest F2 frequency was obtained for the palatal lateral. In this study, the F3 and F4 frequency values suggest no major differences between liquid consonants. There was some inter-speaker variability.
L2 speech learning of European Portuguese /l/ and /ɾ/ by L1-Mandarin learners: experimental evidence and theoretical modelling
It has been long recognized that the poor distinction between /l/ and /ɾ/ is one
of the most perceptible characteristics in Chinese-accented Portuguese. Recent
empirical research revealed that this notorious L2 speech learning difficulty
goes beyond the confusion between two L2 categories, as L1-Mandarin learners’
acquisition of Portuguese /l/ and /ɾ/ seems to be subject to the interaction
among different prosodic positions, speech modalities and representational
levels. This thesis aims to deepen our current understanding of this L2 speech
learning process, by exploring what constrains the development of L2
phonological categories across syllable positions and how different modalities
interact during this process. To achieve this goal, both experimental tasks and
theoretical modelling were employed.
The first study of this thesis explores the role of cross-linguistic influence
and orthography on L2 category formation. In order to elicit cross-linguistic
influence directly, a delayed-imitation task was performed with L1-Mandarin
naïve listeners. This task examined how the Mandarin phonology parses the
Portuguese input ([l], [ɾ]) in intervocalic onset and in word-internal coda
position. Moreover, whether orthography plays a role during the construction
of L2 phonological representation was tested by manipulating the input types
that were given in the experiment (auditory input alone vs. auditory + written
input). Our study shows that naïve Mandarin listeners’ responses were consistent
with those of L1-Mandarin learners, suggesting that cross-linguistic influence is
responsible for the observed L2 prosodic effects. Moreover, the Mandarin [ɻ] (a
repair strategy for /ɾ/) occurred almost exclusively when the written form was
given, providing evidence for the cross-linguistic interaction between
phonological categorization and orthography during the construction of L2
categories.
In the second study, we first investigate the interaction between speech
perception and production in L2 speech learning, by examining whether the L2
deviant productions stem from misperception and whether the order of
acquisition in L2 speech perception mirrors that in production. Secondly, we
test whether L2 phonological categories remain malleable at a mid-late stage of
L2 speech learning. Two perceptual experiments were performed to test L1-Mandarin learners on their discrimination ability between the target
Portuguese form and the deviant form employed in L2 production. Expanding
on prior research, in this study, the perceptual motivation for L2 speech
difficulties was assessed in different syllable constituents (onset and coda) and
at both segmental and suprasegmental levels (structural modification). The
results demonstrate that some deviant forms observed in L2 production indeed
have a perceptual motivation ([w] for the velarised lateral; [l] and [ɾə] for the
tap), while some others cannot be attributed to misperception (deletion of
syllable-final tap). Furthermore, learners confused the intervocalic /l/ and /ɾ/
bidirectionally in perception, while in production they never misproduced the
lateral (/ɾ/ → [l], */l/ → [ɾ]), revealing a mismatch between two speech
modalities. By contrast, the order of acquisition (/ɾ/coda > /ɾ/onset) was shown to
be consistent in L2 perception and production. The correspondence and
discrepancy between the two speech modalities signal a complex relationship
between L2 speech perception and production. To assess the plasticity of L2
categories /l/ and /ɾ/, two groups of L1-Mandarin learners who differ
substantially in terms of L2 experience were recruited in the perceptual tasks.
Our study shows that both groups behaved similarly in terms of the
discrimination performance. No evidence for a role of L2 experience was found.
The implication of this null result for L2 phonological development is discussed.
The third study of the thesis aims to contribute to bridging the gap between
the L2 experimental evidence and formal theories. Adopting the Bidirectional
Phonology and Phonetics Model, we formalise some of the experimental
findings that cannot be elucidated by current L2 speech theories, namely, the
between and within-subject variation in L2 phonological categorization; the
interaction between phonological categorization and orthography during L2
category construction; and the asymmetry between L2 perception and
production.
Overall, this thesis sheds light on the complex nature of L2 phonological
acquisition and provides a formal account of how different modalities interact
in shaping L2 speech learning. Moreover, it puts forward testable predictions
for future research and suggestions for improving foreign language
teaching/training methodologies.
Segmentation and 3D reconstruction of the vocal tract from MR images - a comparative study
Speech production is an important human function involving a set of organs with specific morphological and dynamic aspects. Inter-speaker variability, coarticulation, and nasality are some of the aspects of interest for making 3D modeling of the vocal tract more realistic. For this, an understanding of the mechanism of speech production is crucial, as the current image data are not sufficient to reproduce the speaker's anatomy and articulation faithfully. Hence, the goal of 3D modeling is to generate the complete geometrical and dynamical information concerning the vocal tract from medical images, such as those from magnetic resonance imaging (MRI). This work aims to describe and compare two different segmentation techniques to attain the 3D shape of the vocal tract during speech production from MR images: the first based on manual tracing of the vocal tract contours and the second based on image thresholding. The segmented cross-sectional areas were measured, and 3D models were built from the sagittal data by blending the contours obtained from the two segmentation techniques. The mean error of the computed measures was low for both segmentation techniques, which lets us conclude that the techniques are useful for evaluating the vocal tract geometry accurately. Additionally, the 3D models built using both segmentation techniques were very similar and faithful. However, when the coronal data were used, various difficulties occurred.
Multilingualism and third language acquisition : Learning and teaching trends
The purpose of this book is to present recent studies in the field of multilingualism and L3,
bringing together contributions from an international group of specialists from Austria,
Canada, Germany, Portugal, Spain, Switzerland, Turkey, and the United States. The articles
have three main focuses: language acquisition, language learning, and teaching.
A collection of theoretical and empirical articles from scholars of multilingualism and
language acquisition makes the book a significant resource as the papers present a wide
perspective from main theories to current issues, reflecting new trends in the field.
The authors focus on the heterogeneity and complexity that characterize third language acquisition, multilingual learning and teaching. As the issues addressed in this
book intersect, it represents an asset and therefore the texts will be of great relevance
for the scientific community.
Part I presents different topics of L3 acquisition, such as syntax, phonology, working
memory and selective attention, and lexicon. Part II comprises texts that show how the
research on language acquisition informs pedagogical issues, for instance the role of
the knowledge of previous languages in the teaching of L3, the attitudes of multilingual
teachers to plurilingual approaches, and the benefits of crosslinguistic pedagogy versus
classroom monolingual bias. Finally, Part III consists of texts on individual learning strategies, such as motivation and attitudes, crosslinguistic awareness, and students’
perceptions about teachers’ “plurilingual nonnativism”.
All these chapters include several different languages in contact in an acquisition/learning context: Basque, English, French, German, Italian, Ladin, Portuguese, Russian, Spanish, and Turkish.
Analyzing speech in both time and space : generalized additive mixed models can uncover systematic patterns of variation in vocal tract shape in real-time MRI
We present a method of using generalized additive mixed models (GAMMs) to analyze midsagittal vocal tract data obtained from real-time magnetic resonance imaging (rt-MRI) video of speech production. Applied to rt-MRI data, GAMMs allow for observation of factor effects on vocal tract shape throughout two key dimensions: time (vocal tract change over the temporal course of a speech segment) and space (location of change within the vocal tract). Examples of this method are provided for rt-MRI data collected at a temporal resolution of 20 ms and a spatial resolution of 1.41 mm, for 36 native speakers of German. The rt-MRI data were quantified as 28-point semi-polar-grid aperture functions. Three test cases are provided as a way of observing vocal tract differences between: (1) /aː/ and /iː/, (2) /aː/ and /aɪ/, and (3) accentuated and unstressed /aː/. The results for each GAMM are independently validated using functional linear mixed models (FLMMs) constructed from data obtained at 20% and 80% of the vowel interval. In each case, the two methods yield similar results. In light of this similarity, we propose that GAMMs are a robust, powerful, and interpretable method of simultaneously analyzing both temporal and spatial effects in rt-MRI video of speech.
Ressonância magnética em estudos de produção de fala
Doctorate in Health Sciences and Technologies
The study of the mechanisms underlying speech production is a complex and
demanding task that requires data gathered using different techniques and including
image acquisition. Among the different imaging modalities used, Magnetic Resonance
Imaging (MRI) has assumed an important role in recent years, positioning itself as one of
the most promising techniques and providing a wealth of information concerning speech
production. An important contribution of this research is the optimization and
implementation of MRI protocols and the proposal of adequate image processing
techniques that can meet the requirements imposed by speech production and the
specificities of different sounds. Additionally, motivated by the scarcity of data for
European Portuguese (EP), image acquisitions were performed to gather articulatory
data to complement and clarify previous information relating to the production of EP
sounds (namely, lateral consonants and nasal vowels).
For lateral consonants, MR images encompassing the entire vocal tract (VT), both in the
midsagittal plane and in 3D, were acquired during sustained productions, using a
spoiled Gradient Echo (GE) sequence (3D VIBE). The corpus, obtained from seven EP
speakers, covered the lateral consonants in different vowel contexts and syllable
positions. For nasal vowels, a corpus covering different syllabic positions and contexts
was acquired from three speakers using real-time MRI (RT-MRI) with a
spoiled GE (TurboFLASH) sequence, in the sagittal and coronal planes, with
a temporal resolution of 72 ms (14 frames/s). A synchronized audio signal was acquired
inside the MR scanner using a fiberoptic microphone. Data processing and analysis were
carried out using several semi-automatic algorithms.
Analysis of the acquired data allowed a detailed articulatory description of the lateral
consonants, anchored in both qualitative data (e.g., 3D visualization, contour comparison) and
quantitative data such as vocal tract area functions, extension and area of the lateral
channels, and evaluation of positional and contextual effects. Specifically, for the alveolar
lateral /l/, the gathered data point to a velarized variety regardless of its
syllabic position. For /L/, for which very little information was available, the evidence
shows that the articulation is far more fronted than traditionally described and more
extensive than that observed for the alveolar lateral.
The temporal resolution of 72 ms, achieved with RT-MRI acquisitions, proved
suitable for studying the dynamic characteristics of nasal vowels, namely velar
and oral gestures, temporal coordination between gestures, and durational aspects,
complementing existing data for EP obtained using other instrumental techniques.
In addition, new relevant data were obtained, contributing to a deeper
knowledge of nasality (e.g., nasal/oral areas over time, nasal/oral proportion).
The work presented demonstrates the versatility and potential of MRI when applied to
speech production studies and provides important contributions to a better
understanding of the articulation of EP, to the development of models supporting the
improvement of articulatory-based speech synthesis, and to future applications in clinical
areas (e.g., speech disorders).
Phonetic events from the labeling the european Portuguese database for speech synthesis, FEUP/IPB-DB
In this paper, a new labeled speech signal database (FEUP/IPB-DB) in Standard European Portuguese (hereafter SEP) is presented. The objective of this work is, on the one hand, to provide phonetic material for the construction of Text-to-Speech (TTS) systems, either from the start or to improve the quality of existing ones, and, on the other hand, to place at the service of the SEP scientific community a phonetically and prosodically valuable speech corpus, essential for Speech Synthesis or Phonetics research. Our purpose is to make it available to the scientific community, since there isn’t any other DB of its kind for EP. The main features of the database are described, as well as some basic statistical aspects. Some methodological problems and some phenomena in experimental phonetics observed during the speech signal labeling are also discussed. The approach in our work is to produce a resource that can be further improved in subsequent steps with minimal re-work. Phonetic, linguistic, and technical consistency is guaranteed through the involvement of a multidisciplinary team.