572 research outputs found

    Explicit and implicit aptitude effects on second language speech learning: scrutinizing segmental and suprasegmental sensitivity and performance via behavioural and neurophysiological measures

    The current study examines the role of cognitive and perceptual individual differences (i.e., aptitude) in second language (L2) pronunciation learning, when L2 learners’ varied experience background is controlled for. A total of 48 Chinese learners of English in the UK were assessed for their sensitivity to segmental and suprasegmental aspects of speech on explicit and implicit modes via behavioural (language/music aptitude tests) and neurophysiological (electroencephalography) measures. Subsequently, the participants’ aptitude profiles were compared to the segmental and suprasegmental dimensions of their L2 pronunciation proficiency analyzed through rater judgements and acoustic measurements. According to the results, the participants’ segmental attainment was associated not only with explicit aptitude (phonemic coding), but also with implicit aptitude (enhanced neural encoding of spectral peaks). Whereas the participants’ suprasegmental attainment was linked to explicit aptitude (rhythmic imagery) to some degree, it was primarily influenced by the quality and quantity of their most recent L2 learning experience

    L2 speech learning of European Portuguese /l/ and /ɾ/ by L1-Mandarin learners: experimental evidence and theoretical modelling

    It has long been recognized that the poor distinction between /l/ and /ɾ/ is one of the most perceptible characteristics of Chinese-accented Portuguese. Recent empirical research has revealed that this notorious L2 speech learning difficulty goes beyond the confusion between two L2 categories, as L1-Mandarin learners’ acquisition of Portuguese /l/ and /ɾ/ seems to be subject to the interaction among different prosodic positions, speech modalities and representational levels. This thesis aims to deepen our current understanding of this L2 speech learning process, by exploring what constrains the development of L2 phonological categories across syllable positions and how different modalities interact during this process. To achieve this goal, both experimental tasks and theoretical modelling were employed. The first study of this thesis explores the role of cross-linguistic influence and orthography in L2 category formation. In order to elicit cross-linguistic influence directly, a delayed-imitation task was performed with L1-Mandarin naïve listeners. This task examined how Mandarin phonology parses the Portuguese input ([l], [ɾ]) in intervocalic onset and in word-internal coda position. Moreover, whether orthography plays a role during the construction of L2 phonological representations was tested by manipulating the input types that were given in the experiment (auditory input alone vs. auditory + written input). Our study shows that naïve Mandarin listeners’ responses were consistent with those of L1-Mandarin learners, suggesting that cross-linguistic influence is responsible for the observed L2 prosodic effects. Moreover, the Mandarin [ɻ] (a repair strategy for /ɾ/) occurred almost exclusively when the written form was given, providing evidence for the cross-linguistic interaction between phonological categorization and orthography during the construction of L2 categories.
In the second study, we first investigate the interaction between speech perception and production in L2 speech learning, by examining whether the L2 deviant productions stem from misperception and whether the order of acquisition in L2 speech perception mirrors that in production. Secondly, we test whether L2 phonological categories remain malleable at a mid-late stage of L2 speech learning. Two perceptual experiments were performed to test L1-Mandarin learners on their discrimination ability between the target Portuguese form and the deviant form employed in L2 production. Expanding on prior research, in this study, the perceptual motivation for L2 speech difficulties was assessed in different syllable constituents (onset and coda) and at both segmental and suprasegmental levels (structural modification). The results demonstrate that some deviant forms observed in L2 production indeed have a perceptual motivation ([w] for the velarised lateral; [l] and [ɾə] for the tap), while some others cannot be attributed to misperception (deletion of syllable-final tap). Furthermore, learners confused the intervocalic /l/ and /ɾ/ bidirectionally in perception, while in production they never misproduced the lateral (/ɾ/ → [l], */l/ → [ɾ]), revealing a mismatch between two speech modalities. By contrast, the order of acquisition (/ɾ/coda > /ɾ/onset) was shown to be consistent in L2 perception and production. The correspondence and discrepancy between the two speech modalities signal a complex relationship between L2 speech perception and production. To assess the plasticity of L2 categories /l/ and /ɾ/, two groups of L1-Mandarin learners who differ substantially in terms of L2 experience were recruited in the perceptual tasks. Our study shows that both groups behaved similarly in terms of the discrimination performance. No evidence for a role of L2 experience was found. The implication of this null result on L2 phonological development is discussed. 
The third study of the thesis aims to contribute to bridging the gap between the L2 experimental evidence and formal theories. Adopting the Bidirectional Phonology and Phonetics Model, we formalise some of the experimental findings that cannot be elucidated by current L2 speech theories, namely, the between- and within-subject variation in L2 phonological categorization; the interaction between phonological categorization and orthography during L2 category construction; and the asymmetry between L2 perception and production. Overall, this thesis sheds light on the complex nature of L2 phonological acquisition and provides a formal account of how different modalities interact in shaping L2 speech learning. Moreover, it puts forward testable predictions for future research and suggestions for improving foreign language teaching/training methodologies.

It is well known that confusions involving /l/ and /ɾ/ are one of the most perceptible characteristics of Portuguese as spoken by Chinese learners. Recent empirical studies reveal that Chinese learners’ difficulty is not restricted to moderate discrimination between the two L2 categories, given that the acquisition of Portuguese /l/ and /ɾ/ by Chinese learners seems to be subject to the interaction among different prosodic contexts, speech modalities and representational levels. This thesis aims to deepen our understanding of this process of L2 phonological acquisition, exploring what conditions the development of L2 phonological categories in different syllable constituents and how the modalities interact during this process, drawing on both experimental tasks and theoretical formalisation. The first study investigates the role of cross-linguistic influence and of orthography in the construction of L2 categories.
To elicit cross-linguistic influence directly, a delayed-imitation task was administered to native Mandarin speakers with no knowledge of Portuguese, thereby investigating how Mandarin phonology categorizes the Portuguese input ([l], [ɾ]) in intervocalic simple onset and in word-internal coda. In addition, orthographic influence on the construction of L2 phonological representations was examined by manipulating the type of input presented in the experiment (auditory input vs. auditory + orthographic input). The results of the experimental condition in which participants received both types of input replicated the prosodic effect observed in the literature, providing evidence for the interaction between phonological categorization and orthography in the construction of L2 categories. In the second study, we investigate the interaction between speech perception and production in the acquisition of European Portuguese liquids by Chinese learners, and the plasticity of these phonological categories, addressing the following questions: 1) do deviant L2 productions result from misperception? 2) is the order of acquisition in L2 consistent across perception and production? 3) do L2 categories remain malleable at an intermediate stage of acquisition? Two perceptual tasks were conducted to test L1-Mandarin learners’ ability to discriminate between the Portuguese target form and the deviant forms used in production. In the present study, the perceptual motivation for L2 difficulties was tested in different syllable constituents (simple onset and coda) and at the segmental and suprasegmental levels (structural modification). The results show that some deviant forms produced by Chinese learners have a perceptual motivation (i.e. [w] for the velarised lateral; [l] and [ɾə] for the alveolar tap), while others cannot be analysed as cases of misperception (as is the case for deletion of the tap in coda).
Furthermore, in intervocalic position, learners show bidirectional difficulty in discriminating /l/ and /ɾ/, but in production the lateral is never produced incorrectly (/ɾ/ → [l], */l/ → [ɾ]). This reveals a divergence between the two speech modalities. By contrast, the order of acquisition (/ɾ/coda > /ɾ/onset) was shown to be consistent in L2 perception and production. The correspondence and the discrepancy between the two speech modalities signal a complex relationship between perception and production in L2 phonological acquisition. Regarding the plasticity of L2 categories, two groups of L1-Mandarin learners who differed substantially in L2 experience were recruited for the perceptual tasks. No significant effect of L2 experience was found. The implication of this null result for L2 phonological development is discussed. The third study of this thesis aims to help bridge the gaps between empirical L2 studies and formal theories. Adopting the Bidirectional Phonology and Phonetics Model, we formalise the experimental results that current theories of L2 phonological acquisition cannot explain, namely, between- and within-subject variation in L2 phonological categorization; the interaction between phonological categorization and orthography in the construction of L2 categories; and the asymmetry between L2 perception and production. In sum, this thesis contributes empirical data to the discussion of the complex relationship among perception, production and orthography in L2 phonological acquisition, and formalises the interaction among these modalities through a generative linguistic model. In addition, it presents testable predictions for future research and suggestions for improving methodologies for teaching/training a non-native language.

    The Distribution Of Disfluencies In Spontaneous Speech: Empirical Observations And Theoretical Implications

    This dissertation provides an empirical description of the forms of disfluencies in spontaneous speech and their distribution. Although research in this area has received much attention in the past four decades, large-scale analyses of speech corpora from multiple communication settings, languages, and speakers’ cognitive states are still lacking. An understanding of the regularities of different kinds of disfluencies, based on large speech samples across multiple domains, is essential for both theoretical and applied purposes. As an attempt to fill this gap, this dissertation takes the approach of quantitative analysis of large corpora of spontaneous speech. The selected corpora reflect a diverse range of tasks and languages. The dissertation re-examines speech disfluency phenomena, including silent pauses, filled pauses (“um” and “uh”) and repetitions, and provides the empirical basis for future work in both theoretical and applied settings. Results from the study of silent and filled pauses indicate that a potential sociolinguistic variation can in fact be explained from the perspective of the speech planning process. The descriptive analysis of repetitions has identified a new form of repetitive phenomenon: repetitive interpolation. Both the acoustic and textual properties of repetitive interpolation have been documented through rigorous quantitative analysis. The defining features of this phenomenon can be further used in designing speech-based applications such as speaker state detection. Although the goal of this descriptive analysis is not to formulate and test specific hypotheses about speech production, potential directions for future research in speech production models are proposed and evaluated. The quantitative methods employed throughout this dissertation can also be further developed into interpretable features in machine learning systems that require automatic processing of spontaneous speech.

    Evaluating pause particles and their functions in natural and synthesized speech in laboratory and lecture settings

    Pause-internal phonetic particles (PINTs) comprise a variety of phenomena including phonetic-acoustic silence, inhalation and exhalation breath noises, filler particles “uh” and “um” in English, tongue clicks, and many others. These particles are omnipresent in spontaneous speech; however, they are under-researched in both natural speech and synthetic speech. The present work explores the influence of PINTs in small-context recall experiments, develops a bespoke speech synthesis system that incorporates the PINTs pattern of a single speaker, and evaluates the influence of PINTs on recall for larger material lengths, namely university lectures. The benefit of PINTs on recall has been documented in natural speech in small-context laboratory settings; however, this area of research has been under-explored for synthetic speech. We devised two experiments to evaluate whether PINTs have the same recall benefit for synthetic material that is found with natural material. In the first experiment, we evaluated the recollection of consecutive missing digits for a randomized 7-digit number. Results indicated that an inserted silence improved recall accuracy for digits immediately following. In the second experiment, we evaluated sentence recollection. Results indicated that sentences preceded by an inhalation breath noise were better recalled than those with no inhalation. Together, these results reveal that in single-sentence laboratory settings PINTs can improve recall for synthesized speech. The speech synthesis systems used in the small-context recall experiments did not provide much freedom in terms of controlling PINT type or location. Therefore, we endeavoured to develop bespoke speech synthesis systems. Two neural text-to-speech (TTS) systems were created: one that used PINTs annotation labels in the training data, and another that did not include any PINTs labeling in the training material.
The first system allowed fine-tuned control for inserting PINTs material into the rendered material. The second system produced PINTs probabilistically. To the best of our knowledge, these are the first TTS systems to render tongue clicks. Equipped with greater control of synthesized PINTs, we returned to evaluating the recall benefit of PINTs. This time we evaluated the influence of PINTs on the recollection of key information in lectures, an ecologically valid task that focused on larger material lengths. Results indicated that key information that followed PINTs material was less likely to be recalled. We were unable to replicate the benefits of PINTs found in the small-context laboratory settings. This body of work showcases that PINTs improve recall for TTS in small-context environments, just as previous work had indicated for natural speech. Additionally, we’ve provided a technological contribution via a neural TTS system that exerts finer control over PINT type and placement. Lastly, we’ve shown the importance of using material rendered by speech synthesis systems in perceptual studies. This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) within the project “Pause-internal phonetic particles in speech communication” (project number: 418659027; project IDs: MO 597/10-1 and TR 468/3-1). Associate member of SFB1102 “Information Density and Linguistic Encoding” (project number: 232722074).

    Who is in charge? An L2 Discourse Intonation Study on Four Prosodic Parameters to Exert the Pragmatic Function of Dominance and Control in the Context of L2 Non-specialist Public Speaking

    This paper reports the findings from a study of the learning of English intonation by Spanish speakers within the discourse mode of L2 oral presentation. The purpose of this experiment is, firstly, to compare four prosodic parameters before and after an L2 discourse intonation training programme and, secondly, to confirm whether subjects, after the aforementioned L2 discourse intonation training, are able to match the form of these four prosodic parameters to the discourse-pragmatic function of dominance and control. The study designed the instructions and tasks to create the oral and written corpora, and Brazil’s Pronunciation for Advanced Learners of English was adapted for the pedagogical aims of the present study. The learners’ pre- and post-tasks were acoustically analysed, and a pre-/post-questionnaire design was applied to interpret the acoustic analysis. Results indicate that most of the subjects acquired a wider choice of the four prosodic parameters, partly due to the prosodically annotated transcripts that were developed throughout the L2 discourse intonation course. Conversely, qualitative and quantitative data reveal that most subjects failed to match the forms to their appropriate pragmatic functions to express dominance and control in an L2 oral presentation.