
    Within-Speaker Features for Native Language Recognition in the Interspeech 2016 Computational Paralinguistics Challenge

    The Interspeech 2016 Native Language recognition challenge was to identify the first language of 867 speakers from their spoken English. Effectively this was an L2 accent recognition task where the L1 was one of eleven languages. The lack of transcripts of the spontaneous speech recordings meant that the currently best performing accent recognition approach (ACCDIST) developed by the author could not be applied. Instead, the objectives of this study were to explore whether within-speaker features found to be effective in ACCDIST would also have value within a contemporary GMM-based accent recognition approach. We show that while Gaussian mean supervectors provide the best performance on this task, small gains may be had by fusing the mean supervector system with a system based on within-speaker Gaussian mixture distances

    Perception of English and Polish obstruents

    This work focuses on the voiced-voiceless contrast in the perception of English and Polish obstruents. The research methodology is based on acoustic manipulation of the temporal and spectral parameters involved in implementing the voicing contrast in the languages studied. Three groups of participants were compared: beginner learners of English, advanced users of English, and native speakers of English. The work consists of two theoretical parts, which outline the issues and contrast the strategies for implementing the voicing contrast in the two languages, and a research part, which presents the methodology applied and the analysis of the results. The first part addresses the role of speech perception in linguistic research. It touches on aspects such as the lack of a direct relation between the acoustic signal and the phonological category, and the exceptional plasticity and adaptability of human speech perception, and it reports proposals for a comprehensive account of how human speech perception works. In subsequent sections the work discusses perception in the context of language contact, that is, the discrimination of acoustic contrasts that occur in the foreign language but are absent from the first language. It also reviews models that describe this process, as well as hypotheses about potential success in mastering effective perception of perceptual contrasts occurring in the foreign language. The second part concentrates on temporal and acoustic differences in the implementation of voicing in English and Polish. It describes aspects such as Voice Onset Time, vowel duration, closure duration, frication duration, devoicing, and burst duration. The third, research part presents the material under study, the methodology used to manipulate it, and the characteristics of the groups. Hypotheses based on the theoretical assumptions are then verified against the results obtained. 
The final part discusses the perceptual problems faced by Poles learning English and draws pedagogical conclusions

    Temporal and spectral parameters in perception of the voicing contrast in English and Polish

    This work focuses on the temporal and spectral parameters in the perception of voicing in English and Polish. The research methodology is based on acoustic manipulation of the temporal and spectral parameters involved in implementing the voicing contrast in the languages studied. Three groups of participants were compared: beginner learners of English, advanced users of English, and native users of English. The work consists of two theoretical parts, which outline the issues and compare the different strategies for implementing the voicing contrast in the two languages, and a research part, which presents the methodology applied and the analysis of the results. The first part addresses the role of speech perception in linguistic research. It touches on aspects such as the lack of a direct relation between the acoustic signal and the phonological category, and the exceptional plasticity and adaptability of human speech perception, and it also reports proposals for a comprehensive account of how human speech perception works. In subsequent sections the work discusses perception in the context of language contact, that is, the discrimination of sound contrasts that occur in the foreign language but are absent from the first language. It also reviews models that describe this process, as well as hypotheses about potential success in mastering effective perception of sound contrasts occurring in the foreign language. The second part of the work concentrates on temporal and acoustic differences in the implementation of voicing in English and Polish. It describes aspects such as the VOT parameter, vowel duration, closure duration, frication duration, devoicing, and burst duration. The third, research part presents the material used in the perception study, the methodology used to manipulate this material, and the characteristics of the participant groups. 
Hypotheses based on the theoretical assumptions are then verified against the results obtained. The final part discusses the perceptual problems faced by Poles learning English and presents conclusions for teaching

    Hakka tone training for native speakers of tonal and nontonal languages

    Language learning becomes increasingly difficult when novel linguistic features are introduced. Studies have shown that learners from various language backgrounds can be trained to perceive lexical tone, which assigns meaning to words using variations in pitch. In this thesis, we investigated whether native speakers of tonal Mandarin Chinese and tonal Vietnamese outperformed native speakers of nontonal English when learning Hakka Chinese tones following five sessions of tone training, and whether the complexity (i.e., density) of a listener’s native tone inventory facilitated nonnative tone learning. All groups improved in tone identification and tone word learning following training, with improvements persisting three weeks after training ended. Although both tonal groups outperformed the English group in most tasks, the Mandarin group showed the most consistent advantages over the English group across tasks. Findings suggest that tone experience bolsters tone learning, but density of the tone inventory does not provide an advantage. Confusion patterns offer detailed insight into the interaction between nonnative tones and native tonal and intonational categories

    L2 speech learning of European Portuguese /l/ and /ɾ/ by L1-Mandarin learners: experimental evidence and theoretical modelling

    It has long been recognized that the poor distinction between /l/ and /ɾ/ is one of the most perceptible characteristics in Chinese-accented Portuguese. Recent empirical research revealed that this notorious L2 speech learning difficulty goes beyond the confusion between two L2 categories, as L1-Mandarin learners’ acquisition of Portuguese /l/ and /ɾ/ seems to be subject to the interaction among different prosodic positions, speech modalities and representational levels. This thesis aims to deepen our current understanding of this L2 speech learning process, by exploring what constrains the development of L2 phonological categories across syllable positions and how different modalities interact during this process. To achieve this goal, both experimental tasks and theoretical modelling were employed. The first study of this thesis explores the role of cross-linguistic influence and orthography on L2 category formation. In order to elicit cross-linguistic influence directly, a delayed-imitation task was performed with L1-Mandarin naïve listeners. This task examined how the Mandarin phonology parses the Portuguese input ([l], [ɾ]) in intervocalic onset and in word-internal coda position. Moreover, whether orthography plays a role during the construction of L2 phonological representation was tested by manipulating the input types that were given in the experiment (auditory input alone vs. auditory + written input). Our study shows that naïve Mandarin listeners’ responses corroborated those of L1-Mandarin learners, suggesting that cross-linguistic influence is responsible for the observed L2 prosodic effects. Moreover, the Mandarin [ɻ] (a repair strategy for /ɾ/) occurred almost exclusively when the written form was given, providing evidence for the cross-linguistic interaction between phonological categorization and orthography during the construction of L2 categories. 
In the second study, we first investigate the interaction between speech perception and production in L2 speech learning, by examining whether the L2 deviant productions stem from misperception and whether the order of acquisition in L2 speech perception mirrors that in production. Secondly, we test whether L2 phonological categories remain malleable at a mid-late stage of L2 speech learning. Two perceptual experiments were performed to test L1-Mandarin learners on their discrimination ability between the target Portuguese form and the deviant form employed in L2 production. Expanding on prior research, in this study, the perceptual motivation for L2 speech difficulties was assessed in different syllable constituents (onset and coda) and at both segmental and suprasegmental levels (structural modification). The results demonstrate that some deviant forms observed in L2 production indeed have a perceptual motivation ([w] for the velarised lateral; [l] and [ɾə] for the tap), while some others cannot be attributed to misperception (deletion of syllable-final tap). Furthermore, learners confused the intervocalic /l/ and /ɾ/ bidirectionally in perception, while in production they never misproduced the lateral (/ɾ/ → [l], */l/ → [ɾ]), revealing a mismatch between two speech modalities. By contrast, the order of acquisition (/ɾ/coda > /ɾ/onset) was shown to be consistent in L2 perception and production. The correspondence and discrepancy between the two speech modalities signal a complex relationship between L2 speech perception and production. To assess the plasticity of L2 categories /l/ and /ɾ/, two groups of L1-Mandarin learners who differ substantially in terms of L2 experience were recruited in the perceptual tasks. Our study shows that both groups behaved similarly in terms of the discrimination performance. No evidence for a role of L2 experience was found. The implication of this null result on L2 phonological development is discussed. 
The third study of the thesis aims to contribute to bridging the gap between the L2 experimental evidence and formal theories. Adopting the Bidirectional Phonology and Phonetics Model, we formalise some of the experimental findings that cannot be elucidated by current L2 speech theories, namely, the between- and within-subject variation in L2 phonological categorization; the interaction between phonological categorization and orthography during L2 category construction; and the asymmetry between L2 perception and production. Overall, this thesis sheds light on the complex nature of L2 phonological acquisition and provides a formal account of how different modalities interact in shaping L2 speech learning. Moreover, it puts forward testable predictions for future research and suggestions for improving foreign language teaching/training methodologies.
It is well known that substitutions involving /l/ and /ɾ/ are among the most perceptible characteristics of Portuguese as spoken by Chinese learners. Recent empirical studies reveal that the difficulty experienced by Chinese learners is not restricted to the moderate discrimination between the two L2 categories, since the acquisition of Portuguese /l/ and /ɾ/ by Chinese learners appears to be subject to interaction among prosodic contexts, among speech modalities, and among different representational levels. This thesis aims to deepen our understanding of this process of L2 phonological acquisition by exploring what conditions the development of L2 phonological categories in different syllable constituents and how the modalities interact during this process, drawing on both experimental tasks and theoretical formalisation. The first study examines the role of cross-linguistic influence and of orthography in the construction of L2 categories. 
To elicit cross-linguistic influence directly, a delayed-imitation task was administered to native speakers of Mandarin with no knowledge of Portuguese, thereby investigating how Mandarin phonology categorises the Portuguese input ([l], [ɾ]) in intervocalic simple onset and in word-medial coda. In addition, orthographic influence on the construction of L2 phonological representations was examined by manipulating the type of input presented in the experiment (auditory input vs. auditory + orthographic input). The results from the experimental condition in which participants received both types of input replicated the prosodic effect observed in the literature, providing evidence for the interaction between phonological categorisation and orthography in the construction of L2 categories. In the second study, we investigate the interaction between speech perception and production in the acquisition of European Portuguese liquids by Chinese learners, and the plasticity of these phonological categories, addressing the following questions: 1) do deviant L2 productions result from misperception? 2) is the order of acquisition in L2 consistent across perception and production? 3) do L2 categories remain malleable at an intermediate stage of acquisition? Two perceptual tasks were conducted to test L1-Mandarin learners' ability to discriminate between the Portuguese target form and the deviant forms used in production. In the present study, the perceptual motivation for L2 difficulties was tested in different syllable constituents (simple onset and coda) and at the segmental and suprasegmental levels (structural modification). The results show that some deviant forms produced by Chinese learners have a perceptual motivation (i.e. [w] for the velarised lateral; [l] and [ɾə] for the alveolar tap), whereas others cannot be analysed as cases of misperception (as is the case for deletion of the tap in coda). 
Moreover, in intervocalic position, learners show difficulty discriminating /l/ and /ɾ/ bidirectionally, yet in production the lateral is never produced incorrectly (/ɾ/ → [l], */l/ → [ɾ]). This reveals a divergence between the two speech modalities. By contrast, the order of acquisition (/ɾ/coda > /ɾ/onset) was shown to be consistent in L2 perception and production. The correspondence and the discrepancy between the two speech modalities signal a complex relationship between perception and production in L2 phonological acquisition. Regarding the plasticity of L2 categories, two groups of L1-Mandarin learners who differed substantially in L2 experience were recruited for the perceptual tasks. No significant effect of L2 experience was found. The implication of this null result for L2 phonological development is discussed. The third study of this thesis aims to help bridge the gaps between empirical L2 studies and formal theories. Adopting the Bidirectional Phonology and Phonetics Model, we formalise the experimental results that current theories of L2 phonological acquisition cannot explain, namely the between- and within-subject variation in L2 phonological categorisation; the interaction between phonological categorisation and orthography in the construction of L2 categories; and the asymmetry between L2 perception and production. In sum, this thesis contributes empirical data to the discussion of the complex relationship among perception, production and orthography in L2 phonological acquisition and formalises the interaction among these modalities by means of a generative linguistic model. It also presents testable predictions for future research and suggestions for improving methodologies for teaching/training a non-native language

    Predictions on markedness and feature resilience in loanword adaptation

    Normally, a loanword is adapted so that its foreign elements fit into the phonological system of the borrowing language. Some authors (cf. Miao 2005; Steriade 2001b, 2009) have argued that, when a consonant is adapted, manner-of-articulation features are more resistant to change than laryngeal features (e.g. voicing) or place features. My results show, however, that manner features (e.g. [±continuant]) are involved in consonant adaptations as frequently as other features (e.g. [±voice] and [±anterior]). For example, French /Z/ is illicit word-initially in English. The adaptation options include /Z/ → [z] (change of place), /Z/ → [S] (change of voicing) and /Z/ → [dZ] (change of manner). Contrary to the predictions of the authors cited above, the primary adaptation in English is /Z/ → [dZ], with a change of manner (e.g. French [Zelatin] gélatine → English [dZElœtIn]). Rather than a resistance of manner features, the adaptations studied in my thesis reveal a clear tendency towards simplification. My hypothesis is that languages adapt foreign consonants by eliminating their complexities. A change involving the deletion rather than the insertion of a marked feature will therefore be preferred. My thesis also breaks new ground by showing that a consonant is most often imported when its primary adaptation strategy involves the insertion of a marked feature. Importation rates are systematically high for consonants whose adaptation would involve inserting such a feature (here [+continuant] or [+voice]). For example, English /dZ/, when adapted, becomes /Z/ in French after insertion of [+continuant]; however, importation of /dZ/ is by far preferred over its adaptation (89%). 
By comparison, /dZ/ is rarely imported (10%) in Pennsylvania German (PG) because adaptation of /dZ/ to [tS] (deletion of the marked feature [+voice]) is available, unlike in French. However, English word-initial /t/ is mostly imported (74%) in PG because its adaptation to /d/ would involve inserting the marked feature [+voice]. My thesis not only clarifies the direction of adaptations but also identifies what strongly favours importations, on the basis of a notion already established in phonology: markedness.
A loanword is normally adapted to fit its foreign elements to the phonological system of the borrowing language (L1). Recently, some authors (e.g. Miao 2005; Steriade 2001b, 2009) have proposed that during the adaptation process of a second language (L2) consonant, manner features are more resistant to change than are non-manner features. A careful study of my data indicates that manner features (e.g. [±continuant]) are as likely to be involved in the adaptation process as are non-manner [±voice] and [±anterior]. For example, French /Z/ is usually not tolerated word-initially in English. Adaptation options include /Z/ → [z] (change of place), /Z/ → [S] (change of voicing) and /Z/ → [dZ] (change of manner). The primary adaptation in English is /Z/ → [dZ] (e.g. French [Zelatin] gélatine → English [dZElœtIn]) where manner is in fact the least resistant. Instead, during loanword adaptation there is a clear tendency towards unmarkedness. My hypothesis is that languages overwhelmingly adapt with the goal of eliminating the complexities of the L2; a change that involves deletion instead of insertion of a marked feature is preferred. Furthermore, my thesis shows for the first time that a consonant is statistically most likely to be imported if its preferred adaptation strategy involves insertion of a marked feature (e.g. [+continuant] or [+voice]). 
For example, the adaptation of English /dZ/ is /Z/ in French after insertion of marked [+continuant], but /dZ/ is overwhelmingly imported (89%), instead of adapted in French. I argue that this is to avoid the insertion of marked [+continuant]. This contrasts with Pennsylvania German (PG) where English /dZ/ is rarely imported (10%). This is because unlike in French, there is an option to adapt /dZ/ to /tS/ (deletion of marked [+voice]) in PG. However, English word-initial /t/ is heavily imported (74%), not adapted, in PG because adaptation to /d/ involves insertion of marked [+voice]. Not only does my thesis better determine the direction of adaptations but it also establishes the circumstances where L2 consonants are most likely to be imported instead of being adapted, on the basis of a well-known notion in phonology: markedness

    Word hypothesis of phonetic strings using hidden Markov models

    This thesis investigates a stochastic modeling approach to word hypothesis of phonetic strings for a speaker independent, large vocabulary, continuous speech recognition system. The stochastic modeling technique used is Hidden Markov Modeling. Hidden Markov Models (HMM) are probabilistic modeling tools most often used to analyze complex systems. This thesis is part of a speaker independent, large vocabulary, continuous speech understanding system under development at the Rochester Institute of Technology Research Corporation. The system is primarily data-driven and is void of complex control structures such as the blackboard approach used in many expert systems. The software modules used to implement the HMM were created in COMMON LISP on a Texas Instruments Explorer II workstation. The HMM was initially tested on a digit lexicon and then scaled up to a U.S. Air Force cockpit lexicon. A sensitivity analysis was conducted using varying error rates. The results are discussed and a comparison with Dynamic Time Warping results is made

    Loanwords in Context: Lexical Borrowing from English to Japanese and its Effects on Second-Language Vocabulary Acquisition

    Research has shown that cognates between Japanese and English have the potential to be a valuable learning tool (Daulton, 2008). Yet little is known about how Japanese learners of English produce cognates in context. Recently, studies have argued that cognates can cause a surprisingly high number of syntactic errors in sentence writing activities with Japanese learners (Rogers, Webb, & Nakata, 2014; Masson, 2013). In the present study, I investigated how Japanese learners of English understood and used true cognates (words that have equivalent meanings in both languages) and non-true cognates (words where the Japanese meaning differs in various ways from their English source words). Via quasi-replication, I analyzed participants' sentences to determine the interaction of true and non-true cognates on semantics and syntax. In an experimental study, twenty Japanese exchange students filled out a word knowledge scale of thirty target words (half true cognates and half non-true cognates) and wrote sentences for the words they indicated they knew. These sentences were analyzed quantitatively and qualitatively for both semantic and syntactic errors. Sentences with true cognates were semantically accurate 86% of the time, while those with non-true cognates were accurate only 62.3% of the time, which was a statistically significant difference. When the sentences were analyzed for syntax, there was no statistically significant difference in the number of errors between true and non-true cognates, which contrasts with previous research. Qualitative analysis revealed that the most problematic syntactic issue across both cognate types was using collocations correctly. Among those collocational issues, there were clear differences in the types of errors between true and non-true cognates. 
True cognate target words were more likely to lead to problems with prepositional collocations, while non-true cognate target words were more likely to lead to problems with verb collocations. These results suggest that for intermediate Japanese learners of English, semantics of non-true cognates should be prioritized in learning, followed by syntax of true and non-true cognates, which should be taught according to the most problematic error types per cognate status