1,612 research outputs found

    “A better me”: Using acoustically modified learner voices as models

    This paper presents the results of a brief mixed-methods intervention which sought to modify the production of prominence-related features in L2 English by four native French-speaking university lecturers, in read-aloud speech. Selected parts of participants’ productions were acoustically modified and then used as the model in a Listen-and-Repeat protocol, where both quantitative (acoustic measures) and qualitative (free comments from discussion) data were collected. Acoustic measures were taken again from productions realized three months after the protocol, to trace longer-term retention of modifications; expert listeners compared a selection of these productions to the original, diagnostic renditions, rating the degree of native-like rhythm and melody. Analysis of the quantitative and qualitative results confirms that imitating oneself can help individuals to modify prominence-related features of their pronunciation and that such changes can be retained over a 3-month period, but that people cannot reliably judge what they have modified. New potential is thus shown for Listen-and-Repeat, using one’s own modified voice, as an effective technique in pronunciation instruction.

    Developing Sparse Representations for Anchor-Based Voice Conversion

    Voice conversion is the task of transforming speech from one speaker to sound as if it were produced by another speaker, changing the identity while retaining the linguistic content. There are many methods for performing voice conversion, but these methods often have onerous training requirements or fail in instances where one speaker has a nonnative accent. To address these issues, this dissertation presents and evaluates a novel “anchor-based” representation of speech that separates speaker content from speaker identity by modeling how speakers form English phonemes. We call the proposed method Sparse, Anchor-Based Representation of Speech (SABR), and explore methods for optimizing the parameters of this model in native-to-native and native-to-nonnative voice conversion contexts. We begin the dissertation by demonstrating how sparse coding in combination with a compact, phoneme-based dictionary can be used to separate speaker identity from content in objective and subjective tests. The formulation of the representation then presents several research questions. First, we propose a method for improving the synthesis quality by using the sparse coding residual in combination with a frequency warping algorithm to convert the residual from the source to the target speaker’s space, and add it to the target speaker’s estimated spectrum. Experimentally, we find that synthesis quality is significantly improved via this transform. Second, we propose and evaluate two methods for selecting and optimizing SABR anchors in native-to-native and native-to-nonnative voice conversion. We find that synthesis quality is significantly improved by the proposed methods over baseline algorithms, especially in native-to-nonnative voice conversion. In a detailed analysis of the algorithms, we find they focus on phonemes that are difficult for nonnative speakers of English or naturally have multiple acoustic states. 
Following this, we examine methods for adding temporal constraints to SABR via the Fused Lasso. The proposed method significantly reduces the inter-frame variance in the sparse codes compared with other methods that incorporate temporal features into sparse coding representations. Finally, in a case study, we examine the use of the SABR methods and optimizations in the context of a computer-aided pronunciation training system for building “Golden Speakers”, or ideal models for nonnative speakers of a second language to learn correct pronunciation. Under the hypothesis that the optimal “Golden Speaker” was the learner’s voice, synthesized with a native accent, we used SABR to build voice models for nonnative speakers and evaluated the resulting synthesis in terms of quality, identity, and accentedness. We found that even when deployed in the field, the SABR method generated synthesis with low accentedness and similar acoustic identity to the target speaker, validating the use of the method for building “Golden Speakers”.

    The Effect of Shadowing in Learning L2 Segments: A Perspective from Phonetic Convergence

    This study aimed to investigate the role that phonetic convergence plays in the acquisition of L2 segments. In particular, it examined whether phonetic convergence towards native speakers could help Arabic-speaking second-language (L2) learners of English improve their pronunciation of four problematic English segments (/p, v, ɛ, oʊ/). To do so, the study proceeded through several experimental phases. Phonetic convergence was first explored in the productions of Arabic L2 learners towards five different English native model talkers in a non-interactive setting. Five XAB perceptual similarity judgments and acoustic measurements of VOT, vowel duration, F0, and F1*F2 were used to evaluate phonetic convergence. Based mainly on perceptual measures of phonetic convergence, learners were divided evenly between two groups. The C-group (convergence group) received phonetic production training from the model talkers to whom they showed the highest degree of phonetic convergence, while the D-group (divergence group) received training from the model talkers they showed divergence from or the least convergence to. Training lasted three consecutive days with target segments (i.e., /p, v, ɛ, oʊ/) presented in nonsense words. Learners were trained using the shadowing technique in a low-variability training paradigm, in which each learner received training from one native model talker. Native-speaker judgments on segmental intelligibility indicated both groups showed significant improvement on the post-test; however, no significant differences were found between groups in terms of the overall magnitude of this change. Perceived convergence in learners’ speech failed to explain the improvement. However, some patterns of acoustic convergence towards their trainers, regardless of group, predicted the overall segmental intelligibility gains. The findings suggested that the more trainees converged their vowel duration and formants to their trainers, the more their performance improved. 
At the featural level, the study examined the relationship between the preexisting phonetic distance between the Arabic L2 learners of English and the model talkers before exposure and the degree of convergence. Results indicated that there was a direct relationship between how far Arabic L2 learners were from the native model talkers and the degree of convergence in all measured acoustic features. That is, the greater the baseline distance, the greater the degree of phonetic convergence. However, such a relationship might be due to the metric used to assess phonetic convergence. The relationship between phonetic convergence measured by difference in distance (DID) and the absolute baseline distance is always biased due to the way they are calculated (Cohen Priva & Sanker, 2019; MacLeod, 2021). This study found shadowing to be an effective technique to promote segmental intelligibility among Arabic speakers learning English as an L2. However, this effectiveness might be increased by trainees converging more to their trainers in vowel duration and vowel spectra or being similar to their trainers in this regard from the beginning.
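The bias in the DID metric mentioned in this abstract can be illustrated with a small simulation. This is a minimal sketch and not the study's own analysis: it assumes the commonly used definition DID = |baseline − model| − |shadowed − model| (positive values read as convergence), and the function names, the noise model, and the sample size are all hypothetical. The simulated talkers never converge, yet DID still correlates positively with baseline distance, because measurement noise inflates both quantities.

```python
import math
import random


def did(baseline: float, shadowed: float, model: float) -> float:
    """Difference in distance: positive means the shadowed token is closer to the model."""
    return abs(baseline - model) - abs(shadowed - model)


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific library calls."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_no_convergence(n: int = 2000, seed: int = 0) -> float:
    """Correlate baseline distance with DID when no talker actually converges.

    Assumption: each talker has a fixed habitual value, and both recordings
    are that value plus independent measurement noise.
    """
    rng = random.Random(seed)
    model = 0.0  # the model talker's value on an arbitrary acoustic scale
    distances, dids = [], []
    for _ in range(n):
        habitual = rng.gauss(0.0, 1.0)
        baseline = habitual + rng.gauss(0.0, 1.0)
        shadowed = habitual + rng.gauss(0.0, 1.0)  # same habitual value: no convergence
        distances.append(abs(baseline - model))
        dids.append(did(baseline, shadowed, model))
    return pearson(distances, dids)


if __name__ == "__main__":
    print(f"r(baseline distance, DID) = {simulate_no_convergence():.2f}")
```

A clearly positive correlation under this no-convergence setup is the kind of artefact that makes the baseline-distance result hard to interpret on its own.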

    The effect of teaching prosody awareness on interpreting performance: an experimental study of consecutive interpreting from English into Farsi

    This study investigates the effect of prosodic feature awareness training on the quality of interpreting by interpreter trainees. Two groups of student interpreters were formed. Participants were assigned to groups at random, but with equal division between genders (seven males in each group). The control group was then taught interpreting skills by the routine curriculum, while the experimental group spent part of the time instead on theoretical explanation and practical exercises emphasizing prosodic differences between Farsi and English. Three raters assessed the quality of the interpreter trainees’ performance in a post-test in terms of accuracy, omissions, overall coherence, grammar, expression, word choice, terminology, accentedness, pace and voice. The results show that prosodic feature awareness training did have a statistically significant effect on the quality measures: the overall assessment of the experimental group was 14 points better (on a scale between 0 and 100) than that of the control group. Moreover, the difference was larger for the phonetic/prosodic quality scales (accentedness, pace, voice) than for the other scales. These results have implications for designers of curricula for training interpreters, material producers and all who are involved in foreign-language study and pedagogy. Theoretical and Experimental Linguistic

    The effects of English proficiency on the processing of Bulgarian-accented English by Bulgarian-English bilinguals

    This dissertation explores the potential benefit of listening to and with one’s first-language accent, as suggested by the Interlanguage Speech Intelligibility Benefit (ISIB) hypothesis. Previous studies have not consistently supported this hypothesis. According to major second language learning theories, the listener’s second language proficiency determines the extent to which the listener relies on their first language phonetics. Hence, this thesis provides a novel approach by focusing on the role of English proficiency in the understanding of Bulgarian-accented English by Bulgarian-English bilinguals. The first experiment investigated whether evoking the listeners’ L1 Bulgarian phonetics would improve the speed of processing Bulgarian-accented English words, compared to Standard British English (SBE) words, and vice versa. Listeners with lower English proficiency processed Bulgarian-accented English faster than SBE, while high-proficiency listeners tended to have an advantage with SBE over the Bulgarian accent. The second experiment measured accuracy and reaction times (RT) in a lexical decision task with single-word stimuli produced by two L1 English speakers and two Bulgarian-English bilinguals. Listeners with high proficiency in English responded more slowly and less accurately to Bulgarian-accented speech than to L1 English speech, and than lower-proficiency listeners did. These accent preferences were also supported by the listeners’ RT adaptation across the first experimental block. A follow-up investigation compared the results of L1 UK English listeners to those of the bilingual listeners with the highest proficiency in English. The L1 English listeners and the bilinguals processed both accents with similar speed, accuracy and adaptation patterns, showing no advantage or disadvantage for the bilinguals. These studies support existing models of second language phonetics. Higher proficiency in L2 is associated with lesser reliance on L1 phonetics during speech processing. 
In addition, the listeners with the highest English proficiency had no advantage when understanding Bulgarian-accented English compared to L1 English listeners, contrary to ISIB. Keywords: Bulgarian-English bilinguals, bilingual speech processing, L2 phonetic development, lexical decision, proficienc

    Teaching academic writing to Iraqi undergraduate students: An investigation into the effectiveness of a genre-process approach

    A modified integrated process-genre approach (MIM) was implemented with EFL undergraduate students in Iraq. Some students subject to the MIM were better able to construct structurally complex and reasonably grounded arguments and to employ a wider range of informal reasoning patterns. Combining the merits of both the process and genre approaches has the potential to develop a more coherent model of writing by taking into account cognitive and social demands.

    Adverb + adjective collocations in a spoken learner corpus: A quantitative and qualitative approach

    In the last 70 years, there has been an increase in English studies and research on collocations (Firth 1957; Hoey, 2005; Moon, 1998; Sinclair 1991; 2004; Stubbs, 1996; 2001), which have documented that phraseology is pervasive in language (Altenberg, 1998; Biber et al., 1999; Cowie, 1991; 1992; Howarth, 1998). This also indicates that a good command of collocations is needed if learners aim to achieve native-like fluency in the L2. Indeed, research on learner production of formulaic language has shown that collocations are essential in the acquisition of a second language (Cowie, 1998; Pawley & Syder, 1983; Peters, 1983) and are a key component for the development of fluency (Ellis, 2002; 2003; Ellis et al., 2015; Howarth, 1998). Despite the surge in studies on collocations, the majority of scholars have focused on written data and on a restricted set of combinations, such as verb + noun collocations. The poor availability of spoken learner corpora and the greater attention paid to more error-prone formulaic sequences have led researchers to neglect collocations such as adverb + adjective. Intensification is an intricate part of foreign language learning (Lorenz, 1999) and deserves further attention, especially as regards spoken data, which is a better reflection of learner language (Myles, 2015). The present work investigates adverb + adjective collocations in a newly compiled spoken learner corpus of advanced Italian learners of English L2. 
The thesis adopts a Contrastive Interlanguage Analysis (Granger, 1998) approach to verify whether: a) there are any differences between the collocation production of Italian learners of English and that of native-speaker peers; b) there are any differences between the Italian learners’ collocations and the native speakers’ in terms of syntactic patterns and lexical meaning; c) L1 congruency has a transfer effect on the learner production of infrequent and/or unattested collocations. In order to address the three overarching research questions, quantitative and qualitative analyses were carried out on the Italian Spoken Learner Corpus (ISLC) and the sister corpus of LINDSEI, LOCNEC. LOCNEC was used as the native-speaker reference corpus for its high level of comparability to ISLC. For the quantitative analyses, Durrant and Schmitt’s (2009) approach was followed for the calculation of the collocations’ association measure scores (t-score and MI) based on the large reference corpus BNC, and the collocations were then divided into three categories based on their scores: collocations (t-score and MI equal to or greater than 2 and 3 respectively), infrequent/unattested collocations (t-score and MI scores unavailable due to infrequency), and grey area collocations (t-score and MI lower than 2 and 3 respectively). T-tests and Wilcoxon rank sum tests were computed on the collocations extracted from ISLC and LOCNEC, and effect sizes were calculated. In addition, the tests were employed to assess the average individual t-score and MI values of learners and native speakers. As regards the qualitative analyses, a three-fold scheme was employed to analyse two sets of collocations: the first set comprises 11 collocations with t-score and MI equal to or greater than 2 and 3 respectively and a frequency equal to or greater than 5 in the ISLC; the second set includes 9 infrequent/unattested collocations with a frequency equal to or greater than 2 in the ISLC. 
Following the scheme, the two sets of collocations extracted from both ISLC and LOCNEC were analysed by taking into account their collocational background (etymology, CEFR level, L1 congruence), the learner variables (gender, stay-abroad experience, university course, other languages), and the text variables (attributive vs predicative function of the adjective, pronouns vs it-sentences, tense, affirmative vs negative, positive vs negative connotation). The results of the statistical tests were all significant with medium to large effect sizes and, together with the qualitative analyses, indicated that: Italian learners of English produce fewer collocations and a higher number of non-collocations; their combinations are less collocational than native speakers’ (i.e., their association measure scores are on average lower than the natives’); there are no marked differences in terms of lexico-grammatical patterns between the learners’ collocations and the native speakers’, but the learners tend to assign more pragmatically creative functions to their collocations; no evidence of L1 (negative) transfer was found in relation to the learners’ production of infrequent/unattested collocations, thus further supporting the previous finding. The findings further corroborate the literature on learners’ collocations and add another piece to the puzzle of spoken language: collocational lag, that is, the slower development of collocation performance, can also be found in spoken data, and learners also seem to produce fewer t-score collocations. This has two important, though simple, implications: that learners should probably be exposed to more spoken language input, and that second language acquisition theories might usefully review phraseological acquisition processes of EFL learners. 
Another finding is that the lexico-grammatical patterns of learners’ collocations were not markedly different from native speakers’, but they were less varied and displayed pragmatic creativity. This could inform scholars about potential fossilisation processes (Selinker, 1972) in phraseology and/or simplification or avoidance strategies (Farghal & Obiedat, 1995). Lastly, although mainstream studies have found that L1 congruency plays a role in the production of collocations (cf. Bahns, 1993; Granger, 1998b; Nesselhauf, 2005; Wang, 2016), no evidence of L1 congruency was found as regards spoken data, which is an interesting counter-finding. Overall, this thesis has underlined that collocation production, both quantitatively and pragmatically, lags behind collocation competence and, although this line may be very thin and not significant in written texts, the gap widens in spoken language.
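The three-way classification by t-score and MI described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's own procedure: it assumes the standard corpus-linguistics formulas (expected frequency E = f1·f2/N, t-score = (O − E)/√O, MI = log2(O/E)), treats a pair with zero reference-corpus frequency as "infrequent/unattested", and treats any attested pair that misses either threshold as "grey area". The function names and the frequencies in the usage lines are hypothetical.

```python
import math
from typing import Optional, Tuple


def association_scores(
    pair_freq: int, adv_freq: int, adj_freq: int, corpus_size: int
) -> Optional[Tuple[float, float]]:
    """Return (t_score, mi) for an adverb + adjective pair, or None if unattested.

    Frequencies are counts in a reference corpus of `corpus_size` tokens.
    """
    if pair_freq == 0:
        return None  # scores cannot be computed for an unattested pair
    expected = adv_freq * adj_freq / corpus_size
    t_score = (pair_freq - expected) / math.sqrt(pair_freq)
    mi = math.log2(pair_freq / expected)
    return t_score, mi


def classify(
    scores: Optional[Tuple[float, float]], t_min: float = 2.0, mi_min: float = 3.0
) -> str:
    """Map scores to one of the three categories named in the abstract."""
    if scores is None:
        return "infrequent/unattested"
    t_score, mi = scores
    if t_score >= t_min and mi >= mi_min:
        return "collocation"
    return "grey area"  # assumption: attested but below at least one threshold


if __name__ == "__main__":
    # Hypothetical counts in a one-million-token reference corpus.
    print(classify(association_scores(100, 1000, 2000, 1_000_000)))
    print(classify(association_scores(4, 1000, 2000, 1_000_000)))
    print(classify(association_scores(0, 1000, 2000, 1_000_000)))
```

Requiring both thresholds follows the abstract's wording ("t-score and MI equal to or greater than 2 and 3 respectively"); how a pair that passes only one threshold was handled in the thesis is not stated, so the grey-area fallback here is a guess.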

    Social and Psychological Factors in Bilingual Speech Production

    Studies in the fields of bilingualism and second language acquisition have shown that both cognitive and affective psychological factors can influence individuals’ bilingual speech production. More recently, both experimental and variationist studies of bilingual communities have examined the role of social factors in bilinguals’ speech, particularly in cases of long-term language contact and minority-language bilingualism. The Special Issue brings together work on the psychological and/or social factors that influence bilingual speech production as well as work that uses different methodological frameworks. We examine the role of such factors in bilingual speech production in diverse contexts, in order to provide a more holistic account of the ways in which extra-linguistic influences may affect bilinguals’ speech in one or both of their languages.