11 research outputs found

    L1-L2 Interference: The case of final devoicing of French voiced fricatives in final position by German learners

    This work deals with a case of L1-L2 interference in language learning. German learners of French as a second language frequently produce unvoiced fricatives in word-final position instead of the expected voiced fricatives. We investigated the production of French fricatives by 16 non-native speakers (8 beginners and 8 advanced learners) and 8 native speakers, and designed auditory feedback to help the learners realize the correct voicing feature. The productions of all speakers were categorized as either voiced or unvoiced by experts. The same fricatives were also evaluated by non-experts in a perception experiment targeting VCs. We compare the ratings by experts and non-experts with the feature-based analysis. Two features were measured: the ratio of locally unvoiced frames in the consonantal segment, and the ratio between consonantal duration and V1 duration. The acoustic cues of neighboring sounds and pitch-based features play a significant role in the voicing judgment. As expected, we found that beginners have more difficulty producing voiced fricatives than advanced learners. Production also becomes easier for the learners, especially for the beginners, if they practice by repeating after a native speaker. We use these findings to design and develop feedback, based on the learner's own voice, via the speech analysis/synthesis technique TD-PSOLA.
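
The two measurements named in the abstract above can be illustrated with a minimal sketch. It assumes that each analysis frame of the consonantal segment already carries a voiced/unvoiced decision and that segment durations are known in seconds. The function names and the example values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def unvoiced_frame_ratio(voicing_flags: Sequence[bool]) -> float:
    """Share of frames in the consonantal segment that are locally unvoiced.

    voicing_flags holds one boolean per frame (True = voiced); how the
    per-frame decision is made is outside the scope of this sketch.
    """
    if not voicing_flags:
        raise ValueError("the consonantal segment has no frames")
    unvoiced = sum(1 for voiced in voicing_flags if not voiced)
    return unvoiced / len(voicing_flags)


def duration_ratio(consonant_s: float, vowel_s: float) -> float:
    """Ratio between consonantal duration and V1 duration (both in seconds)."""
    if vowel_s <= 0:
        raise ValueError("V1 duration must be positive")
    return consonant_s / vowel_s


# Hypothetical example: 8 frames in the fricative, 4 of them unvoiced.
flags = [True, True, False, False, False, False, True, True]
print(unvoiced_frame_ratio(flags))   # 0.5
print(duration_ratio(0.125, 0.25))   # 0.5
```

A higher unvoiced-frame ratio would point toward a devoiced realization, but the thresholds used by the authors are not stated in the abstract.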

    My English sounds better than yours: Second-language learners perceive their own accent as better than that of their peers

    Second language (L2) learners are often aware of the typical pronunciation errors that speakers of their native language make, yet often persist in making these errors themselves. We hypothesised that L2 learners may perceive their own accent as closer to the target language than the accent of other learners, due to frequent exposure to their own productions. This was tested by recording 24 female native speakers of German producing 60 sentences. The same participants later rated these recordings for accentedness. Importantly, the recordings had been altered to sound male so that participants were unaware of their own productions in the to-be-rated samples. We found evidence supporting our hypothesis: participants rated their own altered voice, which they did not recognize as their own, as being closer to a native speaker than that of other learners. This finding suggests that objective feedback may be crucial in fostering L2 acquisition and reducing fossilization of erroneous patterns.

    Syllable, accent, rhythm: typological and methodological considerations for teaching Spanish as a foreign language

    Rhythm is a speech property related to the temporal organization of sounds in terms of grouping. Segmentation units are language-specific and emerge from phonological properties such as syllable structure, phonotactics, and prosodic contrasts at the lexical and postlexical levels. Rhythmic differences across languages pose problems for second language acquisition, given the intricate combination of acoustic cues, the perceptual difficulties caused by phonological deafness, and the interference with the organization of segmental contrasts and with lexical access. This paper provides a typological comparison that includes descriptions of the attested rhythm classes (syllable-timed, stress-timed, and mora-timed), as well as of word prosody systems (tone, pitch-accent, and stress languages). This analysis yields predictions regarding typical errors for learners of Spanish from different linguistic backgrounds. Additionally, the Spanish syllable structure and stress system are described, and some strategies are suggested for practicing them in class.

    Italian speakers learn lexical stress of German morphologically complex words

    No full text
    Italian speakers tend to stress the second component of German morphologically complex words such as compounds and prefix verbs even if the first component is lexically stressed. To improve their prosodic phrasing an automatic pronunciation teaching method was developed based on auditory feedback of prosodically corrected utterances in the learners’ own voices. Basically, the method copies contours of F0, local speech rate, and intensity from reference utterances of a German native speaker to the learners’ speech signals. It also adds emphasis to the stress position in order to help the learners better recognise the correct pronunciation and identify their errors. A perception test with German native speakers revealed that manipulated utterances significantly better reflect lexical stress than the corresponding original utterances. Thus, two groups of Italian learners of German were provided with different feedback during a training session, one group with manipulated utterances in their individual voices and the other with correctly pronounced original utterances in the teacher’s voice. Afterwards, both groups produced the same sentences again and German native speakers judged the resulting utterances. Resynthesised stimuli, especially with emphasised stress, were found to be a more effective feedback than natural stimuli to learn the correct stress position. Since resynthesis was obtained without previous segmentation of the learners’ speech signals, this technology could be effectively included in Computer Assisted Language Learning software

    Developing Sparse Representations for Anchor-Based Voice Conversion

    Voice conversion is the task of transforming speech from one speaker to sound as if it was produced by another speaker, changing the identity while retaining the linguistic content. There are many methods for performing voice conversion, but oftentimes these methods have onerous training requirements or fail in instances where one speaker has a nonnative accent. To address these issues, this dissertation presents and evaluates a novel “anchor-based” representation of speech that separates speaker content from speaker identity by modeling how speakers form English phonemes. We call the proposed method Sparse, Anchor-Based Representation of Speech (SABR), and explore methods for optimizing the parameters of this model in native-to-native and native-to-nonnative voice conversion contexts. We begin the dissertation by demonstrating how sparse coding in combination with a compact, phoneme-based dictionary can be used to separate speaker identity from content in objective and subjective tests. The formulation of the representation then presents several research questions. First, we propose a method for improving the synthesis quality by using the sparse coding residual in combination with a frequency warping algorithm to convert the residual from the source to target speaker’s space, and add it to the target speaker’s estimated spectrum. Experimentally, we find that synthesis quality is significantly improved via this transform. Second, we propose and evaluate two methods for selecting and optimizing SABR anchors in native-to-native and native-to-nonnative voice conversion. We find that synthesis quality is significantly improved by the proposed methods, especially in native-to-nonnative voice conversion over baseline algorithms. In a detailed analysis of the algorithms, we find they focus on phonemes that are difficult for nonnative speakers of English or naturally have multiple acoustic states.
Following this, we examine methods for adding in temporal constraints to SABR via the Fused Lasso. The proposed method significantly reduces the inter-frame variance in the sparse codes over other methods that incorporate temporal features into sparse coding representations. Finally, in a case study, we examine the use of the SABR methods and optimizations in the context of a computer aided pronunciation training system for building “Golden Speakers”, or ideal models for nonnative speakers of a second language to learn correct pronunciation. Under the hypothesis that the optimal “Golden Speaker” was the learner’s voice, synthesized with a native accent, we used SABR to build voice models for nonnative speakers and evaluated the resulting synthesis in terms of quality, identity, and accentedness. We found that even when deployed in the field, the SABR method generated synthesis with low accentedness and similar acoustic identity to the target speaker, validating the use of the method for building “golden speakers”

    The role of explicit knowledge and experience with accent in the acquisition of second language sounds


    MULTIMEDIALITÀ WEB E DIDATTICA DELLA PROSODIA

    The presence of the Web and the introduction of immersive and collaborative computer technology in the teaching of foreign languages can help to re-create a relaxed cognitive environment and overcome the gap between formal and spontaneous learning. However, it is important to consider the 'mode' in which technologies help to create a learning environment in which the cognitive and social aspects of human-machine interaction are intertwined to form a single operating environment. In this perspective, the enveloping multimediality of the Web stimulates different types of awareness in the various parts of the brain, slowly modifying processes of perception and cognitive strategies and submitting them to constant stimuli from a naturally multimedial habitat. Underlying this perspective is a close relationship between the grammars of the media, the functioning of the human sensory systems and the cognitive schemata that influence them.
The aim of this research study is to apply collaborative computer technology and multimedial environments to adult Foreign Language (FL) and Second Language (SL) teaching (in particular Italian as a foreign language), in order to develop and enhance dialogical and prosodic/intonational awareness through training in listening/production of sounds and intonation patterns with the support of speech analysis tools. The main hypothesis on which the research is based is that technology, normally used in individual training, can in fact maximize opportunities for immersive group learning, by recovering and extending modes of L1 language knowledge into SL teaching and learning, in accordance with Krashen’s Natural Approach.
The data collected during the trial, which used authentic oral texts with ab initio students from the National University of Ireland - Galway, indicate that the implementation of audio-visual feedback helps learners to improve their FL production and to get closer to the target utterance. This is done through the support of an immediate and easy-to-read visual image of the differences between L1 and FL. The field of research of this paper is interdisciplinary and involves both Linguistics and Computer Assisted Language Learning. The experiment concerns the teaching of Italian as FL in blended learning, halfway between face-to-face and distance learning, through the integrated use of the perceptive and the instrumental method in the acquisition of intonation.