7 research outputs found

    An online system for entering and annotating non-native Mandarin Chinese speech for language teaching

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (leaves 59-62). This thesis describes the design and implementation of an intuitive online system for the annotation of non-native Mandarin Chinese speech by native Chinese speakers. This system will allow speech recognition researchers to easily generate a corpus of labeled non-native speech. We have five native Chinese speakers test the annotation system on a sample bank of 250 Chinese utterances and observe fair to moderate inter-rater agreement scores. In addition to giving us a benchmark for inter-rater agreement, this also demonstrates the feasibility of having remote graders annotate sets of utterances. Finally, we extend our work to Chinese language instruction by creating a web-based interface for Chinese reading assignments. Our design is a simple, integrated solution for completing and correcting spoken reading assignments that also streamlines the compilation of a corpus of labeled non-native speech for use in future research. By Andrea Johanna Hawksley. M.Eng.

    Engaging Stage 1 students in Western Sydney Mandarin classes through pictographic characters: a unit of work for Stage 1 children

    This thesis reports a case study conducted in the Western Sydney region, where the teacher-researcher teaches Mandarin in a Stage 1 class. The study aimed to construct a localised unit of work that focuses on Chinese pictographic characters to improve student engagement. It adopted suggestions from local teachers and then applied six lessons to a specific class within a Western Sydney public school context. Data were collected from interviews, reflective journals, formative assessments, post-it note questions and student focus groups. The teacher-researcher concludes that the unit of work has many positive effects. Games, pictures and videos help to increase student engagement behaviourally, emotionally and cognitively. Further, the pictographs help students develop stronger memories of the Chinese characters they learned. However, future research could focus on the writing order and pronunciation of Hanzi.

    Methods for pronunciation assessment in computer aided language learning

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 149-176). Learning a foreign language is a challenging endeavor that entails acquiring a wide range of new knowledge including words, grammar, gestures, sounds, etc. Mastering these skills requires extensive practice by the learner, and opportunities may not always be available. Computer Aided Language Learning (CALL) systems provide non-threatening environments where foreign language skills can be practiced wherever and whenever a student desires. These systems often have several technologies to identify the different types of errors made by a student. This thesis focuses on the problem of identifying mispronunciations made by a foreign language student using a CALL system. We make several assumptions about the nature of the learning activity: it takes place using a dialogue system, it is a task- or game-oriented activity, the student should not be interrupted by the pronunciation feedback system, and the goal of the feedback system is to identify severe mispronunciations with high reliability. Detecting mispronunciations requires a corpus of speech with human judgements of pronunciation quality. Typical approaches to collecting such a corpus use an expert phonetician to both phonetically transcribe and assign judgements of quality to each phone in a corpus. This is time consuming and expensive. It also places an extra burden on the transcriber. We describe a novel method for obtaining phone level judgements of pronunciation quality by utilizing non-expert, crowd-sourced, word level judgements of pronunciation. Foreign language learners typically exhibit high variation and pronunciation shapes distinct from native speakers that make analysis for mispronunciation difficult.
We detail a simple but effective method for transforming the vowel space of non-native speakers to make mispronunciation detection more robust and accurate. We show that this transformation not only enhances performance on a simple classification task, but also results in distributions that can be better exploited for mispronunciation detection. This transformation of the vowel space is exploited to train a mispronunciation detector using a variety of features derived from acoustic model scores and vowel class distributions. We confirm that the transformation technique results in a more robust and accurate identification of mispronunciations than traditional acoustic models. By Mitchell A. Peabody. Ph.D.

    Toward Widely-Available and Usable Multimodal Conversational Interfaces

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 159-166). Multimodal conversational interfaces, which allow humans to interact with a computer using a combination of spoken natural language and a graphical interface, offer the potential to transform the manner by which humans communicate with computers. While researchers have developed myriad such interfaces, none have made the transition out of the laboratory and into the hands of a significant number of users. This thesis makes progress toward overcoming two intertwined barriers preventing more widespread adoption: availability and usability. Toward addressing the problem of availability, this thesis introduces a new platform for building multimodal interfaces that makes it easy to deploy them to users via the World Wide Web. One consequence of this work is City Browser, the first multimodal conversational interface made publicly available to anyone with a web browser and a microphone. City Browser serves as a proof-of-concept that significant amounts of usage data can be collected in this way, allowing a glimpse of how users interact with such interfaces outside of a laboratory environment. City Browser, in turn, has served as the primary platform for deploying and evaluating three new strategies aimed at improving usability. The most pressing usability challenge for conversational interfaces is their limited ability to accurately transcribe and understand spoken natural language. The three strategies developed in this thesis (context-sensitive language modeling, response confidence scoring, and user behavior shaping) each attack the problem from a different angle, but they are linked in that each critically integrates information from the conversational context. By Alexander Gruenstein. Ph.D.

    Language technologies in speech-enabled second language learning games : from reading to dialogue

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 237-244). Second language learning has become an important societal need over the past decades. Given that the number of language teachers is far below demand, computer-aided language learning software is becoming a promising supplement to traditional classroom learning, as well as potentially enabling new opportunities for self-learning. The use of speech technologies is especially attractive to offer students unlimited chances for speaking exercises. To create helpful and intelligent speaking exercises on a computer, it is necessary for the computer to not only recognize the acoustics, but also to understand the meaning and give appropriate responses. Nevertheless, most existing speech-enabled language learning software focuses only on speech recognition and pronunciation training. Very few have emphasized exercising the student's composition and comprehension abilities and adopting language technologies to enable free-form conversation emulating a real human tutor. This thesis investigates the critical functionalities of a computer-aided language learning system, and presents a generic framework as well as various language- and domain-independent modules to enable building complex speech-based language learning systems. Four games have been designed and implemented using the framework and the modules to demonstrate their usability and flexibility, where dynamic content creation, automatic assessment, and automatic assistance are emphasized. The four games, reading, translation, question-answering and dialogue, offer different activities with gradually increasing difficulty, and involve a wide range of language processing techniques, such as language understanding, language generation, question generation, context resolution, dialogue management and user simulation.
User studies with real subjects show that the systems were well received and judged to be helpful. By Yushi Xu. Ph.D.

    MULTIMEDIALITÀ WEB E DIDATTICA DELLA PROSODIA

    The presence of the Web and the introduction of immersive and collaborative computer technology in the teaching of foreign languages can help to re-create a relaxed cognitive environment and overcome the gap between formal and spontaneous learning. However, it is important to consider the 'mode' in which technologies help to create a learning environment in which the cognitive and social aspects of human-machine interaction are intertwined to form a single operating environment. In this perspective the enveloping multimediality of the Web stimulates different types of awareness in the various parts of the brain, slowly modifying processes of perception and cognitive strategies, and submitting them to constant stimuli from a multimedial habitat naturaliter. In this perspective, there is a close relationship between the grammars of the media, the functioning of the human sensory systems and the cognitive schemata that influence them.
The aim of this research study is to apply collaborative computer technology and multimedial environments to adult Foreign Language (FL) and Second Language (SL) teaching (in particular Italian as a foreign language), in order to develop and enhance dialogical and prosodic/intonational awareness through training in listening/production of sounds and intonation patterns with the support of speech analysis tools. The main hypothesis on which the research is based is that technology, normally used in individual training, can in fact maximize opportunities for immersive group learning, by recovering and extending modes of L1 language knowledge into SL teaching and learning, in accordance with Krashen’s theory of the natural approach. The data collected during the trial, which used authentic oral texts with ab initio students from the National University of Ireland - Galway, indicate that the implementation of audio-visual feedback helps learners to improve their FL production and to get closer to the target utterance. This is done through the support of an immediate and easy-to-read visual image of the differences between L1 and FL. The field of research of this paper is interdisciplinary and involves both Linguistics and Computer Assisted Language Learning. The field of the experiment concerns the teaching of Italian as FL in blended learning, therefore halfway between face-to-face and distance learning, through the integrated use of the perceptive and the instrumental method in the acquisition of intonation.