208 research outputs found

    Data-Informed language learning

    Designing, implementing, and evaluating an automated writing evaluation tool for improving EFL graduate students’ abstract writing: a case in Taiwan

    Writing English research article (RA) abstracts is a difficult but mandatory task for Taiwanese engineering graduate students (Feng, 2013). To address the current situation and needs of these students, this dissertation aimed to develop and evaluate an automated writing evaluation (AWE) tool to assist their RA abstract writing in English, following a Design-Based Research (DBR) approach as the methodological framework. DBR was chosen because it strives to solve real-world problems through multiple iterations of development, building on the results of each iteration to advance the project. Six design iterations were undertaken to develop and evaluate the AWE tool: (1) corpus compilation of engineering RAs, (2) genre analysis of engineering abstracts, (3) machine learning of move classification in abstracts, (4) analysis of lexical bundles used to express moves, (5) analysis of the choice of verb categories associated with moves, and finally, (6) AWE tool development based on the previous findings, classroom implementation, and evaluation of the AWE tool following Chapelle’s (2001) computer-assisted language learning (CALL) framework. To begin with, I collected a corpus of 480 engineering RAs (Corpus-480) to extract appropriate linguistic properties as pedagogical materials to be implemented in the AWE tool. A sub-corpus (Corpus-72) was compiled from 72 RAs randomly chosen from Corpus-480 for manual and automated analyses. Next, to find the best descriptive framework for the structure of engineering RA abstracts, two move schemata were compared: (1) IMRD (Introduction, Methodology, Results, and Discussion) and (2) CARS (Create-A-Research-Space, Swales, 1990). Abstracts in Corpus-72 were annotated, and the two schemata were evaluated according to three quantitative metrics devised specifically for this comparison. 
Applying a statistical natural language processing (StatNLP) approach, a Support Vector Machine (SVM) was trained for automated move classification in abstracts. Formulaic language in engineering RA sections was used as linguistic features to automatically classify moves in abstracts. Four-word lexical bundles and verb categories were then identified from Corpus-480 and Corpus-72, respectively. The four-word lexical bundles associated with moves in abstracts were extracted automatically, and the verb categories (i.e., tense, aspect, and voice) in moves of abstracts were identified using CyWrite::Analyzer, a hybrid (statistical and rule-based) NLP tool. Finally, the AWE tool was developed, based on the findings from the previous iterations, and implemented in an English-as-a-foreign-language (EFL) classroom setting. Through analysis of students’ drafts before and after using the tool, and of responses to a questionnaire and a semi-structured interview, the AWE tool was evaluated based on Chapelle’s (2001) CALL evaluation framework. The findings showed that students attempted to improve their abstracts by adding, deleting, or changing the sequences of their sentences, lexical bundles, and verb categories. Their attitudes toward the effectiveness and appropriateness of the tool were quite positive. Overall, the AWE tool drew students’ attention to the use of lexical bundles and verb categories to achieve the communicative purposes of each move in their abstracts. In conclusion, this dissertation started from Taiwanese engineering students’ needs to improve their English abstract writing, and attempted to develop and evaluate an AWE tool for assisting them. Following DBR, the findings from this dissertation are discussed to improve the next generation of AWE tools. With these iterations in place, future studies can focus on developing pedagogical materials from genre-based analysis in different disciplines to fulfill learners’ needs.
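As a rough illustration of the automatic extraction of four-word lexical bundles mentioned in the abstract above, the following Python sketch counts four-word sequences across a set of texts and keeps those that meet a frequency threshold and a range (number of texts) threshold. It is a minimal sketch under stated assumptions, not the dissertation's actual procedure: the function name `four_word_bundles`, the threshold values, the tokenization, and the sample abstracts are all hypothetical, and sentence boundaries are ignored for simplicity.

```python
import re
from collections import Counter, defaultdict


def four_word_bundles(texts, min_freq=2, min_range=2):
    """Return four-word sequences that occur at least `min_freq` times
    overall and in at least `min_range` different texts.

    Hypothetical helper: thresholds and tokenization are illustrative only.
    """
    freq = Counter()               # total occurrences of each four-word sequence
    doc_range = defaultdict(set)   # which texts each sequence occurs in
    for doc_id, text in enumerate(texts):
        tokens = re.findall(r"[a-z']+", text.lower())
        for i in range(len(tokens) - 3):
            gram = " ".join(tokens[i:i + 4])
            freq[gram] += 1
            doc_range[gram].add(doc_id)
    return {
        gram: count
        for gram, count in freq.items()
        if count >= min_freq and len(doc_range[gram]) >= min_range
    }


# Invented sample sentences standing in for abstracts.
abstracts = [
    "The results of the study show that the method works.",
    "We report the results of the experiment in detail.",
    "In this paper the results of the simulation are shown.",
]
bundles = four_word_bundles(abstracts, min_freq=3, min_range=3)
print(bundles)
```

Requiring a minimum range as well as a minimum frequency is a common way to avoid bundles that are frequent only because a single text repeats them.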

    Potential of Automated Writing Evaluation Feedback

    This paper presents an empirical evaluation of automated writing evaluation (AWE) feedback used for L2 academic writing teaching and learning. It introduces the Intelligent Academic Discourse Evaluator (IADE), a new web-based AWE program that analyzes the introduction sections of research articles and generates immediate, individualized, and discipline-specific feedback. The purpose of the study was to investigate the potential of IADE’s feedback. A mixed-methods approach with a concurrent transformative strategy was employed. Quantitative data consisted of responses to Likert-scale, yes/no, and open-ended survey questions; automated and human scores for first and final drafts; and pre-/posttest scores. Qualitative data comprised students’ first and final drafts as well as transcripts of think-aloud protocols and Camtasia computer screen recordings, observations, and semistructured interviews. The findings indicate that IADE’s color-coded and numerical feedback possesses potential for facilitating language learning, a claim supported by evidence of focus on discourse form, noticing of negative evidence, improved rhetorical quality of writing, and increased learning gains.

    The frequency and variability of conjunctive adjuncts in the Estonian–English Interlanguage Corpus

    The aim of this master's thesis was to create Estonia's first Estonian–English interlanguage corpus and to present the principles behind its compilation and study. More specifically, the variability and frequency of conjunctive adjuncts were investigated. The results were analyzed and then compared with a corpus of learners who are native speakers of English, the Michigan Corpus of Upper-level Student Papers. The thesis consists of four parts. The first and second parts focus on the principles of corpus creation and introduce the structure of the corpus study. The importance of aspects such as quantity, quality, documentation, and simplicity is discussed. Each aspect is analyzed, highlighting strengths, weaknesses, and possible bottlenecks. The empirical part of the thesis (the third and fourth parts) was carried out with the freeware tool AntConc, which made it possible to produce the statistical data that were later analyzed. The results showed that Estonian students use a variety of conjunctive adjuncts, which fall into five categories according to Halliday and Hasan's (1976) classification. The results also show that Estonian students are consistent in their use of conjunctive adjuncts such as firstly, secondly, in conclusion, and to sum up. The study identified overuse of but and and. The use of but can be considered problematic, because students repeatedly made errors in using it (placing it at the beginning of a sentence). In conclusion, teaching Estonian students a wider variety of conjunctive adjuncts would help ensure coherence in argumentative writing. Borrowing conjunctive adjuncts from the native-speaker learner corpus would be helpful, because overall variability there was greater than in the Estonian–English interlanguage corpus.
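The kind of frequency comparison described in the abstract above can be sketched in a few lines of Python. The thesis itself used AntConc; the code below is only an assumed, simplified stand-in that computes how often each conjunctive adjunct occurs per 1,000 words, so that corpora of different sizes can be compared. The item list `ADJUNCTS`, the function name `per_thousand`, and the sample text are hypothetical.

```python
import re

# Illustrative items only; the thesis worked with a fuller inventory
# grouped by Halliday and Hasan's (1976) categories.
ADJUNCTS = ["but", "and", "firstly", "secondly", "in conclusion", "to sum up"]


def per_thousand(text, items=ADJUNCTS):
    """Return the frequency of each item per 1,000 word tokens in `text`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    joined = " ".join(tokens)      # normalized text, so multi-word items match
    total = len(tokens)
    rates = {}
    for item in items:
        hits = len(re.findall(r"\b" + re.escape(item) + r"\b", joined))
        rates[item] = 1000 * hits / total if total else 0.0
    return rates


# Invented sample standing in for a learner text.
sample = "Firstly, it rained. But we stayed, and we talked. In conclusion, it was fine."
rates = per_thousand(sample)
print(rates)
```

Running the same function over a learner corpus and a native-speaker corpus and comparing the two dictionaries gives a first indication of overuse or underuse, although any real comparison would also need significance testing.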

    Language Learning Tasks and Automatic Analysis of Learner Language: Connecting FLTL and NLP design of ICALL materials supporting use in real-life instruction

    This thesis studies the application of Natural Language Processing to Foreign Language Teaching and Learning, within the research area of Intelligent Computer-Assisted Language Learning (ICALL). In particular, we investigate the design, the implementation, and the use of ICALL materials to provide learners of foreign languages, particularly English, with automated feedback. We argue that the successful integration of ICALL materials demands a design process considering both pedagogical and computational requirements as equally important. Our investigation pursues two goals. The first one is to integrate into task design insights from Second Language Acquisition and Foreign Language Teaching and Learning with insights from computational linguistic modelling. The second goal is to facilitate the integration of ICALL materials in real-world instruction settings, as opposed to research or lab-oriented instruction settings, by empowering teachers with the methodology and the technology to autonomously author such materials. To achieve the first goal, we propose an ICALL material design process that combines basic principles of Task-Based Language Instruction and Task-Based Test Design with the specification requirements of Natural Language Processing. The relation between pedagogical and computational requirements is elucidated by exploring (i) the formal features of foreign language learning activities, (ii) the complexity and variability of learner language, and (iii) the feasibility of applying computational techniques for the automatic analysis and evaluation of learner responses. To achieve the second goal, we propose an automatic feedback generation strategy that enables teachers to customise the computational resources required to automatically correct ICALL activities without the need for programming skills. This proposal is instantiated and evaluated in real-world instruction settings involving teachers and learners in secondary education. 
Our work contributes methodologically and empirically to the ICALL field, with a novel approach to the design of materials that highlights the cross-disciplinary and iterative nature of the task. Our findings reveal the strength of characterising tasks both from the perspective of Foreign Language Teaching and Learning and from the perspective of Computational Linguistics as a means to clarify the nature of learning activities. Such a characterisation allows us to identify ICALL materials which are both pedagogically meaningful and computationally feasible. Our results show that teachers can characterise, author and employ ICALL materials as part of their instruction programme, and that the underlying computational machinery can provide the required automatic processing with sufficient efficiency. The authoring tool and the accompanying methodology become a crucial instrument for ICALL research and practice: teachers are able to design activities for their students to carry out without relying on an expert in Natural Language Processing. Last but not least, our results show that teachers value the experience very positively as a means to engage in technology integration, but also as a means to better apprehend the nature of their instruction task. Moreover, our results show that learners are motivated by the opportunity of using a technology that enhances their learning experience.

    Analyzing Authentic Texts for Language Learning: Web-based Technology for Input Enrichment and Question Generation

    Acquisition of a language largely depends on the learner's exposure to and interaction with it. Our research goal is to explore and implement automatic techniques that help create a richer grammatical intake from a given text input and engage learners in making form-meaning connections during reading. A starting point for addressing this issue is the automatic input enrichment method, which aims to ensure that a target structure is richly represented in a given text. We demonstrate the high performance of our rule-based algorithm, which is able to detect 87 linguistic forms contained in an official curriculum for the English language. Showcasing the algorithm's capability to differentiate between the various functions of the same linguistic form, we establish the task of tense sense disambiguation, which we approach by leveraging machine learning and rule-based methods. Using the aforementioned technology, we develop an online information retrieval system FLAIR that prioritizes texts with a rich representation of selected linguistic forms. It is implemented as a web search engine for language teachers and learners and provides effective input enrichment in a real-life teaching setting. It can also serve as a foundation for empirical research on input enrichment and input enhancement. The input enrichment component of the FLAIR system is evaluated in a web-based study that demonstrates that English teachers prefer automatic input enrichment to standard web search when selecting reading material for class. We then explore automatic question generation for facilitating and testing reading comprehension as well as linguistic knowledge. We give an overview of the types of questions that are usually asked and can be automatically generated from text in the language learning context. 
We argue that questions can facilitate the acquisition of different linguistic forms by providing functionally driven input enhancement, i.e., by ensuring that the learner notices and processes the form. The generation of well-established and novel types of questions is discussed and examples are provided; moreover, the results from a crowdsourcing study show that automatically generated questions are comparable to human-written ones.
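The idea of input enrichment described in the abstract above, prioritizing texts in which a target form is richly represented, can be illustrated with a small Python sketch. This is not FLAIR's algorithm: the single regular expression in `PATTERNS` is a crude, hypothetical approximation of one form (present perfect with a regular participle), and the function name `rank_by_form` and the sample texts are invented for illustration.

```python
import re

# Hypothetical, deliberately crude pattern; a real system would rely on
# linguistic analysis rather than a surface regex.
PATTERNS = {
    "present_perfect": re.compile(r"\b(?:have|has)\s+\w+(?:ed|en)\b", re.IGNORECASE),
}


def rank_by_form(texts, form):
    """Order texts by the density of the target form (matches per word),
    highest density first."""
    pattern = PATTERNS[form]
    scored = []
    for text in texts:
        n_words = len(re.findall(r"\w+", text))
        hits = len(pattern.findall(text))
        density = hits / n_words if n_words else 0.0
        scored.append((density, text))
    # sorted() is stable, so texts with equal density keep their input order.
    return [text for _, text in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Invented candidate texts.
docs = [
    "He walks to school every day.",
    "They have finished the long report about the new policy today.",
    "She has visited Rome and has walked its streets.",
]
ranked = rank_by_form(docs, "present_perfect")
print(ranked)
```

Ranking by density rather than raw count keeps long texts from dominating simply because they contain more words.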

    Development and Pedagogical Applications of an Audio-Textual English-Spanish Parallel Literary Corpus for the Study of English Phonology

    The field of Data-Driven Learning (DDL), an approach to second language learning in which the student interacts directly with corpus data, has made much progress in only a matter of a few decades. However, there are still certain frontiers that have thus far remained underexplored, mostly as a result of limited technological capabilities for a good portion of the field's existence. Until now, DDL has mainly centered on text corpora, leaving aside such aspects of language learning as oral comprehension and speech production. This doctoral dissertation presents the LITTERA corpus and examines in depth how this English-Spanish parallel literary speech corpus can be applied to language learning within the framework of DDL. The dissertation begins with a general overview of the current state of DDL, followed by a detailed description of the creation and design of the LITTERA corpus. Then a series of potential pedagogical exercises is presented, aimed at showing how LITTERA can be applied to the learning of English phonology by Spanish-speaking students. The exercises set out to examine how the different features of English prosody (co-articulatory phenomena such as linking, blending, assimilation, elision, resyllabification, and palatalization, as well as vowel reduction) can be studied in the data to improve students' oral comprehension and speech production. Furthermore, possible DDL question prompts are proposed to explore the different features in the classroom.

    Taking an Effective Authorial Stance in Academic Writing: Inductive Learning for Second Language Writers using a Stance Corpus.

    The study focuses on a pedagogical proposal using a specialized corpus to assist advanced second language (L2) writers in tackling a critical aspect of academic writing, authorial stance-taking. Novice L2 writers are often found to project an unauthoritative persona and voice, which compromises their research potential. Premised on linguistic (Systemic Functional and Corpus Linguistics) and learning theories, the study tested the hypotheses that (1) rendering explicit the complex linguistic stance resources is beneficial to inform ways to argue with a persuasive authorial stance, and (2) corpus tools encourage inductive pattern-finding, which brings about deep learning and better performance. Seven Mandarin-speaking learners of English studying at post-graduate levels were recruited to engage in five sessions of training and interaction with the tool as they wrote introductions to their research. Multiple methods were employed to analyze (1) overall writing performance, (2) stance understanding, (3) the development of stance understanding, (4) cognitive patterns in using the tool and in overall learning, and (5) tool use patterns. Results show a positive relationship between stance learning and writing performance. The semantically-oriented corpus also engaged the learners in managing the discursive flow of their stance deployment, which is critical to greater effectiveness in their writing. Their cognitive learning patterns reflected frequent “sense-making”, “exploring” and “reasoning” about the complex new knowledge, while inductive pattern-finding was infrequent. The findings suggest that training and more support are needed to optimize an inductive learning approach. In addition, individual learning styles and preconceptions are both critical in dictating learning outcomes. Learners’ cognitive burden in learning complex linguistic concepts using a second language also needs to be considered. 
Finally, to encourage use of the corpus, more examples need to be included so that users find convincing patterns of stance deployment that are relevant to their argumentative style and disciplinary background. To conclude, to better support academic writers, corpus tools show great potential in hosting examples at the discursive level to complement a lexico-grammatical approach in writing. Such tools, with annotation and textual enhancement, stimulate the writers’ attention to the structural and prosodic aspects of professional writing, an awareness needed for succeeding in a writing task.
    Ph.D., Education Studies, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/77860/1/peichin_1.pd