878 research outputs found

    Autenttisiin teksteihin perustuva tietokoneavusteinen kielen oppiminen: sovelluksia italian kielelle

    Get PDF
    Computer-Assisted Language Learning (CALL) is one of the sub-disciplines within the area of Second Language Acquisition. Clozes, also called fill-in-the-blank, are largely used exercises in language learning applications. A cloze is an exercise where the learner is asked to provide a fragment that has been removed from the text. For language learning purposes, in addition to open-end clozes where one or more words are removed and the student must fill the gap, another type of cloze is commonly used, namely multiple-choice cloze. In a multiple-choice cloze, a fragment is removed from the text and the student must choose the correct answer from multiple options. Multiple-choice exercises are a common way of practicing and testing grammatical knowledge. The aim of this work is to identify relevant learning constructs for Italian to be applied to automatic exercises creation based on authentic texts in the Revita Framework. Learning constructs are units that represent language knowledge. Revita is a free to use online platform that was designed to provide language learning tools with the aim of revitalizing endangered languages including several Finno-Ugric languages such as North Saami. Later non-endangered languages were added. Italian is the first majority language to be added in a principled way. This work paves the way towards adding new languages in the future. Its purpose is threefold: it contributes to the raising of Italian from its beta status towards a full development stage; it formulates best practices for defining support for a new language and it serves as a documentation of what has been done, how and what remains to be done. Grammars and linguistic resources were consulted to compile an inventory of learning constructs for Italian. Analytic and pronominal verbs, verb government with prepositions, and noun phrase agreement were implemented by designing pattern rules that match sequences of tokens with specific parts-of-speech, surfaces and morphological tags. The rules were tested with test sentences that allowed further refining and correction of the rules. Current precision of the 47 rules for analytic and pronominal verbs on 177 test sentences results in 100%. Recall is 96.4%. Both precision and recall for the 5 noun phrase agreement rules result in 96.0% in respect to the 34 test sentences. Analytic and pronominal verb, as well as noun phrase agreement patterns, were used to generate open-end clozes. Verb government pattern rules were implemented into multiple-choice exercises where one of the four presented options is the correct preposition and the other three are prepositions that do not fit in context. The patterns were designed based on colligations, combinations of tokens (collocations) that are also explained by grammatical constraints. Verb government exercises were generated on a specifically collected corpus of 29074 words. The corpus included three types of text: biography sections from Wikipedia, Italian news articles and Italian language matriculation exams. The last text type generated the most exercises with a rate of 19 exercises every 10000 words, suggesting that the semi-authentic text met best the level of verb government exercises because of appropriate vocabulary frequency and sentence structure complexity. Four native language experts, either teachers of Italian as L2 or linguists, evaluated usability of the generated multiple-choice clozes, which resulted in 93.55%. This result suggests that minor adjustments i.e., the exclusion of target verbs that cause multiple-admissibility, are sufficient to consider verb government patterns usable until the possibility of dealing with multiple-admissible answers is addressed. The implementation of some of the most important learning constructs for Italian resulted feasible with current NLP tools, although quantitative evaluation of precision and recall of the designed rules is needed to evaluate the generation of exercises on authentic text. This work paves the way towards a full development stage of Italian in Revita and enables further pilot studies with actual learners, which will allow to measure learning outcomes in quantitative term

    Learning from Partially Annotated Data: Example-aware Creation of Gap-filling Exercises for Language Learning

    Full text link
    Since performing exercises (including, e.g., practice tests) forms a crucial component of learning, and creating such exercises requires non-trivial effort from the teacher. There is a great value in automatic exercise generation in digital tools in education. In this paper, we particularly focus on automatic creation of gapfilling exercises for language learning, specifically grammar exercises. Since providing any annotation in this domain requires human expert effort, we aim to avoid it entirely and explore the task of converting existing texts into new gap-filling exercises, purely based on an example exercise, without explicit instruction or detailed annotation of the intended grammar topics. We contribute (i) a novel neural network architecture specifically designed for aforementioned gap-filling exercise generation task, and (ii) a real-world benchmark dataset for French grammar. We show that our model for this French grammar gap-filling exercise generation outperforms a competitive baseline classifier by 8% in F1 percentage points, achieving an average F1 score of 82%. Our model implementation and the dataset are made publicly available to foster future research, thus offering a standardized evaluation and baseline solution of the proposed partially annotated data prediction task in grammar exercise creation.Comment: 12 pages, Accepted in the 18th Workshop on Innovative Use of NLP for Building Educational Application

    Revita: a System for Language Learning and Supporting Endangered Languages

    Get PDF
    We describe a computational system for language learning and supporting endangered languages. The platform provides the user an opportunity to improve her competency through active language use. The platform currently works with several endangered Finno-Ugric languages, as well as with Yakut, and Finnish, Swedish, and Russian. This paper describes the current stage of ongoing development.Peer reviewe

    An Algorithm for Generating Gap-Fill Multiple Choice Questions of an Expert System

    Get PDF
    This research is aimed to propose an artificial intelligence algorithm comprising an ontology-based design, text mining, and natural language processing for automatically generating gap-fill multiple choice questions (MCQs). The simulation of this research demonstrated an application of the algorithm in generating gap-fill MCQs about software testing. The simulation results revealed that by using 103 online documents as inputs, the algorithm could automatically produce more than 16 thousand valid gap-fill MCQs covering a variety of topics in the software testing domain. Finally, in the discussion section of this paper we suggest how the proposed algorithm should be applied to produce gap-fill MCQs being collected in a question pool used by a knowledge expert system

    Generating Vocabulary Sets for Implicit Language Learning using Masked Language Modeling

    Get PDF
    A well-balanced language curriculum must include both explicit vocabulary learning and implicit vocabulary learning. However, most language learning applications focus on explicit instruction. Students require support with implicit vocabulary learning because they need enough context to guess and acquire new words. Traditional techniques aim to teach students enough vocabulary to comprehend the text, thus enabling them to acquire new words. Despite the wide variety of support for vocabulary learning offered by learning applications today, few offer guidance on how to select an optimal vocabulary study set. This paper proposes a novel method of student modeling with masked language modeling to detect words that are required for comprehension of a text. It explores the efficacy of using deep learning via a pre-trained masked language model to model human reading comprehension and presents a vocabulary study set generation pipeline (VSGP). Promising results show that masked language modeling can be used to model human comprehension and the pipeline produces reasonably sized vocabulary study sets that can be integrated into language learning systems

    AGReE: A system for generating Automated Grammar Reading Exercises

    Full text link
    We describe the AGReE system, which takes user-submitted passages as input and automatically generates grammar practice exercises that can be completed while reading. Multiple-choice practice items are generated for a variety of different grammar constructs: punctuation, articles, conjunctions, pronouns, prepositions, verbs, and nouns. We also conducted a large-scale human evaluation with around 4,500 multiple-choice practice items. We notice for 95% of items, a majority of raters out of five were able to identify the correct answer and for 85% of cases, raters agree that there is only one correct answer among the choices. Finally, the error analysis shows that raters made the most mistakes for punctuation and conjunctions.Comment: Accepted to EMNLP 2022 Demonstration Trac

    Revita: a Language-learning Platform at the Intersection of ITS and CALL

    Get PDF
    This paper presents Revita, a Web-based platform for language learning—beyond the beginner level. We anchor the presentation in a survey, where we review the literature about recent advances in the fields of computer-aided language learning (CALL) and intelligent tutoring systems (ITS). We outline the established desiderata of CALL and ITS and discuss how Revita addresses (the majority of) the theoretical requirements of CALL and ITS. Finally, we claim that, to the best of our knowledge, Revita is currently the only platform for learning/tutoring beyond the beginner level, that is functional, freely-available and supports multiple languages.Peer reviewe

    Potential of EPUB3 for Digital Textbooks in Higher Education

    Get PDF
    The e-book market is currently in a strong upswing. This research study deals with the question which practical uses the e-book format EPUB3 offers for (higher) education. By means of a didactic content analysis, a range of interactive exercise types were developed as a result of conversations with teachers. For this purpose, a didactic and technical concept has been developed. Different kinds of exercises were prototypically implemented in an e-book. Finally, a brief overview reflects the present state of the current e-book readers. A subsequent discussion illuminates the strengths and weaknesses of the format. In summary, it can be remarked that EPUB3 is suitable for a variety of different exercises and that it is able to serve as a basic format for forthcoming digital textbooks. Furthermore the openness of EPUB3 will assist Open Learning and Teaching in a meaningful way.Comment: Conference, 13 page
    corecore