4,627 research outputs found

    Clasificación de errores gramaticales colocacionales en textos de estudiantes de español [Classification of grammatical collocation errors in texts by learners of Spanish]

    Arbitrary recurrent word combinations (collocations) are key to language learning. However, even advanced students have difficulty using them. Efficient collocation aiding tools would be of great help, yet existing “collocation checkers” still struggle to offer corrections to miscollocations. They attempt to correct without distinguishing between the different types of errors and, as a consequence, provide heterogeneous lists of collocations as suggestions. Besides, they focus solely on lexical errors, leaving aside grammatical ones. The former attract more attention, but the latter cannot be ignored if the goal is to develop a comprehensive collocation aiding tool able to correct all kinds of miscollocations. We propose an approach to automatically classify grammatical collocation errors made by US learners of Spanish as a starting point for the design of specific correction strategies targeted at each type of error.
This work has been funded by the Spanish Ministry of Science and Competitiveness (MINECO), through a predoctoral grant with reference BES-2012-057036, in the framework of the project HARenES, under the contract number FFI2011-30219-C02-02.

    A corpus-based study of Malaysian ESL students' use of discourse connectors in upper and post-secondary argumentative writing

    Discourse connectors (DCs) are one type of cohesive device that brings cohesion to a piece of writing or speech. They are potentially useful means for writers, particularly in ESL and EFL writing pedagogic settings. The usefulness of DCs is two-pronged. First, they help and guide readers through the text; second, they are tools for writers to engage with their readers. It has been well documented that appropriate and efficient use of DCs creates a coherent flow of text. However, second/foreign language learners have difficulty using them efficiently and systematically in their writing. The literature shows that Malaysian ESL students also suffer from improper and inefficient use of DCs, which leads them to fail to produce a cohesive text. Surprisingly, no study was found in the Malaysian context that investigated Malaysian ESL students' understanding and use of DCs. Hence, this study attempted to investigate and understand the nature and use of DCs in Malaysian student writing compared with native speakers' writing. The study also set out to examine the correlation between the frequency of DC use and the quality of writing. The final goal of this study was to find out to what extent Malaysian ESL students commit errors while using DCs. A corpus-based approach was adopted to meet the objectives of the study. To this end, an argumentative topic was assigned to Form 4 and Form 5 (upper-secondary) students and first-year college (post-secondary) students, who were asked to write about the given topic in the classroom and submit their work to the instructors. They were required to write 250 words within 45 minutes. Upon compilation of the essays, the Malaysian Corpus of Students' Argumentative Writing (MCSAW) was built, with approximately 600,000 tokens.
To compare Malaysian ESL students' use of DCs with that of native English speakers, the Louvain Corpus of Native English Essays (LOCNESS) was used. Oxford WordSmith Tools (version 5) was employed to extract data from the corpora for analysis, using its frequency count and concordance functions. To identify which types of DCs Malaysian ESL students use, the Discourse Connector List developed by Rezvani Kalajahi and Neufeld (2014) was used. To examine the relationship between the quality of writing and the frequency of DC use, the ESL Composition Profile of Jacobs et al. (1981) was utilized. Finally, a framework for identifying DC error types was developed by the researcher to explore the errors that students commit while using DCs. The findings fall into three parts. First, Malaysian students tended to use DCs more frequently than native students. The difference in overall frequency of DC use between Malaysian and native students was statistically significant at p < .05. However, the native students used a greater variety of DC types than the Malaysian students (398 vs. 328). It was also found that Malaysian students used DCs frequently in some categories and infrequently in others. Based on the native students' writing (LOCNESS), a list of the most frequent DCs in written English was offered. Second, there was a very weak, statistically non-significant negative correlation between writing quality and the frequency of DC use in the writing of Malaysian ESL students. Finally, the qualitative analysis revealed that erroneous use of DCs by Malaysian ESL student writers manifested mainly in eight categories. The students had problems involving semantic, syntactic, stylistic, positional and mechanical errors. They also showed a tendency towards unnecessary addition, omission, and redundant repetition of DCs.
In conclusion, this study demonstrated that Malaysian ESL students' use of DCs is still at an evolving level. It is vitally important that the accurate use of DCs in writing among Malaysian students be further emphasized in classrooms through the use of concordance lines and explicit instruction. In addition, materials developers may take the outcome of this research into consideration and find ways to distribute and introduce DCs systematically across educational levels.

    Unravelling Interlanguage Facts via Explainable Machine Learning

    Native language identification (NLI) is the task of training (via supervised machine learning) a classifier that guesses the native language of the author of a text. This task has been extensively researched in the last decade, and the performance of NLI systems has steadily improved over the years. We focus on a different facet of the NLI task, i.e., that of analysing the internals of an NLI classifier trained by an explainable machine learning algorithm, in order to obtain explanations of its classification decisions, with the ultimate goal of gaining insight into which linguistic phenomena “give a speaker's native language away”. We use this perspective in order to tackle both NLI and a (much less researched) companion task, i.e., guessing whether a text has been written by a native or a non-native speaker. Using three datasets of different provenance (two datasets of English learners' essays and a dataset of social media posts), we investigate which kind of linguistic traits (lexical, morphological, syntactic, and statistical) are most effective for solving our two tasks, namely, are most indicative of a speaker's L1. We also present two case studies, one on Spanish and one on Italian learners of English, in which we analyse individual linguistic traits that the classifiers have singled out as most important for spotting these L1s. Overall, our study shows that the use of explainable machine learning can be a valuable tool for t

    Corrective Feedback in the EFL Classroom: Grammar Checker vs. Teacher’s Feedback.

    The aim of this doctoral thesis is to compare the feedback provided by the teacher with that obtained from the software called Grammar Checker on grammatical errors in the written production of students of English as a foreign language. Traditionally, feedback has been considered one of the three theoretical conditions for language learning (along with input and output) and, for this reason, extensive research has been carried out on who should provide it, when, and at what level of explicitness. However, far fewer studies analyse the use of e-feedback programs as a complement or alternative to the feedback offered by the teacher. Participants in our study were divided into two experimental groups and one control group, and three grammatical aspects that are usually susceptible to error among students of English at B2 level were examined: prepositions, articles, and the simple past-present/past perfect dichotomy. All participants had to write four essays. The first experimental group received feedback from the teacher and the second received it through the Grammar Checker program. The control group did not get feedback on the grammatical aspects under analysis but on other linguistic forms not studied. First, the results indicate that the software did not mark grammatical errors in some cases. This means that students were unable to improve their written output in terms of linguistic accuracy after receiving feedback from the program. In contrast, students who received feedback from the teacher did improve, although the difference was not significant. Second, the two experimental groups outperformed the control group in the use of the grammatical forms under analysis. Third, regardless of the feedback offered, the two groups showed improvement in the use of the grammatical aspects in the long term. Finally, no differences in attitude towards the feedback received and its impact on the results were found in either of the experimental groups.
Our results open up new lines for investigating corrective feedback in the English as a foreign language classroom. On the one hand, more studies are needed to help improve electronic feedback programs by making them more accurate and effective in detecting errors. On the other hand, software such as Grammar Checker can complement the daily practice of the foreign language teacher, helping in the first instance to correct common and recurring mistakes, all the more so since our research has shown that attitudes towards this type of electronic feedback are positive and that it does not imply an intrusion into the classroom, thus helping in the acquisition of the English language.
Programa de Doctorat en Llengües Aplicades, Literatura i Traducci

    L1 Influence on the Acquisition Order of English Grammatical Morphemes

    We revisit morpheme studies to evaluate the long-standing claim for a universal order of acquisition. We investigate the L2 acquisition order of six English grammatical morphemes by learners from seven L1 groups across five proficiency levels. Data are drawn from approximately 10,000 written exam scripts from the Cambridge Learner Corpus. The study establishes clear L1 influence on the absolute accuracy of morphemes and their acquisition order, therefore challenging the widely held view that there is a universal order of acquisition of L2 morphemes. Moreover, we find that L1 influence is morpheme specific, with morphemes encoding language-specific concepts most vulnerable to L1 influence.
EF Education First Research. This is the author accepted manuscript. The final version is available from Cambridge University Press via http://dx.doi.org/10.1017/S027226311500035

    An Analysis of Language Frequency and Error Correction for Esperanto

    Current Grammar Error Correction (GEC) initiatives tend to focus on major languages, with less attention given to low-resource languages like Esperanto. In this article, we begin to bridge this gap by first conducting a comprehensive frequency analysis using the Eo-GP dataset, created explicitly for this purpose. We then introduce the Eo-GEC dataset, derived from authentic user cases and annotated with fine-grained linguistic details for error identification. Leveraging GPT-3.5 and GPT-4, our experiments show that GPT-4 outperforms GPT-3.5 in both automated and human evaluations, highlighting its efficacy in addressing Esperanto's grammatical peculiarities and illustrating the potential of advanced language models to enhance GEC strategies for less commonly studied languages

    WRITTEN CORRECTIVE FEEDBACK: EFFECTS OF FOCUSED AND UNFOCUSED GRAMMAR CORRECTION ON THE CASE ACQUISITION IN L2 GERMAN

    Thirty-three students of fourth-semester German at the University of Kansas participated in this study, which sought to investigate whether focused written corrective feedback (WCF) promoted the acquisition of German case morphology over the course of a semester. Participants received teacher WCF on five two-draft essay assignments under three treatment conditions: group (1) received focused WCF on German case errors; group (2) received unfocused WCF on a variety of German grammar errors; and group (3) did not receive WCF on specific grammar errors. Combining quantitative and qualitative analyses, the study found that the focused group improved significantly in the accuracy of case forms, while the unfocused and control groups did not make any apparent progress. The results indicate that focused WCF was effective in improving case accuracy in subjects' writing in a German as a foreign language (GFL) context. WCF did not negatively affect writing fluency or students' attitude toward writing.

    Grammatical Error Correction: A Survey of the State of the Art

    Grammatical Error Correction (GEC) is the task of automatically detecting and correcting errors in text. The task not only includes the correction of grammatical errors, such as missing prepositions and mismatched subject-verb agreement, but also orthographic and semantic errors, such as misspellings and word choice errors respectively. The field has seen significant progress in the last decade, motivated in part by a series of five shared tasks, which drove the development of rule-based methods, statistical classifiers, statistical machine translation, and finally neural machine translation systems, which represent the current dominant state of the art. In this survey paper, we condense the field into a single article and first outline some of the linguistic challenges of the task, introduce the most popular datasets that are available to researchers (for both English and other languages), and summarise the various methods and techniques that have been developed, with a particular focus on artificial error generation. We next describe the many different approaches to evaluation as well as concerns surrounding metric reliability, especially in relation to subjective human judgements, before concluding with an overview of recent progress and suggestions for future work and remaining challenges. We hope that this survey will serve as a comprehensive resource for researchers who are new to the field or who want to be kept apprised of recent developments.

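    한국 대학생들의 논증적 에세이에 나타난 절과 구 복잡성의 발달 [The development of clausal and phrasal complexity in Korean college students' argumentative essays]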

    Thesis (Master's) -- Seoul National University Graduate School: College of Education, Department of Foreign Language Education (English Major), February 2023. 오선영.
Studies that explore L2 writing development identify grammatical complexity as a primary discriminator for different proficiency levels of L2 writers. In the 1990s, grammatical complexity in L2 writing was often measured by clausal complexity, but the kind of complexity that has recently received particular attention is phrasal complexity. Such a move follows the recognition that clausal complexity represents the complexity of conversation and beginning levels of writing development, whereas phrasal complexity, specifically noun phrase complexity, represents the complexity of academic writing and advanced developmental levels. Some L2 writing studies, however, have yielded conflicting results, showing that phrasal features as noun modifiers have little predictive power for writing quality. One possible reason underlying these inconsistent results might be that most studies in this area have used corpus data from learners of heterogeneous L1 backgrounds with no consideration for the significant effect of L1 on the use of complexity features in L2 writing. Thus, this study analyzed essay samples produced only by L1 Korean writers to investigate whether clausal and phrasal complexity is associated with L2 writing proficiency and, if so, what developmental patterns can be observed based on complexity features that contribute substantially to the association.
A follow-up qualitative analysis of student writing provided a detailed description of proficiency-level differences, especially with respect to lexical realizations and error types associated with specific complexity features. The corpus used in the present study contained 234 argumentative essays written by first-year college students, including 78 low-rated essays (A1 and A1+ levels of the CEFR), 78 mid-rated essays (B1 and B1+ levels of the CEFR), and 78 high-rated essays (B2+, C1, and C2 levels of the CEFR). Drawing on Biber et al.'s (2011) developmental index, the nine clausal and eight phrasal complexity features were extracted from the tagged corpus using regular expressions to measure the frequency of each feature. The result of a Pearson chi-square test demonstrated a statistically significant association between the three proficiency levels and the use of clausal and phrasal complexity features. The post-hoc residual analysis revealed five complexity features that contributed strongly to the association: finite adverbial clause, noun complement clause, WH relative clause, prepositional phrase (of), and prepositional phrase (other). Especially noteworthy is the finding that the main source of complexity at each proficiency level agrees with its corresponding developmental stage reported by Biber et al. (2011), and thus developmental patterns for Korean college students are successfully explained by two parameters: (1) structural form (finite dependent clauses vs. dependent phrases) and (2) syntactic function (clause constituents vs. noun phrase constituents).
Specifically, the development proceeds from (i) clausal complexity mainly via finite adverbial clauses (i.e., finite dependent clauses functioning as clause constituents); through (ii) the intermediate stage of heavy reliance on WH relative clauses (i.e., finite clause types functioning as noun phrase constituents); to finally (iii) phrasal complexity primarily via prepositional phrases (of) (i.e., phrasal structures functioning as noun phrase constituents). Surprisingly, premodifying adjectives and nouns were found to have no significant association with L2 writing proficiency despite being noun-modifying phrasal features. The subsequent qualitative analysis of student writing, however, illustrated greater proficiency of the highly rated essays in using these features in two regards. First, the lower-rated essays drew much more heavily on adjective-noun sequences presented in writing prompts than the higher-rated essays. Second, the number of errors in the composition of noun-noun sequences noticeably decreased in the higher-rated essays. The qualitative observation concerning that-complement clauses, on the other hand, identified a reliance on a limited set of controlling nouns and on conversational controlling verbs in student writing across proficiency levels. Three main pedagogical implications are provided based on the findings: (i) the use of empirically derived developmental stages to create detailed rating scale descriptors and provide more customized writing courses on the use of complexity features; (ii) the need for classroom instruction on common academic controlling nouns and verbs used in that-complement clauses, given the importance of academically oriented lexical realizations of grammatical structures; and (iii) the need to address recurrent errors, particularly in terms of using premodifying nouns and relative clauses.
CHAPTER 1. INTRODUCTION: 1.1 Background of the Study; 1.2 Purpose of the Study; 1.3 Research Questions; 1.4 Organization of the Thesis
CHAPTER 2. LITERATURE REVIEW: 2.1 Grammatical Complexity in L2 Writing; 2.1.1 Definition of Grammatical Complexity; 2.1.2 Grammatical Complexity in L2 Writing Studies; 2.2 Criticism of Traditional Measures of Grammatical Complexity; 2.2.1 Reductiveness and Redundancy of Length- and Subordination-based Measures; 2.2.2 Inappropriateness of the T-unit Approach to the Assessment of Writing Development; 2.3 Measures of Grammatical Complexity in L2 Writing; 2.3.1 Clausal and Phrasal Complexity in Relation to L2 Writing Development; 2.3.2 Studies on Clausal and Phrasal Complexity in L2 Writing; 2.4 Variation in the Use of Grammatical Complexity Features; 2.4.1 The Effect of L1 Background; 2.4.2 The Effect of Genre; 2.4.3 The Effect of Timing Condition
CHAPTER 3. METHODOLOGY: 3.1 Learner Corpus; 3.1.1 Description of YELC 2011; 3.1.2 Description of a Subset of YELC 2011 used in the Study; 3.2 Grammatical Complexity Measures; 3.3 Corpus Tagging and Automatic Extraction; 3.4 Data Analysis
CHAPTER 4. RESULTS AND DISCUSSION: 4.1 Descriptive Statistics; 4.2 The Association between L2 Writing Proficiency and Grammatical Complexity; 4.3 The Developmental Patterns of Grammatical Complexity; 4.4 The Grammatical Complexity Features with Great Contribution to the Association; 4.4.1 Finite Adverbial Clauses; 4.4.2 Prepositional Phrases as Nominal Postmodifiers; 4.4.3 WH Relative Clauses; 4.4.4 Finite Complement Clauses Controlled by Nouns; 4.5 The Grammatical Complexity Features with Little Contribution to the Association; 4.5.1 Premodifying Adjectives; 4.5.2 Nouns as Nominal Premodifiers; 4.5.3 Finite Complement Clauses Controlled by Verbs or Adjectives
CHAPTER 5. CONCLUSION: 5.1 Major Findings; 5.2 Pedagogical Implications; 5.3 Limitations and Prospect for Future Research
REFERENCES; APPENDICES; ABSTRACT IN KOREAN