    Citation practice in the whole TESOL master’s theses by Vietnamese postgraduates

    Citing previous works is an important rhetorical feature of academic writing and it is challenging for novice writers, especially non-native English writers (NNEWs). However, little is known about how NNEWs cite in each chapter of their master’s (M.A.) theses. This paper thus reports on the citation practice in 24 TESOL M.A. theses written by Vietnamese students. Citation types were first searched on the Antconc software with the use of the Regular Expressions (Regex) written for both conventional and ‘invented’ citing ways by this group of writers, and then based on Thompson and Tribble’s (2001) framework, citation functions were investigated and classified. Semi-structured interviews were also conducted with thesis writers and thesis supervisors. Besides the general citation practice by this group of NNEWs, and the different citation functions and types in different chapters of their theses, the study also found that these writers were not fully aware of the significance of citations as a rhetorical device in their thesis writing, and insufficient attention was paid to the in-text citations in the TESOL discourse community in Vietnam. These findings suggest explicit instructions on citations in order to help novice writers to fully acquire the citation use

    The Effect of Thesis Writing on Paraphrasing Ability of the EFL Alumni of the University of Mataram Lombok

    Until recently, no study had analyzed the effect of a thesis writing program on the paraphrasing ability of alumni. Earlier studies generally examined the reverse direction, that is, the effect of paraphrasing ability on thesis writing. This is the novelty of the present study. The study aimed to test the effect of a thesis writing program, taken at the end of EFL study, on the paraphrasing ability of alumni of an EFL education program, to identify the types of paraphrasing they use, and to explore weaknesses in paraphrasing and causes of not paraphrasing. This evaluative ex-post facto research employed mixed methods. The participants were 68 alumni of the University of Mataram, Indonesia: those who undertook the thesis writing program during their EFL education and others who did not write an undergraduate thesis. They were selected purposively from 37 schools in West Nusa Tenggara province. Data were collected through writing tasks, a questionnaire, interviews, and recordings, and were analyzed quantitatively and qualitatively. The results show that: 1) the level of the alumni’s paraphrasing ability is ‘medium’; 2) the thesis writing program affects the paraphrasing ability of the EFL alumni; 3) Synonym and Change of Word Order are the dominant techniques; 4) the teachers’ weaknesses involve lack of vocabulary, limited conversions, deviation from the authentic ideas, summarizing, and unclear paraphrasing; 5) the causes of not paraphrasing include limited knowledge of paraphrasing and limited grammatical understanding. It is suggested that teacher education institutions implement curricula that support teachers’ writing skills. In turn, plagiarism could be minimized, leading to higher-quality academic writing by teachers

    MEGA: Multilingual Evaluation of Generative AI

    Generative AI models have shown impressive performance on many Natural Language Processing tasks such as language understanding, reasoning, and language generation. An important question being asked by the AI community today is about the capabilities and limits of these models, and it is clear that evaluating generative AI is very challenging. Most studies on generative LLMs have been restricted to English and it is unclear how capable these models are at understanding and generating text in other languages. We present the first comprehensive benchmarking of generative LLMs - MEGA, which evaluates models on standard NLP benchmarks, covering 16 NLP datasets across 70 typologically diverse languages. We compare the performance of generative LLMs including Chat-GPT and GPT-4 to State of the Art (SOTA) non-autoregressive models on these tasks to determine how well generative models perform compared to the previous generation of LLMs. We present a thorough analysis of the performance of models across languages and tasks and discuss challenges in improving the performance of generative LLMs on low-resource languages. We create a framework for evaluating generative LLMs in the multilingual setting and provide directions for future progress in the field. Comment: EMNLP 202

    “All These Nouns Together Just Don’t Make Sense!”: An Investigation of EAP Students’ Challenges with Complex Noun Phrases in First-Year College-Level Textbooks

    Complex noun phrases (CNP) are a major vehicle of academic written discourse (Halliday, 1988; 2004). However, in spite of the view that they pose significant challenges to English language learners, they are often overlooked in preparatory English for Academic Purposes (EAP) programs. This mixed methods study aims to investigate to what extent CNP present syntactic parsing challenges for upper-level college EAP students, and whether there is a perceived need for direct instruction in CNP in EAP programs. A special CNP proficiency test was administered to 70 upper-level Ontario college EAP students and a native speaker comparator group, and the results were compared with those obtained from interviews with seven of the test-takers. The results obtained from the statistical analyses and the interviews indicate that CNP are challenging to parse for upper-level EAP students and that direct instruction in CNP may be beneficial for improving their reading comprehension. Some teaching implications of the findings are also addressed.

    NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource Languages through Data Enrichment

    In recent years, natural language processing has gained significant popularity in various sectors, including the legal domain. This paper presents NeCo Team's solutions to the Vietnamese text processing tasks provided in the Automated Legal Question Answering Competition 2023 (ALQAC 2023), focusing on legal domain knowledge acquisition for low-resource languages through data enrichment. Our methods for the legal document retrieval task employ a combination of similarity ranking and deep learning models, while for the second task, which requires extracting an answer from a relevant legal article in response to a question, we propose a range of adaptive techniques to handle different question types. Our approaches achieve outstanding results on both tasks of the competition, demonstrating the potential benefits and effectiveness of question answering systems in the legal field, particularly for low-resource languages. Comment: ISAILD@KSE 202

    What are Automated Paraphrasing Tools and how do we address them? A review of a growing threat to academic integrity

    This article reviews the literature surrounding the growing use of Automated Paraphrasing Tools (APTs) as a threat to educational integrity. In academia there is a technological arms-race occurring between the development of tools and techniques which facilitate violations of the principles of educational integrity, including text-based plagiarism, and methods for identifying such behaviors. APTs are part of this race, as they are a rapidly developing technology which can help writers transform words, phrases, and entire sentences and paragraphs at the click of a button. This article seeks to review the literature surrounding the history of APT use and the current understanding of APTs placed in the broader context of the educational integrity-technology arms race

    On the Cross-lingual Transferability of Monolingual Representations

    State-of-the-art unsupervised multilingual models (e.g., multilingual BERT) have been shown to generalize in a zero-shot cross-lingual setting. This generalization ability has been attributed to the use of a shared subword vocabulary and joint training across multiple languages giving rise to deep multilingual abstractions. We evaluate this hypothesis by designing an alternative approach that transfers a monolingual model to new languages at the lexical level. More concretely, we first train a transformer-based masked language model on one language, and transfer it to a new language by learning a new embedding matrix with the same masked language modeling objective, freezing parameters of all other layers. This approach does not rely on a shared vocabulary or joint training. However, we show that it is competitive with multilingual BERT on standard cross-lingual classification benchmarks and on a new Cross-lingual Question Answering Dataset (XQuAD). Our results contradict common beliefs of the basis of the generalization ability of multilingual models and suggest that deep monolingual models learn some abstractions that generalize across languages. We also release XQuAD as a more comprehensive cross-lingual benchmark, which comprises 240 paragraphs and 1190 question-answer pairs from SQuAD v1.1 translated into ten languages by professional translators. Comment: ACL 202