    The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants

    We present Belebele, a multiple-choice machine reading comprehension (MRC) dataset spanning 122 language variants. Significantly expanding the language coverage of natural language understanding (NLU) benchmarks, this dataset enables the evaluation of text models in high-, medium-, and low-resource languages. Each question is based on a short passage from the Flores-200 dataset and has four multiple-choice answers. The questions were carefully curated to discriminate between models with different levels of general language comprehension. The English dataset on its own proves difficult enough to challenge state-of-the-art language models. Being fully parallel, this dataset enables direct comparison of model performance across all languages. We use this dataset to evaluate the capabilities of multilingual masked language models (MLMs) and large language models (LLMs). We present extensive results and find that despite significant cross-lingual transfer in English-centric LLMs, much smaller MLMs pretrained on balanced multilingual data still understand far more languages. We also observe that larger vocabulary size and conscious vocabulary construction correlate with better performance on low-resource languages. Overall, Belebele opens up new avenues for evaluating and analyzing the multilingual capabilities of NLP systems. Comment: 27 pages, 13 figures

    Sites of Scottish heritage in translation : representing memory, history and culture for the French speaking visitor

    Scotland boasts an extensive variety of castles, museums and other heritage sites which attract large numbers of domestic and international visitors each year. Such sites contribute significantly to the circulation of knowledge across linguistic and cultural borders and, in this context, interpretation (in the form of labels, wall panels, audio-guides, etc.) and interlingual translation play an essential role in ensuring that both domestic and international visitors can access and understand the past. This thesis is formed around a multiple case study carried out in six Scottish heritage sites. Focusing specifically on translations from English into French, the primary aim of this research project is to gain a better understanding of how translation choices can influence the experiences of French-speaking visitors and their overall perception of Scottish heritage. A secondary aim of the project is to get a better sense of the considerations which might drive or hinder translation provision for heritage bodies. This thesis thus explores heritage translation from three perspectives: (i) translation as a process, looking at the conditions under which translations are commissioned and produced; (ii) translation as a product, using Halliday’s model of Systemic Functional Linguistics (SFL) to identify translation shifts between source and target texts; and (iii) translation reception to discern whether and how language provision and translation shifts might impact the experience of visitors and their representation of Scottish cultural and historical heritage. Together, these three strands make it possible to pinpoint areas where translation is already well-utilised and well received and those where it can be improved, thus allowing the formulation of recommendations for best practice.

    Cultural influences in the theory of mind


    Investigating the translation of metaphors used in diagnosis and treatment in Chinese medicine classics Neijing and Shanghan Lun

    The language used in Traditional Chinese Medicine (TCM) depicts a world of human physiology, pathology, diagnosis and treatment, in which metaphors serve as an essential vehicle for readers to understand fundamental but often abstract concepts in TCM. While previous work has investigated strategies for translating the TCM classics, the metaphors used to describe diagnosis and treatment and their English translations are critical in understanding TCM, and require a more systematic exploration. This study investigates the diagnosis- and treatment-related metaphors selected from two TCM classics, Neijing and Shanghan Lun, and their English renditions by translators from different professional backgrounds. The thesis also focuses on the analysis of the effectiveness of different translation strategies in delivering pertinent health-related information conveyed by the metaphors of the original texts. A multidimensional framework that combines a conceptual approach with linguistic and cultural elements was established to capture the complexity of the metaphors, particularly from the perspective of translation. The linguistic metaphors in this study were first identified from a purpose-built corpus using a CMT-based metaphor identification procedure adapted from Steen (2010). Following the conceptual metaphor inference procedure developed by Steen (2011), various conceptual metaphors were inferred from the linguistic metaphors. Corresponding English translations were also collected to investigate which translation strategies have been used and which strategy can most effectively deliver the health-related information conveyed by the metaphors. 
Four main strategies were employed in the English translations: 1) equivalent mapping, by which the source domain is retained; 2) using a simile to translate a metaphor; 3) direct narrative equivalence, which abandons the metaphor and narrates the medical knowledge directly; and 4) complemented equivalent translation, whereby the metaphor is explained with additional content. From the perspective of conveying health-related knowledge, equivalent mapping was effective for metaphors universally understood by Chinese and English readers. For culturally specific metaphors, especially when the metaphor relates to an important TCM concept, complemented equivalent translation, which can reconfigure the cognitive context for the reader, was most suitable. For metaphors not related to important concepts, direct narrative equivalence was found to be effective

    Addressing the grammar needs of Chinese EAP students: an account of a CALL materials development project

    This study investigated the grammar needs of Chinese EAP Foundation students and developed electronic self-access grammar materials for them. The research process consisted of three phases. In the first phase, a corpus-linguistics-based error analysis (EA) was conducted, in which 50 student essays were compiled and scrutinized for formal errors. A tagging system was specially devised and employed in the analysis. The EA results, together with an examination of Foundation tutors’ perceptions of error frequency and gravity, led me to prioritise article errors for treatment. In the second phase, remedial materials were drafted based on the EA results and insights drawn from my investigations into four research areas (article pedagogy, SLA theory, grammar teaching approaches and CALL methodologies) and existing grammar materials. In the third phase, the materials were refined and evaluated for their effectiveness as a means of improving the Chinese Foundation students’ use of the article. Findings confirm the claim that L2 learner errors are systematic in nature and lend support to the value of Error Analysis. L1 transfer appears to be one of the main contributing factors in L2 errors. The salient errors identified in the Chinese Foundation corpus show that mismanagement of the article system is the most frequent cause of grammatical errors; Foundation tutors, however, perceive article errors to be neither frequent nor serious. An examination of existing materials reveals that the article is given low priority in ELT textbooks and treatments provided in pedagogical grammar books are inappropriate in terms of presentation, language and exercise types. The devised remedial materials employ both consciousness-raising activities and production exercises, using EAP language and authentic learner errors. 
Preliminary evaluation results suggest that the EA-informed customised materials have the potential to help learners to perform better in proofreading article errors in academic texts.

    A Pedagogy of Witnessing: Linguistic and Visual Frames of the Dark Side in the Multimodal Classroom

    “A Pedagogy of Witnessing: Linguistic and Visual Frames of the Dark Side in the Multimodal Classroom,” focuses on the theoretical and practical benefits of implementing written, oral, and visual testimonies from traumatic history as a tool for teaching the importance of empathetic and ethical composition practices. Specifically, this dissertation provides resource material for a critical pedagogical model that supports “responsible witnessing” through short writing assignments and a final research project that analyze selected narratives, historical accounts, images, and films spanning World War II and the Vietnam War to more recent global events. My hope is that my work will be of interest to teachers of composition and communication and students who wish to bring approaches to understanding and responding to human and nonhuman suffering as well as social injustice into the classroom

    The effectiveness of reciprocal teaching of reading strategies on ESL students' writing enhancement

    Since not much consideration has been given, by students and English teachers alike, towards the critical role that reading plays in ESL students' writing enhancement, this study investigated the effect of Reciprocal Teaching Strategies inclusive of predicting, questioning, clarifying and summarizing on ESL students' writing improvement (Ghorbani et al., 2013). It also addresses a need to collect the students' and their English teacher's perceptions (Fisher & Frey, 2007; Stricklin, 2011) towards the use of the four strategies in reciprocal teaching. The researcher of this study selected a mixed quantitative/qualitative research design for data analysis. A total of 104 Malaysian secondary students participated in this study. The instruments used to collect data for this study are the CLAWS tagger, AntConc, interviews and NVivo 10. The outcomes of this study, both quantitatively and qualitatively, revealed the positive impact of reciprocal teaching strategies on participants' writing skills after the intervention. This study has made significant contributions by introducing a new method for boosting ESL students' writing skills; utilizing a computer-based tool for analysing students' classroom writing assignments; creating a new style for essay writing; and providing educational insights to the Ministry of Education, which has always been concerned about Malaysian students' English proficiency.

    A local model of writing program assessment : fourteen community college faculty define and evaluate writing proficiency.

    The introduction to this doctoral dissertation is an argument for locating Writing Across the Curriculum programs on the community-college campus for several reasons, among them the proximity of the disciplines on the community college campus, the increasingly underprepared community college student, and movements toward accountability and assessment at the local and state levels. As an example of what a WAC program may accomplish in the area of program assessment, which developed from WAC proper in the last decade of the last century, Chapters One, Two and Three present data I collected from fourteen faculty volunteers who gave up a beautiful Saturday in May of 1995 to read and evaluate a set of randomly selected student essays. Chapter One summarizes faculty responses to a ten-minute freewriting exercise, in which I asked respondents to describe or define proficient writing from the perspectives of their disciplines. In their responses, I locate four global characteristics used by a simple majority of respondents and 21 other characteristics used by at least one respondent. I argue that these characteristics, especially the global ones, constitute our College's local definition of proficiency. I close the chapter pointing out that future WAC workshops could include discussions of global and other characteristics locating them in student work, and discussing how to teach them, both in writing classes and elsewhere. Although the data in Chapter One are incomplete, they provide a starting place for a teacher-researcher who is interested in how colleagues across the campus describe writing. They also prompt questions about whether the respondents know what they are saying when they use terms like style, purpose, grammar, and audience. Do they really look for the characteristics they claimed to look for in their freewritings? Are there other characteristics to be added to the list? Chapters Two and Three report and interpret additional data from the workshop. 
Each faculty member read and evaluated end of semester ENG 102 papers, rating them NP (nonproficient), P (proficient), or HP (highly proficient). These chapters are based on an unpublished study Dr. Tom Blues created at the University of Kentucky in May of 1993. Blues was ahead of his time by several years. In 1996, the Southern Association of Colleges and Schools (SACS) mandated an exit-exam for all students in ENG 102 and ENG 105 at Jefferson Community College. I show that a qualitative program assessment could complement or eventually replace the quantitative outside evaluation we are now using and conclude that in 1995 faculty in areas other than English often confused terms associated with writing, but generally returned to their freewriting definitions and descriptions throughout their evaluations. Chapter Four summarizes my conclusions and recommendations, discusses the benefits of local, constructivist assessments in a culture that increasingly truncates and supplants genuine, holistic writing and undermines progress (Shafer 242). The chapter ends with practical recommendations mostly for my colleagues in the Writing Program at Jefferson Community College. Where do we go from here? That sort of thing