16 research outputs found

    Review of Corpus-Based Contrastive Studies of English and Chinese

    Investigating Non-Sentential Utterances in a Spoken Chinese Corpus

    This paper describes a preliminary investigation into Chinese non-sentential utterances (NSUs) in a corpus of spoken Mandarin. It presents, with examples, a corpus-based taxonomy of Chinese NSUs. This taxonomy builds on the one by Fernández and Ginzburg for English NSUs in the British National Corpus (BNC) [1]. Partly due to the distinctiveness of spoken Chinese, eight new classes are added, and the reasons for adding them are explained. The paper concludes with a discussion of future work.

    A corpus based study of formulaic language use by native and non-native speakers

    Although language makes use of formulaic patterns, knowing and using these patterns can prove quite difficult for non-native speakers of English. Since knowledge of formulaic language can improve both learners’ comprehension and their production of the language, it may be important for learners to familiarize themselves with these patterns. The aims of this thesis are to analyze whether non-native speakers use formulaic language more in their writing or in their speech, and to compare formulaic language use in native and non-native spoken language. To this end, a corpus-based analysis was conducted using the Tartu Corpus of Estonian Learner English (TCELE) as the written corpus, the Louvain International Database of Spoken English Interlanguage (LINDSEI-EST) as the non-native spoken corpus, and the Michigan Corpus of Academic Spoken English (MICASE) as the native spoken corpus. https://www.ester.ee/record=b5375614*es

    Mind the source data! : Translation equivalents and translation stimuli from parallel corpora

    Statements like ‘Word X of language A is translated with word Y of language B’ are incorrect, although they are quite common: words cannot be translated, as translation takes place on the level of sentences or higher. A better term for the correspondence between lexical items of source texts and their matches in target texts would be translation equivalence (Teq). In addition to Teq, there exists a reverse relation, translation stimulation (Tst), which is a correspondence between the lexical items of target texts and their matches (=stimuli) in source texts. Translation equivalents and translation stimuli must be studied separately and based on natural direct translations. It is not advisable to use pseudo-parallel texts, i.e. aligned pairs of translations from a ‘hub’ language, because such data do not reflect real translation processes. Both Teq and Tst are lexical functions, and they are not applicable to function words like prepositions, conjunctions, or particles, although it is technically possible to find Teq and Tst candidates for such words as well. The process of choosing function words when translating does not proceed in the same way as choosing lexical units: first, a relevant construction is chosen, and next, it is filled with relevant function words. In this chapter, the difference between Teq and Tst will be shown in examples from Russian–Finnish and Finnish–Russian parallel corpora. The use of Teq and Tst for translation studies and contrastive semantic research will be discussed, along with the importance of paying attention to the nature of the texts when analysing corpus findings.

    La référence pronominale en français et en mandarin : une étude contrastive basée sur un corpus de traductions bidirectionnelles

    This empirical, corpus-based paper presents a quantitative investigation of pronominal reference in two typologically and geo-culturally distinct languages: French and Mandarin Chinese. By conducting monolingual and inter-linguistic analyses on a parallel corpus of authentic texts translated from French into Mandarin or in the opposite direction, we show that the two languages diverge widely in their choice of referential expressions. In accordance with our expectations, French shows a predilection for pronominal reference, whereas Mandarin is characterized by a high frequency of nominal repetitions and, in particular, of ellipses. This difference is in general linked to the specific features of the pronominal system of each language and explains to a large extent the striking contrast between the translation tendencies from French to Mandarin and from Mandarin to French.

    How do English translations differ from native English writings?:A multi-feature statistical model for linguistic variation analysis

    This paper discusses the debated hypotheses of “Translation Universals”, i.e. the recurring common features of translated texts in relation to original utterances. We propose that, if translational language does have distinctive linguistic features in contrast to non-translated writings in the same language, those differences should be statistically significant, consistently distributed and systematically co-occurring across registers and genres. Based on the balanced Corpus of Translational English (COTE) and its non-translated English counterpart, the Freiburg-LOB corpus of British English (FLOB), and by deploying a multi-feature statistical analysis on 96 lexical, syntactic and textual features, we try to pinpoint those distinctive features in translated English texts. We also propose that the stylo-statistical model developed in this study will not only be effective in analysing the translational variation of English but will also be capable of clustering those variational features into a “translational” dimension, which will facilitate a crosslinguistic comparison of translational languages (e.g. translational Chinese) to test the Translation Universals hypotheses.

    Teaching and Learning English through Corpus-based Approaches in Norwegian Secondary Schools: Identifying Obstacles and a Way Forward

    Article 2 has been removed from the digital version due to copyright issues. It can be read in the printed edition. This doctoral dissertation examines the use of corpus-based approaches to English language learning in upper secondary school in Norway. The research was conducted in two distinct phases. The first phase investigated the pedagogic corpus work of four corpus-trained, in-service teachers and their students’ corpus literacy, alongside factors that might have influenced this work. Data were collected through a questionnaire to students and teacher interviews. The second phase featured a teacher-researcher collaboration with one of the four aforementioned teachers and two of his first-year, upper-secondary classes to design and implement a corpus-based approach over a two-week period. Data were collected through a case-study design with classroom observations and subsequent student group interviews. Previous studies have shown that corpus-based approaches to language learning result in positive learning outcomes; however, most studies are at the tertiary level and designed and conducted by corpus scholars. Meanwhile, data-driven learning in secondary school is “relatively uncharted territory” (Wicher, 2020) and there has been a call for more qualitative studies. The current dissertation sought to contribute knowledge of data-driven learning in the secondary-school context and insight into the processes and opinions of teachers and learners related to pedagogic corpus use. In the first phase, it was found that the teachers, despite their formal corpus training, had avoided corpus-based approaches in their practice, and few of their students knew anything about corpora. Factors such as teachers’ beliefs about their students’ digital and linguistic competence and about corpora, teachers’ topic focus and epistemic beliefs, and the inaccessibility and cost of corpus applications contributed to their reluctance to introduce their students to corpora.
    In the second phase, several opportunities for learning were found, including instances of metatalk to describe corpus data, peer scaffolding where students helped each other to learn the tool, and teacher scaffolding where the teacher confronted the students with their socio-economic prejudices that arose while working with corpus data from Irish English speakers. However, students’ impressions of the tool and process were negatively skewed. Their critique focused on the absence of the teacher, the complexity and aesthetics of the corpus tool’s interface and data, and the tool’s irrelevance to their learning process. In addition to the empirical contributions described above, it is argued in the dissertation that there are two major obstacles to data-driven learning that need to be addressed in order for its application to be normalized in the classroom. These obstacles concern a) the novelty of the approach and the training and mediation required to overcome this novelty, and b) the relevance of the approach to teachers, students, and the curriculum. Inquiry-based education was brought in as a theoretical framework that has considerable overlap with the concepts of data-driven learning but includes a more pronounced social dimension that fosters teacher and peer mediation, collaborative learning, and knowledge sharing.