25,023 research outputs found

    A corpus-driven study of features of Chinese students' undergraduate writing in UK universities

    Chinese people now comprise the ‘largest single overseas student group in the UK’ with more than 85,000 Chinese students registered at UK institutions in 2009 (British Council, 2010a). While there have been many studies carried out on short argumentative essays from this group (e.g. Chen, 2009), and on postgraduate theses (e.g. Hyland, 2008b), there has been comparatively little research conducted on the high-stakes genre of undergraduate assignments. This study examines assessed writing from Chinese and British undergraduates studying in UK universities between 2000 and 2008; these are investigated using corpus linguistic procedures, supported by qualitative reading. A particular focus is the use of lexical chunks, or recurring strings of words. Findings from the literature on Chinese students’ written English indicate high use of informal chunks, connecting chunks, and those containing first person pronouns (e.g. Milton, 1999). This study found that while the Chinese students make greater use of particular connectors and the first person plural, both student groups make (limited) use of informal language. These areas of difference are more apparent in year 1/2 assignments than those from year 3, suggesting that students gradually conform to the academy’s expectations. Unexpected findings which have not been previously identified in the literature include Chinese students’ significantly higher use of tables, figures (or ‘visuals’) and lists, compared to the British students’ writing. Detailed exploration of writing within Biology, Economics and Engineering suggests that using visuals and lists are different, yet equally acceptable, ways of writing assignments. Since the writing of both student groups has been judged by discipline specialists to be of a high standard, it is argued that the difference in use of visuals and lists illustrates the range of acceptability at undergraduate level. 
The thesis proposes that scholars therefore need to consider expanding the notion of what constitutes ‘good’ student writing.
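The ‘lexical chunks’ this abstract refers to are recurring strings of words, which corpus tools commonly find by counting n-grams. The following is a minimal sketch of that idea, assuming whitespace tokenisation, four-word chunks and a frequency threshold of two. None of these settings come from the abstract, and the function name `lexical_chunks` is hypothetical.

```python
from collections import Counter


def lexical_chunks(text, n=4, min_count=2):
    """Return recurring n-word strings and their frequencies.

    Assumptions (not from the study): lower-cased whitespace tokens,
    a fixed chunk length n, and a simple frequency cut-off.
    """
    tokens = text.lower().split()
    # Slide a window of n tokens across the text and count each window.
    counts = Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    # Keep only the strings that recur at least min_count times.
    return {" ".join(gram): c for gram, c in counts.items() if c >= min_count}


sample = "on the other hand we find that on the other hand it is"
print(lexical_chunks(sample))  # {'on the other hand': 2}
```

A real corpus study would normally also normalise frequencies per million words and require a chunk to occur across several texts, so that one writer's habits do not dominate the counts.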

    Phraseology in Corpus-Based Translation Studies: A Stylistic Study of Two Contemporary Chinese Translations of Cervantes's Don Quijote

    The present work sets out to investigate the stylistic profiles of two modern Chinese versions of Cervantes’s Don Quijote (I): by Yang Jiang (1978), the first direct translation from Castilian to Chinese, and by Liu Jingsheng (1995), which is one of the most commercially successful versions of the Castilian literary classic. This thesis focuses on a detailed linguistic analysis carried out with the help of the latest textual analytical tools, natural language processing applications and statistical packages. The type of linguistic phenomenon singled out for study is four-character expressions (FCEXs), which are a very typical category of Chinese phraseology. The work opens with the creation of a descriptive framework for the annotation of linguistic data extracted from the parallel corpus of Don Quijote. Subsequently, the classified and extracted data are put through several statistical tests. The results of these tests prove to be very revealing regarding the different use of FCEXs in the two Chinese translations. The computational modelling of the linguistic data would seem to indicate that among other findings, while Liu’s use of archaic idioms has followed the general patterns of the original and also of Yang’s work in the first half of Don Quijote I, noticeable variations begin to emerge in the second half of Liu’s more recent version. Such an idiosyncratic use of archaisms by Liu, which may be defined as style shifting or style variation, is then analyzed in quantitative terms through the application of the proposed context-motivated theory (CMT). The results of applying the CMT-derived statistical models show that the detected stylistic variation may well point to the internal consistency of the translator in rendering the second half of Part I of the novel, which reflects his freer, more creative and experimental style of translation. 
Through the introduction and testing of quantitative research methods adapted from corpus linguistics and textual statistics, this thesis has made a major contribution to methodological innovation in the study of style within the context of corpus-based translation studies.

    A Rule-based Methodology and Feature-based Methodology for Effect Relation Extraction in Chinese Unstructured Text

    The Chinese language differs significantly from English, both in lexical representation and in grammatical structure. These differences lead to problems in Chinese NLP, such as word segmentation and flexible syntactic structure. Many conventional methods and approaches in Natural Language Processing (NLP) that are based on English text prove ineffective when applied to these language-specific problems in Chinese NLP, which started late. Relation Extraction (RE) is an area of NLP that seeks to identify semantic relationships between entities in a text. The term “Effect Relation” (ER) is introduced in this research to refer to a specific type of relationship between two entities, where one entity has a certain “effect” on the other. In this research project, a case study on Chinese text from Traditional Chinese Medicine (TCM) journal publications is built to closely examine the forms of Effect Relation in this text domain. The case study targets the effect of a prescription or herb in the treatment of a disease, symptom or body part. A rule-based methodology is introduced in this thesis. It utilises predetermined rules and templates, derived from the characteristics and patterns observed in the dataset. This methodology achieves an F-score of 0.85 in its Named Entity Recognition (NER) module, 0.79 in its Semantic Relationship Extraction (SRE) module, and an overall performance of 0.46. A second methodology, taking a feature-based approach, is also introduced in this thesis. It views the RE task as a classification problem and utilises a mathematical classification model with features consisting of contextual information and rules. It achieves F-scores of 0.73 (NER) and 0.88 (SRE), and an overall performance of 0.41. The role of functional words in the contemporary Chinese language, and in relation to the ERs in this research, is explored.
Functional words are found to be effective, when used as rules in the rule-based methodology, for detecting ER entities with complex structure.
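The scores quoted in this abstract are F-scores, which combine precision and recall into one figure. The abstract does not say which variant was used or how the overall figure was derived, so the following is only a sketch of the standard balanced F1 over sets of extracted items. The function name `f_score` and the example triples are hypothetical placeholders, not data from the thesis.

```python
def f_score(predicted, gold, beta=1.0):
    """F-beta score of predicted items against a gold standard.

    Assumption: items are hashable (e.g. entity or relation tuples)
    and a prediction counts as correct only on an exact match.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0  # also covers empty predictions or an empty gold set
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical (source, relation, target) triples for illustration only.
gold = {("herb_a", "treats", "cough"), ("herb_b", "treats", "fever"),
        ("herb_c", "relieves", "pain"), ("herb_d", "treats", "insomnia")}
predicted = {("herb_a", "treats", "cough"), ("herb_b", "treats", "fever"),
             ("herb_c", "relieves", "pain"), ("herb_x", "treats", "cough")}

print(f_score(predicted, gold))  # 0.75 (precision 3/4, recall 3/4)
```

In a pipeline where relation extraction runs on the output of entity recognition, errors from the first module carry into the second, which is one common reason an end-to-end score sits below both module scores.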

    English for Study and Work. Vol. 3. Discussions and Presentations

    Covers all types of student activity in learning English that are aimed at developing the language behaviour needed for effective communication in academic and professional environments. Contains tasks and exercises typical of a variety of academic and professional fields and situations. The content is organised in modules, each covering particular speech skills according to the language behaviour involved. This module aims to develop students' skills in academic and professionally oriented speaking, as needed for taking part in discussions, seminars and conferences, and for preparing and delivering presentations (talks and reports). The sample texts are authentic and contain interesting, up-to-date information on general scientific and professional topics. The self-study resources (Part II) include tasks and exercises for building vocabulary and widening the range of functional exponents needed to perform particular functions, as well as tasks aimed at organising students' independent work. Using the diagnostic tools (Part III), students can check for themselves how well they have mastered the material and assess their own progress. Grammar points and exercises for practising them are given in Volume 5. Intended for students of technical universities specialising in mining. It can be used for teaching elective English courses, and for independent study of English by lecturers, specialists and researchers in various engineering fields.

    Misalignment of Learning Contexts - an explanation of the Chinese Learner Paradox.

    There is considerable research evidence (e.g. Biggs 1991, Watkins et al 1990, Kember et al 1991) to suggest that East Asian learners exhibit superior learning styles and academic performance to their western counterparts at secondary and tertiary levels. This is a surprising outcome given the less favourable educational environment of most East Asian societies (such as large class sizes, expository teaching methodology, a highly competitive exam system and an exam-oriented curriculum) which, according to the educational literature, is more conducive to surface learning and atomistic learning outcomes. This seemingly contradictory situation, known as the Chinese Learner Paradox (Marton et al, 1993), has been the subject of quantitative and qualitative educational research since the late 1980s. However, existing research has tended not to examine the impact of different assessment regimes (e.g. exam essay, short answer question, MCQ test, term essay, reflective journal, practicum) on the learning process. More specifically, it has not investigated the interaction of learning approaches with assessment types in influencing learning outcomes in cross-cultural studies. In this study, intensive semi-structured interviews were conducted with 10 tertiary students, 5 East Asian and 5 local Australian students in Brisbane. The overriding aim was to investigate their ideas of learning, and their approaches to learning for written assignments and for exams, in order to establish whether cultural difference is a determining influence on the learning process. Preliminary results suggest a different way of interpreting and explaining the paradox.

    Treebank-based acquisition of Chinese LFG resources for parsing and generation

    This thesis describes a treebank-based approach to automatically acquire robust, wide-coverage Lexical-Functional Grammar (LFG) resources for Chinese parsing and generation, which is part of a larger project on the rapid construction of deep, large-scale, constraint-based, multilingual grammatical resources. I present an application-oriented LFG analysis for Chinese core linguistic phenomena and (in cooperation with PARC) develop a gold-standard dependency-bank of Chinese f-structures for evaluation. Based on the Penn Chinese Treebank, I design and implement two architectures for inducing Chinese LFG resources, one annotation-based and the other dependency conversion-based. I then apply the f-structure acquisition algorithm together with external, state-of-the-art parsers to parsing new text into "proto" f-structures. In order to convert "proto" f-structures into "proper" f-structures or deep dependencies, I present a novel Non-Local Dependency (NLD) recovery algorithm using subcategorisation frames and f-structure paths linking antecedents and traces in NLDs extracted from the automatically-built LFG f-structure treebank. Based on the grammars extracted from the f-structure annotated treebank, I develop a PCFG-based chart generator and a new n-gram based pure dependency generator to realise Chinese sentences from LFG f-structures. The work reported in this thesis is the first effort to scale treebank-based, probabilistic Chinese LFG resources from proof-of-concept research to unrestricted, real text. Although this thesis concentrates on Chinese and LFG, many of the methodologies, e.g. the acquisition of predicate-argument structures, NLD resolution and the PCFG- and dependency n-gram-based generation models, are largely language and formalism independent and should generalise to diverse languages as well as to labelled bilexical dependency representations other than LFG.

    Description of the Chinese-to-Spanish rule-based machine translation system developed with a hybrid combination of human annotation and statistical techniques

    Two of the most popular Machine Translation (MT) paradigms are rule-based (RBMT) and corpus-based, the latter including statistical systems (SMT). When only scarce parallel corpora are available, RBMT becomes particularly attractive. This is the case for the Chinese-Spanish language pair. This article presents the first RBMT system for Chinese to Spanish. We describe a hybrid method for constructing this system that takes advantage of available resources such as parallel corpora, which are used to extract dictionaries and lexical and structural transfer rules. The final system is freely available online and open source. Although performance lags behind standard SMT systems on an in-domain test set, the results show that the RBMT system's coverage is competitive and that it outperforms the SMT system on an out-of-domain test set. This RBMT system is available to the general public, it can be further enhanced, and it opens up the possibility of creating future hybrid MT systems.