926 research outputs found

    A Key Word Analysis of English Intensifying Adverbs in Male and Female Speech in ICE-GB

    PACLIC 23 / City University of Hong Kong / 3-5 December 200

    Getting rid of the Chi-square and Log-likelihood tests for analysing vocabulary differences between corpora

    Log-likelihood and Chi-square tests are probably the most popular statistical tests used in corpus linguistics, especially when the research aims to describe lexical variation between corpora. However, because this specific use of the Chi-square test is not valid, it produces far too many significant results. This paper explains the source of the problem (i.e., the non-independence of the observations), why the usual solutions are not acceptable, and which kinds of statistical test should be used instead. A corpus analysis of the lexical differences between American and British English is then reported, in order to demonstrate the problem and to confirm the adequacy of the proposed solution. The last section presents the commands that can be used with WordSmith Tools, a very popular software package for corpus processing, to obtain the data needed for the appropriate tests, as well as a very easy-to-use procedure in R, a free and easy-to-install statistical software environment, that performs these tests
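
For orientation, the log-likelihood keyness score that the abstract above criticises can be sketched as follows. This is a minimal illustration of the common two-corpus form of the statistic, not the alternative procedure the paper proposes (the abstract does not name it). The function name and the example counts are hypothetical. The calculation treats every token as an independent observation, which is the assumption the abstract says does not hold.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood (G2) for one word.

    freq_a, freq_b: occurrences of the word in corpus A and corpus B.
    size_a, size_b: total number of tokens in each corpus.
    """
    total = freq_a + freq_b
    # Expected counts if the word had the same relative frequency in both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Hypothetical counts: 20 hits in 1,000 tokens versus 10 hits in 1,000 tokens.
score = log_likelihood(20, 1000, 10, 1000)
```

Because the score depends only on pooled totals, it cannot tell a word spread evenly over many texts from one concentrated in a single text; that is one way to see the non-independence problem.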

    The Counts of Dracula and Monte Cristo: Homonym Frequencies in Graded Readers

    Graded readers are a great asset to learners acquiring the vocabulary of another language. Homonyms, on the other hand, are a recognized source of trouble for students with that same goal. Publishers of graded readers control the presentation of old and new words, but does this control extend to homonyms? Are only the word forms controlled for—in which case, the unrelated meanings of match (a pairing and a stick for starting fire) would together constitute two uses of the word? Or would these tally as separate words which, semantically and etymologically, they are? A comparison of a 4.2 million-word corpus of graded readers with previous research on the distributions of homonymic meanings in general English reveals that the meanings presented to learners are frequently quite different to those in general-purpose texts

    Drawing Elena Ferrante's Profile. Workshop Proceedings, Padova, 7 September 2017

    Elena Ferrante is an internationally acclaimed Italian novelist whose real identity has been kept secret by E/O publishing house for more than 25 years. Owing to her popularity, major Italian and foreign newspapers have long tried to discover her real identity. However, only a few attempts have been made to foster a scientific debate on her work. In 2016, Arjuna Tuzzi and Michele Cortelazzo led an Italian research team that conducted a preliminary study and collected a well-founded, large corpus of Italian novels comprising 150 works published in the last 30 years by 40 different authors. Moreover, they shared their data with a select group of international experts on authorship attribution, profiling, and analysis of textual data: Maciej Eder and Jan Rybicki (Poland), Patrick Juola (United States), Vittorio Loreto and his research team, Margherita Lalli and Francesca Tria (Italy), George Mikros (Greece), Pierre Ratinaud (France), and Jacques Savoy (Switzerland). The chapters of this volume report the results of this endeavour that were first presented during the international workshop Drawing Elena Ferrante's Profile in Padua on 7 September 2017 as part of the 3rd IQLA-GIAT Summer School in Quantitative Analysis of Textual Data. The fascinating research findings suggest that Elena Ferrante’s work definitely deserves “many hands” as well as an extensive effort to understand her distinct writing style and the reasons for her worldwide success

    What effect does short term Study Abroad (SA) have on learners’ vocabulary knowledge?

    This thesis describes a study which tracks longitudinal changes in vocabulary knowledge during a short-term Study Abroad (SA) experience. A test of productive vocabulary knowledge, Lex30 (Meara & Fitzpatrick, 2000), requiring the production of word association responses, is used to elicit vocabulary from 38 Japanese L1 learners of English at four test times at equal intervals before and after an SA experience. The study starts by investigating whether there are changes in both the total number of words and in the number of less frequently occurring words produced by SA participants. Three additional ways of measuring the development of lexical knowledge over time are then proposed. The first examines changes in the ability of participants of different proficiency levels in producing collocates in response to Lex30 cue words. The second tracks changes in spelling accuracy to measure if improvements take place over time. The third analysis uses an online measuring instrument (Wmatrix; Rayson, 2009) to explore if there are any changes in the mastery of specific semantic domains. The results show that there is significant growth in the productive use of less frequent vocabulary knowledge during the SA period. There is also an increase in collocation production with lower proficiency participants and evidence of some improvement in the way certain vocabulary items are spelled. The tendency for SA learners to produce more words from semantic groups related to SA experiences is also demonstrated. Post-SA tests show that while some knowledge attrition occurs it does not decline to pre-SA levels. The study shows how short-term SA programmes can be evaluated using a word association test, contributing to a better understanding of how vocabulary develops during intensive language learning experiences. It also demonstrates the gradual shift of productive vocabulary knowledge from partial word knowledge to a more complete state of productive mastery

    Gradient Metaphoricity of the Preposition in: A Corpus-based Approach to Chinese Academic Writing in English

    In Cognitive Linguistics, a conceptual metaphor is a systematic set of correspondences between two domains of experience (Kövecses 2020: 2). In order to have an extensive understanding of metaphors, metaphoricity (Müller and Tag 2010; Dunn 2011; Jensen and Cuffari 2014; Nacey and Jensen 2017) has been emphasized to address one of the properties of metaphors in language usage: gradience (Hanks 2006; Dunn 2011, 2014), which indicates that metaphorical expressions can be measured. Despite many noteworthy contributions, studies of metaphoricity are often accused of subjectivity (Müller 2008; Jensen and Cuffari 2014; Jensen 2017), which is why this study uses a large corpus as its database. The main aim of this dissertation is therefore to measure the gradient senses of the preposition in in an objective way, thus mapping its highly systematic semantic extension. Based on these gradient senses, the semantic and syntactic features of the preposition in produced by advanced Chinese English-major learners are investigated, combining quantitative and qualitative research methods. A quantitative analysis of the literal sense and ten other metaphorical senses of the preposition in is made first. Taking into account the five factors influencing the image schemata of each sense (“scale of Landmark”, “visibility”, “path”, “inclusion” and “boundary”), the formula for measuring the degree of metaphoricity is derived: Metaphoricity = ([#Visibility] + [#Path] + [#Inclusion] + [#Boundary]) × [#Scale of Landmark]. The result is that the primary sense has the highest value, 12, and all other extended senses have values down to zero. The more features a sense shares with the proto-scene, the higher its value and the less metaphorical the sense. 
EVENT and PERSON are the “least metaphoric” (value = 9-11); SITUATION, NUMBER, CONTENT and FIELD are “weak metaphoric” (value = 6-8); SEGMENTATION, TIME and MANNER (value = 3-5) are “strong metaphoric”; PURPOSE shares the fewest features with the proto-scene and has the lowest value, so it is “most metaphoric” (value = 0-2). Then, a corpus-based approach is employed, which offers a model for employing a corpus-based approach in Cognitive Linguistics. It compares two compiled sub-corpora: Chinese Master Academic Writing Corpus and Chinese Doctorate Academic Writing Corpus. The findings show that, on the semantic level, Chinese English-major students overuse in with a low level of metaphoricity, and even advanced learners rarely use the most metaphorical in. In terms of syntactic behaviours, the most frequent nouns in the [in+noun] construction are weakly metaphoric, whilst the nouns in the construction [in the noun of] carry the EVENT sense, which is least metaphorical. Moreover, action verbs tend to be used in the constructions [verb+in] and [in doing sth.] in both master and doctorate groups. In the qualitative study, the divergent usages of the preposition in are explored. The preposition in is often substituted with other prepositions, such as on and at. The fundamental reason for the Chinese learners’ weakness is the negative transfer from their mother tongue (Wang 2001; Gong 2007; Zhang 2010). Although in and its Chinese equivalent zai...li (在...里) share the same proto-scene, there are discrepancies: the metaphorical senses of the preposition in are TIME, PURPOSE, NUMBER, CONTENT, FIELD, EVENT, SITUATION, SEGMENTATION, MANNER, PERSON, while those of zai...li (在...里) are only five: TIME, CONTENT, EVENT, SITUATION and PERSON. Thus the image schemata of each sense cannot be correspondingly mapped onto each other in different languages. 
This study also provides evidence for the universality and variation of spatial metaphors on the ground of cultural models. Philosophically, it supports the standpoint of Embodiment philosophy that abstract concepts are constructed on the basis of spatial metaphors that are grounded in the physical and cultural experience
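
The metaphoricity formula and value bands quoted in the abstract above can be written out as a short sketch. The abstract does not say what range each feature score takes, so the function simply applies the stated arithmetic to whatever scores it is given. The function names and the example scores are hypothetical.

```python
def metaphoricity(visibility, path, inclusion, boundary, scale_of_landmark):
    """Apply the formula quoted in the abstract:
    (Visibility + Path + Inclusion + Boundary) * Scale of Landmark."""
    return (visibility + path + inclusion + boundary) * scale_of_landmark


def band(value):
    """Map a metaphoricity value to the label given in the abstract.

    A higher value means more features shared with the proto-scene,
    and therefore a less metaphorical sense.
    """
    if value >= 12:
        return "primary sense"
    if value >= 9:
        return "least metaphoric"
    if value >= 6:
        return "weak metaphoric"
    if value >= 3:
        return "strong metaphoric"
    return "most metaphoric"


# Hypothetical feature scores chosen only to reach the maximum value of 12.
label = band(metaphoricity(1, 1, 1, 1, 3))
```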

    Democratization of Englishes : Synchronic and diachronic approaches

    The term democratization has been used in recent linguistic research to describe how specific linguistic changes can be linked to changes in sociocultural norms. This broad definition, however, does not fully capture the essence of this phenomenon or explain how it differs from other processes of language change. Other key issues in this area of research include what the cause-effect relationship is between linguistic change and social change, and how empirical corpus linguistic studies can contribute to current knowledge. In this opening contribution to the special issue New perspectives on democratization: Evidence from English(es), we address some of these key issues by reviewing previous synchronic and diachronic studies on democratization in different varieties of English, and introduce new studies that take evidence from different linguistic corpora. By placing the linguistic changes into their specific socio-historical contexts, these studies yield interesting results, showing that variationist linguistic methodology may significantly contribute to disentangling the complex relationship between language change and social and societal changes. © 2020 The Author(s). Published by Elsevier Ltd.

    A Study of Issues and Techniques for Creating Core Vocabulary Lists for English as an International Language

    Core vocabulary lists have long been a tool used by language learners and instructors seeking to facilitate the initial stages of foreign language learning (Fries & Traver, 1960: 2). In the past, these lists were typically based on the intuitions of experienced educators. Even before the advent of computer technology in the mid-twentieth century, attempts were made to create such lists using objective methodologies. These efforts regularly fell short, however, and – in the end – had to be tweaked subjectively. Now, in the 21st century, this is unfortunately still true, at least for those lists whose methodologies have been published. Given the present availability of sizable English-language corpora from around the world and affordable personal computers, this thesis seeks to fill this methodological gap by answering the research question: How can valid core vocabulary lists for English as an International Language be created? A practical taxonomy is proposed based on Biber’s (1988, 1995) multi-dimensional analysis of English texts. This taxonomy is based on correlated linguistic features and reasonably covers representative spoken and written texts in English. The four-part main study assesses the variance in vocabulary data within each of the four key text types: interactive (face-to-face conversation), academic exposition, imaginative narrative, and general reported exposition. The variation in word types found at progressive intervals in corpora of various sizes is measured using the Dice coefficient, a coefficient originally used to measure species variation in different biotic regions (Dice, 1945). The second study proceeds to compare the most frequent vocabulary types in each of the four text types using an equal-sized collection of each text type. Of special interest is the difference between spoken and written texts. 
Though types are arguably the proper unit to investigate when comparing vocabulary variation, few learners would want to approach vocabulary learning one word type at a time (Nation & Meara, 2002; Bauer & Nation, 1993). The third study thus compares the effect reordering words as families (as opposed to types) has on core vocabulary lists. An analysis is made of the major differences resulting from grouping the members of each word family under a single headword and summing their individual frequencies. Methods are then discussed for how core vocabulary lists of various sizes can be constructed based on the findings of these three studies. Recommendations are made regarding the size and composition of the source corpus and the core list extraction and construction methodology based on the learning objectives
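
Two of the operations described in the abstract above lend themselves to a short sketch: the Dice coefficient between two sets of word types, and summing type frequencies under a word-family headword. This is a minimal illustration under stated assumptions, not the thesis's own procedure. The function names, the headword mapping and the example words are hypothetical, and the Dice coefficient is taken in its usual set form, 2|A∩B| / (|A| + |B|).

```python
from collections import Counter


def dice_coefficient(types_a, types_b):
    """Dice coefficient between two collections of word types (0.0 to 1.0)."""
    a, b = set(types_a), set(types_b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty vocabularies are identical
    return 2 * len(a & b) / (len(a) + len(b))


def family_frequencies(type_freqs, headword_of):
    """Sum the frequencies of word types under their family headword.

    type_freqs: mapping of word type to frequency.
    headword_of: mapping of word type to family headword (hypothetical data);
                 a type with no entry is treated as its own headword.
    """
    totals = Counter()
    for word, freq in type_freqs.items():
        totals[headword_of.get(word, word)] += freq
    return totals


# Hypothetical example data.
overlap = dice_coefficient({"the", "of", "run"}, {"of", "run", "cat"})
families = family_frequencies(
    {"run": 5, "runs": 3, "running": 2, "cat": 4},
    {"runs": "run", "running": "run"},
)
```

Grouping by family can reorder a frequency list: in the example data, run outranks cat only narrowly as a type (5 against 4) but clearly as a family (10 against 4).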