585 research outputs found

    Computer Assisted Language Learning Based on Corpora and Natural Language Processing : The Experience of Project CANDLE

    This paper describes Project CANDLE, an ongoing 3-year project which uses various corpora and NLP technologies to construct an online English learning environment for learners in Taiwan. This report focuses on the interim results obtained in the first eighteen months. First, an English-Chinese parallel corpus, Sinorama, was used as the main course material for reading, writing, and culture-based learning courses. Second, an online bilingual concordancer, TotalRecall, and a collocation reference tool, TANGO, were developed based on Sinorama and other corpora. Third, many online lessons, including extensive reading, verb-noun collocations, and vocabulary, were designed to be used alone or together with TotalRecall and TANGO. Fourth, an online collocation check program, MUST, was developed for detecting V-N miscollocation and suggesting adequate collocates in students’ writing, based on the hypothesis of L1 interference and on the BNC and the bilingual Sinorama Corpus. Other computational scaffoldings are under development. It is hoped that this project will help intermediate learners in Taiwan enhance their English proficiency with effective pedagogical approaches and versatile language reference tools.

    Predicting ESL learners’ oral proficiency by measuring the collocations in their spontaneous speech

    Collocation, known as words that commonly co-occur, is a major category of formulaic language. There is now general consensus among language researchers that collocation is essential to effective language use in real-world communication situations (Ellis, 2008; Nesselhauf, 2005; Schmitt, 2010; Wray, 2002). Although a number of contemporary speech-processing theories assume the importance of formulaic language to spontaneous speaking (Bygate, 1987; de Bot, 1992; Kormos, 2006; Levelt, 1999), none of them gives an adequate explanation of the role that collocation plays in speech communication. In the practices of L2 speaking assessment, a test taker’s collocational performance is usually not separately scored mainly because human raters can only focus on a limited range of speech characteristics (Luoma, 2004). This paper argues for the centrality of collocation evaluation to communication-oriented L2 oral assessment. Based on a logical analysis of the conceptual connections among collocation, speech-processing theories, and rubrics for oral language assessment, the author formulated a new construct called Spoken Collocational Competence (SCC). In light of Skehan’s (1998, 2009) trade-off hypothesis, he developed a series of measures for SCC, namely Operational Collocational Performance Measures (OCPMs), to cover three dimensions of learner collocation performance in spontaneous speaking: collocation accuracy, collocation complexity, and collocation fluency. He then investigated the empirical performance of these measures with 2344 lexical collocations extracted from sixty adult English as a second language (ESL) learners’ oral assessment data collected in two distinctive contexts of language use: conversing with an interlocutor on daily-life topics (or the SPEAK exam) and giving an academic lecture (or the TEACH exam). 
Multiple regression and logistic regression were performed on criterion measures of these learners’ oral proficiency (i.e., human holistic scores and oral proficiency certification decisions) as a function of the OCPMs. The study found that the participants generally achieved higher collocation accuracy and complexity in the TEACH exam than in the SPEAK exam. In addition, the OCPMs as a whole predicted the participants’ oral proficiency certification status (certified or uncertified) with high accuracy (Nagelkerke R² = .968). However, the predictive power of the OCPMs for human holistic scores seemed to be higher in the SPEAK exam (adjusted R² = .678) than in the TEACH exam (adjusted R² = .573). These findings suggest that L2 learners’ collocational performance in free speech deserves examiners’ closer attention and that SCC may contribute to the construct of oral proficiency somewhat differently across speaking contexts. Implications for L2 speaking theory, automated speech evaluation, and teaching and learning of oral communication skills are discussed.

    El papel de la colocación en el artículo científico

    A collocation is a combination of two or more words which frequently occur together (McCarthy and O’Dell 2008). In this paper, we will show how the mastery of these word combinations is a must if one is to publish in a second language at an international level. As English has become the language of science (Crystal 2005; Montgomery 2009), and over 80% of scientific publication is done through this language, researchers who have a good command of it should aim at publishing in English. The work presented here aims at providing some tips on how non-native researchers who want to publish research articles at an international level can improve their collocational competence in English.

    Translating English verbal collocations into Spanish: On distribution and other relevant differences related to diatopic variation

    Language varieties should be taken into account in order to enhance fluency and naturalness of translated texts. In this paper we will examine the collocational verbal range for prima-facie translation equivalents of words like decision and dilemma, which in both languages denote the act or process of reaching a resolution after consideration, resolving a question or deciding something. We will be mainly concerned with diatopic variation in Spanish. To this end, we set out to develop a giga-token corpus-based protocol which includes a detailed and reproducible methodology sufficient to detect collocational peculiarities of transnational languages. To our knowledge, this is one of the first observational studies of this kind. The paper is organised as follows. Section 1 introduces some basic issues about the translation of collocations against the background of languages’ anisomorphism. Section 2 provides a feature characterisation of collocations. Section 3 deals with the choice of corpora, corpus tools, nodes and patterns. Section 4 covers the automatic retrieval of the selected verb + noun (object) collocations in general Spanish and the co-existing national varieties. Special attention is paid to comparative results in terms of similarities and mismatches. Section 5 presents conclusions and outlines avenues of further research.

    Theories and methods

    The notion of formulaicity has received increasing attention in disciplines and areas as diverse as linguistics, literary studies, art theory and art history. In recent years, linguistic studies of formulaicity have been flourishing and the very notion of formulaicity has been approached from various methodological and theoretical perspectives and with various purposes in mind. The linguistic approach to formulaicity is still in a state of rapid development and the objective of the current volume is to present the current explorations in the field. Papers collected in the volume make numerous suggestions for further development of the field and they are arranged into three complementary parts. The first part, with three chapters, presents new theoretical and methodological insights as well as their practical application in the development of custom-designed software tools for identification and exploration of formulaic language in texts. Two papers in the second part explore formulaic language in the context of language learning. Finally, the third part, with three chapters, showcases descriptive research on formulaic language conducted primarily from the perspectives of corpus linguistics and translation studies. The volume will be of interest to anyone involved in the study of formulaic language either from a theoretical or a practical perspective

    Assessing English language learners' collocation knowledge: A systematic review of receptive and productive measurements

    Since collocation knowledge is integral to second language vocabulary depth, it necessitates a careful examination of various measurement approaches. To this end, the current paper provides an overview and evaluation of extant collocation measurements used in empirical studies on L2 English (N = 153) published between 1980 and 2023 indexed in the SSCI, SCIE, AHCI, SCOPUS, and ERIC databases. Six instruments, seven item formats, and three other assessment tools were identified and reviewed for the assessment of receptive and productive collocation knowledge. The review focused on the collocation knowledge measured by each tool, the instrument and/or item format employed, item design, reported reliability, and potential drawbacks of employing each instrument and item format in research or practice. The review proposes several theoretical and practical considerations for future assessments of and research on English collocation knowledge.

    Formulaic language

    The notion of formulaicity has received increasing attention in disciplines and areas as diverse as linguistics, literary studies, art theory and art history. In recent years, linguistic studies of formulaicity have been flourishing and the very notion of formulaicity has been approached from various methodological and theoretical perspectives and with various purposes in mind. The linguistic approach to formulaicity is still in a state of rapid development and the objective of the current volume is to present the current explorations in the field. Papers collected in the volume make numerous suggestions for further development of the field and they are arranged into three complementary parts. The first part, with three chapters, presents new theoretical and methodological insights as well as their practical application in the development of custom-designed software tools for identification and exploration of formulaic language in texts. Two papers in the second part explore formulaic language in the context of language learning. Finally, the third part, with three chapters, showcases descriptive research on formulaic language conducted primarily from the perspectives of corpus linguistics and translation studies. The volume will be of interest to anyone involved in the study of formulaic language either from a theoretical or a practical perspective

    METRICC: Harnessing Comparable Corpora for Multilingual Lexicon Development

    Research on comparable corpora has grown in recent years, bringing about the possibility of developing multilingual lexicons through the exploitation of comparable corpora to create corpus-driven multilingual dictionaries. To date, this issue has not been widely addressed. This paper focuses on the use of the mechanism of collocational networks proposed by Williams (1998) for exploiting comparable corpora. The paper first provides a description of the METRICC project, which is aimed at the automatic creation of comparable corpora, and describes one of the crawlers developed for comparable corpora building. It then discusses the power of collocational networks for multilingual corpus-driven dictionary development.

    The Influence of Features of Collocations on the Collocational Knowledge and Development of Kurdish High School Students: A Longitudinal Study

    This study explored the influence of four features of collocations (frequency of occurrence, syntactic structure, semantic transparency, and congruency with L1) on the collocational knowledge and development of 252 Kurdish high school learners of English as a foreign language. The importance of collocations in learning English as a second or foreign language and the difficulties that challenge learners at different levels of language proficiency have been well established. However, few studies have adopted a longitudinal research design or a hybrid definition of collocations, incorporating both frequency-based and phraseological views. The present study took this approach to explore learners’ collocational knowledge and development and the influence of features of collocations on their collocational knowledge and development at the high school level of learning English as a foreign language. The study employed two tests: an appropriateness judgement test to measure learners’ receptive knowledge and a gap-filling test to measure their productive knowledge of collocations. The data were collected in two waves, one at the beginning of their school year and the other at the end. Data analyses were conducted to determine the relationship between features of collocations and learners’ collocational knowledge and development. The results revealed frequency of occurrence as the most influential factor affecting learners’ knowledge and development. Influence of the syntactic structure of collocations on the learners’ knowledge and development came second, whereas congruency with L1 occupied the third position. Semantic transparency seemed to have the least influence on their collocational knowledge and development. Gender appeared as an influential factor in the individual tests. However, its influence was not significant in terms of overall knowledge development.
In general, the results indicated that learners’ productive collocational knowledge lagged behind their receptive knowledge. However, receptive and productive collocational knowledge did not increase at the same rate over the study period. While learners’ receptive collocational knowledge did not show an increase, their productive knowledge increased significantly over the school year. The results also revealed that grammatical collocations were less challenging than lexical collocations at this level of language learning. Finally, according to the study results, some pedagogical implications and suggestions for further studies are presented. Kurdistan Regional Government (KRG)