11,018 research outputs found

    C-structures and f-structures for the British national corpus

    Get PDF
    We describe how the British National Corpus (BNC), a one hundred million word balanced corpus of British English, was parsed into Lexical Functional Grammar (LFG) c-structures and f-structures, using a treebank-based parsing architecture. The parsing architecture uses a state-of-the-art statistical parser and reranker trained on the Penn Treebank to produce context-free phrase structure trees, and an annotation algorithm to automatically annotate these trees into LFG f-structures. We describe the pre-processing steps which were taken to accommodate the differences between the Penn Treebank and the BNC. Some of the issues encountered in applying the parsing architecture on such a large scale are discussed. The process of annotating a gold standard set of 1,000 parse trees is described. We present evaluation results obtained by evaluating the c-structures produced by the statistical parser against the c-structure gold standard. We also present the results obtained by evaluating the f-structures produced by the annotation algorithm against an automatically constructed f-structure gold standard. The c-structures achieve an f-score of 83.7% and the f-structures an f-score of 91.2%

    Trialing project-based learning in a new EAP ESP course: A collaborative reflective practice of three college English teachers

    Get PDF
    Currently in many Chinese universities, the traditional College English course is facing the risk of being ‘marginalized’, replaced or even removed, and many hours previously allocated to the course are now being taken by EAP or ESP. At X University in northern China, a curriculum reform as such is taking place, as a result of which a new course has been created called ‘xue ke’ English. Despite the fact that ‘xue ke’ means subject literally, the course designer has made it clear that subject content is not the target, nor is the course the same as EAP or ESP. This curriculum initiative, while possibly having been justified with a rationale of some kind (e.g. to meet with changing social and/or academic needs of students and/or institutions), this is posing a great challenge for, as well as considerable pressure on, a number of College English teachers who have taught this single course for almost their entire teaching career. In such a context, three teachers formed a peer support group in Semester One this year, to work collaboratively co-tackling the challenge, and they chose Project-Based Learning (PBL) for the new course. This presentation will report on the implementation of this project, including the overall designing, operational procedure, and the teachers’ reflections. Based on discussion, pre-agreement was reached on the purpose and manner of collaboration as offering peer support for more effective teaching and learning and fulfilling and pleasant professional development. A WeChat group was set up as the chief platform for messaging, idea-sharing, and resource-exchanging. Physical meetings were supplementary, with sound agenda but flexible time, and venues. Mosoteach cloud class (lan mo yun ban ke) was established as a tool for virtual learning, employed both in and after class. Discussions were held at the beginning of the semester which determined only brief outlines for PBL implementation and allowed space for everyone to autonomously explore in their own way. Constant further discussions followed, which generated a great deal of opportunities for peer learning and lesson plan modifications. A reflective journal, in a greater or lesser detailed manner, was also kept by each teacher to record the journey of the collaboration. At the end of the semester, it was commonly recognized that, although challenges existed, the collaboration was overall a success and they were all willing to continue with it and endeavor to refine it to be a more professional and productive approach

    Developing computational infrastructure for the CorCenCC corpus - the National Corpus of Contemporary Welsh

    Get PDF
    CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes - National Corpus of Contemporary Welsh) is the first comprehensive corpus of Welsh designed to be reflective of language use across communication types, genres, speakers, language varieties (regional and social) and contexts. This article focuses on the computational infrastructure that we have designed to support data collection for CorCenCC, and the subsequent uses of the corpus which include lexicography, pedagogical research and corpus analysis. A grass-roots approach to design has been adopted, that has adapted and extended previous corpus-building and introduced new features as required for this specific context and language. The key pillars of the infrastructure include a framework that supports metadata collection, an innovative mobile application designed to collect spoken data (utilising a crowdsourcing approach), a backend database that stores curated data and a web-based interface that allows users to query the data online. A usability study was conducted to evaluate the user facing tools and to suggest directions for future improvements. Though the infrastructure was developed for Welsh language collection, its design can be re-used to support corpus development in other minority or major language contexts, broadening the potential utility and impact of this work

    Intelligent CALL

    Get PDF
    This chapter describes the provision of corrective feedback in Tutorial CALL, sketching the challenges in the research and development of computational parsers and grammars. The automatic evaluation and assessment of free-form learner texts paying attention to linguistic accuracy, rhetorical structures, textual complexity, and written fluency is at the centre of attention in the section on Automatic Writing Evaluation. Reading and Incidental Vocabulary Learning Aids looks at the advantages of lexical glosses, or look-up information in electronic dictionaries for reading material aimed at language learners. The conclusion looks at the role of ICALL in the context of general trends in CALL

    An analysis of formal errors in a corpus of l2 English produced by Chinese students

    Get PDF
    This paper describes the investigation of a small corpus of writing of English for academic purposes produced by L1 speakers of Mandarin. The investigation involved the development of a tagset for the identification of formal errors in the corpus, and the subsequent analysis of these errors with a view to creating remedial grammar materials for Chinese students studying in the medium of English. Some past approaches to error analysis are discussed, the process of developing the tagging system is described, and error types are identified, categorised, quantified, described and (as far as possible) explained

    Expectations eclipsed in foreign language education: learners and educators on an ongoing journey / edited by Hülya Görür-Atabaş, Sharon Turner.

    Get PDF
    Between June 2-4, 2011 Sabancı University School of Languages welcomed colleagues from 21 different countries to a collaborative exploration of the challenging and inspiring journey of learners and educators in the field of language education.\ud \ud The conference provided an opportunity for all stakeholders to share their views on language education. Colleagues met with world-renowned experts and authors in the fields of education and psychology, faculty and administrators from various universities and institutions, teachers from secondary educational backgrounds and higher education, as well as learners whose voices are often not directly shared but usually reported.\ud \ud The conference name, Eclipsing Expectations, was inspired by two natural phenomena, a solar eclipse directly before the conference, and a lunar eclipse, immediately after. Learners and educators were hereby invited to join a journey to observe, learn and exchange ideas in orde

    Wide-coverage deep statistical parsing using automatic dependency structure annotation

    Get PDF
    A number of researchers (Lin 1995; Carroll, Briscoe, and Sanfilippo 1998; Carroll et al. 2002; Clark and Hockenmaier 2002; King et al. 2003; Preiss 2003; Kaplan et al. 2004;Miyao and Tsujii 2004) have convincingly argued for the use of dependency (rather than CFG-tree) representations for parser evaluation. Preiss (2003) and Kaplan et al. (2004) conducted a number of experiments comparing “deep” hand-crafted wide-coverage with “shallow” treebank- and machine-learning based parsers at the level of dependencies, using simple and automatic methods to convert tree output generated by the shallow parsers into dependencies. In this article, we revisit the experiments in Preiss (2003) and Kaplan et al. (2004), this time using the sophisticated automatic LFG f-structure annotation methodologies of Cahill et al. (2002b, 2004) and Burke (2006), with surprising results. We compare various PCFG and history-based parsers (based on Collins, 1999; Charniak, 2000; Bikel, 2002) to find a baseline parsing system that fits best into our automatic dependency structure annotation technique. This combined system of syntactic parser and dependency structure annotation is compared to two hand-crafted, deep constraint-based parsers (Carroll and Briscoe 2002; Riezler et al. 2002). We evaluate using dependency-based gold standards (DCU 105, PARC 700, CBS 500 and dependencies for WSJ Section 22) and use the Approximate Randomization Test (Noreen 1989) to test the statistical significance of the results. Our experiments show that machine-learning-based shallow grammars augmented with sophisticated automatic dependency annotation technology outperform hand-crafted, deep, widecoverage constraint grammars. Currently our best system achieves an f-score of 82.73% against the PARC 700 Dependency Bank (King et al. 2003), a statistically significant improvement of 2.18%over the most recent results of 80.55%for the hand-crafted LFG grammar and XLE parsing system of Riezler et al. (2002), and an f-score of 80.23% against the CBS 500 Dependency Bank (Carroll, Briscoe, and Sanfilippo 1998), a statistically significant 3.66% improvement over the 76.57% achieved by the hand-crafted RASP grammar and parsing system of Carroll and Briscoe (2002)

    Directional adposition use in English, Swedish and Finnish

    Get PDF
    Directional adpositions such as to the left of describe where a Figure is in relation to a Ground. English and Swedish directional adpositions refer to the location of a Figure in relation to a Ground, whether both are static or in motion. In contrast, the Finnish directional adpositions edellä (in front of) and jäljessä (behind) solely describe the location of a moving Figure in relation to a moving Ground (Nikanne, 2003). When using directional adpositions, a frame of reference must be assumed for interpreting the meaning of directional adpositions. For example, the meaning of to the left of in English can be based on a relative (speaker or listener based) reference frame or an intrinsic (object based) reference frame (Levinson, 1996). When a Figure and a Ground are both in motion, it is possible for a Figure to be described as being behind or in front of the Ground, even if neither have intrinsic features. As shown by Walker (in preparation), there are good reasons to assume that in the latter case a motion based reference frame is involved. This means that if Finnish speakers would use edellä (in front of) and jäljessä (behind) more frequently in situations where both the Figure and Ground are in motion, a difference in reference frame use between Finnish on one hand and English and Swedish on the other could be expected. We asked native English, Swedish and Finnish speakers’ to select adpositions from a language specific list to describe the location of a Figure relative to a Ground when both were shown to be moving on a computer screen. We were interested in any differences between Finnish, English and Swedish speakers. All languages showed a predominant use of directional spatial adpositions referring to the lexical concepts TO THE LEFT OF, TO THE RIGHT OF, ABOVE and BELOW. There were no differences between the languages in directional adpositions use or reference frame use, including reference frame use based on motion. We conclude that despite differences in the grammars of the languages involved, and potential differences in reference frame system use, the three languages investigated encode Figure location in relation to Ground location in a similar way when both are in motion. Levinson, S. C. (1996). Frames of reference and Molyneux’s question: Crosslingiuistic evidence. In P. Bloom, M.A. Peterson, L. Nadel & M.F. Garrett (Eds.) Language and Space (pp.109-170). Massachusetts: MIT Press. Nikanne, U. (2003). How Finnish postpositions see the axis system. In E. van der Zee & J. Slack (Eds.), Representing direction in language and space. Oxford, UK: Oxford University Press. Walker, C. (in preparation). Motion encoding in language, the use of spatial locatives in a motion context. Unpublished doctoral dissertation, University of Lincoln, Lincoln. United Kingdo
    corecore