Building a Corpus of 2L English for Automatic Assessment: the CLEC Corpus
In this paper we describe the CLEC corpus, an ongoing project set up at the University of Cádiz with the purpose of building a large corpus of English as a second language, classified according to CEFR proficiency levels and designed to train statistical models for automatic proficiency assessment. The goal of this corpus is twofold: on the one hand, it will serve as a data resource for the development of automatic text classification systems; on the other, it has been used as a vehicle for teaching innovation techniques.
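The kind of statistical proficiency model the abstract mentions can be sketched, under assumptions, as a text classifier over learner essays labelled with CEFR levels. The sketch below uses a tiny multinomial Naive Bayes classifier in pure Python; the labels and example texts are invented illustrations, not CLEC material, and the actual models trained on the corpus may differ.

```python
import math
from collections import Counter, defaultdict

def train_nb(samples):
    """Train a multinomial Naive Bayes model.

    samples: list of (text, cefr_label) pairs, e.g. ("my dog is big", "A1").
    """
    word_counts = defaultdict(Counter)  # per-label word frequencies
    label_counts = Counter()            # label priors
    vocab = set()
    for text, label in samples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest smoothed log-posterior."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # log prior plus add-one-smoothed log likelihoods
        score = math.log(label_counts[label] / total)
        n = sum(word_counts[label].values())
        for tok in text.lower().split():
            score += math.log((word_counts[label][tok] + 1) / (n + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data (hypothetical, for illustration only)
samples = [
    ("i like my dog", "A1"),
    ("my dog is big", "A1"),
    ("the committee deliberated extensively before reaching consensus", "C1"),
    ("her argument was nuanced and rigorously substantiated", "C1"),
]
model = train_nb(samples)
print(predict_nb(model, "i like my big dog"))  # → A1
```

In practice one would use richer features (lexical sophistication, syntactic complexity, error counts) rather than raw word counts, but the pipeline shape — labelled corpus in, per-level model out — is the same.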
Readers’ cognitive processes during IELTS reading tests: evidence from eye tracking
The research described in this report investigates readers' mental processes as they complete onscreen IELTS (International English Language Testing System) reading test items. It employs up-to-date eye tracking technology to record readers' eye movements and aims, among other things, to contribute to an understanding of the cognitive validity of reading test items (Glaser, 1991; Field, forthcoming).
Participants were a group of Malaysian undergraduates (n=71) taking an onscreen test consisting of two IELTS reading passages with a total of 11 test items. The eye movements of a random sample of these participants (n=38) were tracked. Questionnaire and stimulated recall interview data were also collected, and were important in order to interpret and explain the eye tracking data.
Findings demonstrated significant differences between successful and unsuccessful test-takers on a number of dimensions, including their ability to read expeditiously (Khalifa and Weir, 2009) and their focus on particular aspects of the test items and the reading texts. This demonstrates the potential of eye tracking, in combination with post-hoc interview and questionnaire data, to offer unprecedented insights into the cognitive processes of successful and unsuccessful candidates taking a reading test.
As a consequence, the findings should be of value to teachers and learners, and also to examination boards seeking to validate and prepare reading tests, as well as psycholinguists and others interested in the cognitive processes of readers.
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems, such as text summarization, information extraction, and information retrieval, including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the World Wide Web and digital libraries; and (iv) evaluation of NLP systems.
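One of the text processing tasks the chapter surveys, extractive summarization, can be illustrated with a minimal frequency-based sketch: score each sentence by the average corpus frequency of its words and keep the top-scoring ones. This is a generic textbook illustration, not an algorithm from the chapter itself.

```python
from collections import Counter

def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences by average word frequency."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    # Document-wide word frequencies, with simple punctuation stripping
    freqs = Counter(w.strip(".,") for w in text.lower().split())

    def score(sentence):
        toks = sentence.lower().split()
        return sum(freqs[t.strip(".,")] for t in toks) / len(toks)

    ranked = sorted(sentences, key=score, reverse=True)
    return ranked[:n_sentences]

text = "Cats sleep a lot. Cats eat fish. Dogs bark."
print(summarize(text))  # → ['Cats eat fish']
```

Real summarization systems of the period added stop-word filtering, positional heuristics, and cue phrases on top of this basic frequency signal.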
Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction, and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges.

Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201
A Large-Scale Comparison of Historical Text Normalization Systems
There is no consensus on the state-of-the-art approach to historical text normalization. Many techniques have been proposed, including rule-based methods, distance metrics, character-based statistical machine translation, and neural encoder-decoder models, but studies have used different datasets and different evaluation methods, and have come to different conclusions. This paper presents the largest study of historical text normalization to date. We critically survey the existing literature and report experiments on eight languages, comparing systems spanning all categories of proposed normalization techniques, analysing the effect of training data quantity, and using different evaluation methods. The datasets and scripts are made publicly available.

Comment: Accepted at NAACL 201