Exploring Automated Essay Scoring for Nonnative English Speakers
Automated Essay Scoring (AES) has become popular and is widely
used. However, the lack of an appropriate methodology for rating nonnative English
speakers' essays has meant a lopsided advancement in this field. In this paper,
we report initial results of our experiments with nonnative AES that learns
from manual evaluation of nonnative essays. For this purpose, we conducted an
exercise in which essays written by nonnative English speakers in a test
environment were rated both manually and by the automated system designed for
the experiment. In the process, we experimented with a few features to learn
about nuances linked to nonnative evaluation. The proposed methodology of
automated essay evaluation has yielded a correlation coefficient of 0.750 with
the manual evaluation.
Comment: Accepted for publication at EUROPHRAS 201
Networking Phylogeny for Indo-European and Austronesian Languages
Harnessing the cognitive abilities of many individuals, a language evolves through their mutual interactions, establishing a persistent social environment to which the language is closely attuned. Human history is encoded in rich sets of linguistic data by means of symmetry patterns that cannot always be feasibly represented by trees. Here we use methods developed in the study of complex networks to accurately decipher symmetry records on the language phylogeny of the Indo-European and Austronesian language families, considering, in both cases, samples of fifty different languages. In particular, we support the Anatolian theory of Indo-European origin and the ‘express train’ model of Austronesian expansion from South-East Asia, with an essential role for the Batanes islands located between the Philippines and Taiwan.
Experimental Support for a Categorical Compositional Distributional Model of Meaning
Modelling compositional meaning for sentences using empirical distributional
methods has been a challenge for computational linguists. We implement the
abstract categorical model of Coecke et al. (arXiv:1003.4394v1 [cs.CL]) using
data from the BNC and evaluate it. The implementation is based on unsupervised
learning of matrices for relational words and applying them to the vectors of
their arguments. The evaluation is based on the word disambiguation task
developed by Mitchell and Lapata (2008) for intransitive sentences, and on a
similar new experiment designed for transitive sentences. Our model matches the
results of its competitors in the first experiment, and betters them in the
second. The general improvement in results with increase in syntactic
complexity showcases the compositional power of our model.
Comment: 11 pages, to be presented at EMNLP 2011, to be published in
Proceedings of the 2011 Conference on Empirical Methods in Natural Language
Processin
Children like dense neighborhoods: orthographic neighborhood density effects in novel readers
Previous evidence with English beginning readers suggests that some orthographic effects, such as orthographic neighborhood density effects, could be stronger for children than for adults. In particular, children respond more accurately to words with many orthographic neighbors than to words with few neighbors. The magnitude of the effects is much higher for children than for adults, and some researchers have proposed that these effects could be progressively modulated according to reading expertise. The present paper explores in depth how children from 1st to 6th grade perform a lexical decision task with words from dense or sparse orthographic neighborhoods, attending not only to accuracy measures but also to response latencies, through a computer-controlled task. Our results reveal that children (like adults) show clear neighborhood density effects, and that these effects do not seem to depend on reading expertise. Contrary to previous claims, the present work shows that orthographic neighborhood effects are not progressively modulated by reading skill. Further, these data strongly support the idea of a general language-independent preference for using the lexical route instead of grapheme-to-phoneme conversions, even in beginning readers. The implications of these results for developmental models of reading and for models of visual word recognition and orthographic encoding are discussed.
Examining the acquisition of phonological word-forms with computational experiments
This is the author's accepted manuscript. The original publication is available at http://las.sagepub.com/content/early/2012/10/21/0023830912460513.full.pdf
It has been hypothesized that known words in the lexicon strengthen newly formed representations of novel words, resulting in words with dense neighborhoods being learned more quickly than words with sparse neighborhoods. Tests of this hypothesis in a connectionist network showed that words with dense neighborhoods were learned better than words with sparse neighborhoods when the network was exposed to the words all at once (Experiment 1), or gradually over time, like human word-learners (Experiment 2). This pattern was also observed despite variation in the availability of processing resources in the networks (Experiment 3). A learning advantage for words with sparse neighborhoods was observed only when the network was initially exposed to words with sparse neighborhoods and exposed to dense neighborhoods later in training (Experiment 4). The benefits of computational experiments for increasing our understanding of language processes and for the treatment of language processing disorders are discussed.