4,424 research outputs found

    Exploring Automated Essay Scoring for Nonnative English Speakers

    Automated Essay Scoring (AES) has been quite popular and is being widely used. However, the lack of an appropriate methodology for rating nonnative English speakers' essays has meant a lopsided advancement in this field. In this paper, we report initial results of our experiments with a nonnative AES system that learns from manual evaluation of nonnative essays. For this purpose, we conducted an exercise in which essays written by nonnative English speakers in a test environment were rated both manually and by the automated system designed for the experiment. In the process, we experimented with a few features to learn about nuances linked to nonnative evaluation. The proposed methodology of automated essay evaluation has yielded a correlation coefficient of 0.750 with the manual evaluation. Comment: Accepted for publication at EUROPHRAS 201

    Investigating formulaic language as a marker of Authorship

    Networking Phylogeny for Indo-European and Austronesian Languages

    Harnessing cognitive abilities of many individuals, a language evolves upon their mutual interactions establishing a persistent social environment to which language is closely attuned. Human history is encoded in the rich sets of linguistic data by means of symmetry patterns that are not always feasibly represented by trees. Here we use the methods developed in the study of complex networks to decipher accurately symmetry records on the language phylogeny of the Indo-European and the Austronesian language families, considering, in both cases, the samples of fifty different languages. In particular, we support the Anatolian theory of Indo-European origin and the ‘express train’ model of Austronesian expansion from South-East Asia, with an essential role for the Batanes islands located between the Philippines and Taiwan.

    Experimental Support for a Categorical Compositional Distributional Model of Meaning

    Modelling compositional meaning for sentences using empirical distributional methods has been a challenge for computational linguists. We implement the abstract categorical model of Coecke et al. (arXiv:1003.4394v1 [cs.CL]) using data from the BNC and evaluate it. The implementation is based on unsupervised learning of matrices for relational words and applying them to the vectors of their arguments. The evaluation is based on the word disambiguation task developed by Mitchell and Lapata (2008) for intransitive sentences, and on a similar new experiment designed for transitive sentences. Our model matches the results of its competitors in the first experiment, and betters them in the second. The general improvement in results with increase in syntactic complexity showcases the compositional power of our model. Comment: 11 pages, to be presented at EMNLP 2011, to be published in Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

    The Study of Text with Reference to Spanish

    Children like dense neighborhoods: orthographic neighborhood density effects in novel readers

    Previous evidence with English beginning readers suggests that some orthographic effects, such as the orthographic neighborhood density effects, could be stronger for children than for adults. Particularly, children respond more accurately to words with many orthographic neighbors than to words with few neighbors. The magnitude of the effects for children is much higher than for adults, and some researchers have proposed that these effects could be progressively modulated according to reading expertise. The present paper explores in depth how children from 1st to 6th grade perform a lexical decision with words that are from dense or sparse orthographic neighborhoods, attending not only to accuracy measures, but also to response latencies, through a computer-controlled task. Our results reveal that children (like adults) show clear neighborhood density effects, and that these effects do not seem to depend on reading expertise. Contrary to previous claims, the present work shows that orthographic neighborhood effects are not progressively modulated by reading skill. Further, these data strongly support the idea of a general language-independent preference for using the lexical route instead of grapheme-to-phoneme conversions, even in beginning readers. The implications of these results for developmental models in reading and for models in visual word recognition and orthographic encoding are discussed.

    Examining the acquisition of phonological word-forms with computational experiments

    This is the author's accepted manuscript. The original publication is available at http://las.sagepub.com/content/early/2012/10/21/0023830912460513.full.pdf

    It has been hypothesized that known words in the lexicon strengthen newly formed representations of novel words, resulting in words with dense neighborhoods being learned more quickly than words with sparse neighborhoods. Tests of this hypothesis in a connectionist network showed that words with dense neighborhoods were learned better than words with sparse neighborhoods when the network was exposed to the words all at once (Experiment 1), or gradually over time, like human word-learners (Experiment 2). This pattern was also observed despite variation in the availability of processing resources in the networks (Experiment 3). A learning advantage for words with sparse neighborhoods was observed only when the network was initially exposed to words with sparse neighborhoods and exposed to dense neighborhoods later in training (Experiment 4). The benefits of computational experiments for increasing our understanding of language processes and for the treatment of language processing disorders are discussed.