2 research outputs found

    Multilingual Part-of-Speech Tagging: Two Unsupervised Approaches

    Full text link
    We demonstrate the effectiveness of multilingual learning for unsupervised part-of-speech tagging. The central assumption of our work is that by combining cues from multiple languages, the structure of each becomes more apparent. We consider two ways of applying this intuition to the problem of unsupervised part-of-speech tagging: a model that directly merges tag structures for a pair of languages into a single sequence and a second model which instead incorporates multilingual context using latent variables. Both approaches are formulated as hierarchical Bayesian models, using Markov Chain Monte Carlo sampling techniques for inference. Our results demonstrate that by incorporating multilingual evidence we can achieve impressive performance gains across a range of scenarios. We also found that performance improves steadily as the number of available languages increases

    Modeling second language learners' interlanguage and its variability: a computer-based dynamic assessment approach to distinguishing between errors and mistakes

    Get PDF
    Despite a long history, interlanguage variability research is a debatable topic as most paradigms do not distinguish between competence and performance. While interlanguage performance has been proven to be variable, determining whether interlanguage competence is exposed to random and/or systematic variations is complex, given the fact that distinction between competence-dependent errors and performance-related mistakes should be established to best represent the interlanguage competence. This thesis suggests a dynamic assessment model grounded in sociocultural theory to distinguish between errors and mistakes in texts written by learners of French, to then investigate the extent to which interlanguage competence varies across time, text types, and students. The key outcomes include: 1. An expanded model based on dynamic assessment principles to distinguish between errors and mistakes, which also provides the structure to create and observe learners’ zone of proximal development; 2. A method to increase the accuracy of the part-of-speech tagging procedure whose reliability correlates with the number of incorrect words contained in learners’ texts; 3. A sociocultural insight into interlanguage variability research. Results demonstrate that interlanguage competence is as variable as performance. The main finding shows that knowledge over time is subject to not only systematic, but also unsystematic variations
    corecore