2,651 research outputs found

    A resource-light approach to morpho-syntactic tagging

    Linguistic Optimization

    Optimality Theory (OT) is a model of language that combines aspects of generative and connectionist linguistics. It is unique in the field in its use of a rank ordering on constraints, which is used to formalize optimization, the choice of the best of a set of potential linguistic forms. We show that phenomena argued to require ranking fall out equally from the form of optimization in OT's predecessor Harmonic Grammar (HG), which uses numerical weights to encode the relative strength of constraints. We further argue that the known problems for HG can be resolved by adopting assumptions about the nature of constraints that have precedents both in OT and elsewhere in computational and generative linguistics. This leads to a formal proof that if the range of each constraint is a bounded number of violations, HG generates a finite number of languages. This is nontrivial, since the set of possible weights for each constraint is nondenumerably infinite. We also briefly review some advantages of HG.
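
    The contrast between OT's strict ranking and HG's numerical weights can be made concrete with a small sketch of HG optimization. The constraint names, weights and violation counts below are illustrative assumptions, not taken from the paper; the sketch only shows the general form of the computation, in which each candidate's harmony is a weighted sum of its violation counts and the candidate with the highest harmony wins.

        # Sketch of Harmonic Grammar optimization with hypothetical weights and candidates.
        weights = {"Onset": 2.0, "NoCoda": 1.0, "Max": 3.0}

        # Violation counts per constraint for each (hypothetical) candidate form.
        candidates = {
            "pat": {"Onset": 0, "NoCoda": 1, "Max": 0},
            "pa":  {"Onset": 0, "NoCoda": 0, "Max": 1},
            "at":  {"Onset": 1, "NoCoda": 1, "Max": 1},
        }

        def harmony(violations):
            """Harmony = negative weighted sum of violations (higher is better)."""
            return -sum(weights[c] * v for c, v in violations.items())

        winner = max(candidates, key=lambda cand: harmony(candidates[cand]))
        print(winner, {cand: harmony(viols) for cand, viols in candidates.items()})

    Unlike strict ranking, weighted summation allows several lower-weighted constraints to jointly outweigh a higher-weighted one, which is the kind of cumulative interaction that distinguishes HG from OT.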

    Corpus-based typology: Applications, challenges and some solutions

    Over the last few years, the number of corpora that can be used for language comparison has dramatically increased. The corpora are so diverse in their structure, size and annotation style that a novice might not know where to start. The present paper charts this new and changing territory, providing a few landmarks, warning signs and safe paths. Although no corpus at present can replace the traditional type of typological data based on language description in reference grammars, corpora can help with diverse tasks, being particularly well suited for investigating probabilistic and gradient properties of languages and for discovering and interpreting cross-linguistic generalizations based on processing and communicative mechanisms. At the same time, the use of corpora for typological purposes has not only advantages and opportunities, but also numerous challenges. This paper also contains an empirical case study addressing two pertinent problems: the role of text types in language comparison and the problem of the word as a comparative concept.

    Emergent Typological Effects of Agent-Based Learning Models in Maximum Entropy Grammar

    This dissertation shows how a theory of grammatical representations and a theory of learning can be combined to generate gradient typological predictions in phonology, predicting not only which patterns are expected to exist, but also their relative frequencies: patterns which are learned more easily are predicted to be more typologically frequent than those which are more difficult. In Chapter 1 I motivate and describe the specific implementation of this methodology in this dissertation. Maximum Entropy grammar (Goldwater & Johnson 2003) is combined with two agent-based learning models, the iterated and the interactive learning model, each of which mimics a type of learning dynamic observed in natural language acquisition. In Chapter 2 I illustrate how this system works using a simplified, abstract example typology, and show how the models generate a bias away from patterns which rely on cumulative constraint interaction (gang effects), and a bias away from variable patterns. Both of these biases match observed trends in natural language typology and psycholinguistic experiments. Chapter 3 further explores the models' bias away from cumulative constraint interaction using an empirical test case: the typology of possible patterns of contrast between two fricatives. This typology yields five possible patterns, the rarest of which is the result of a gang effect. The results of simulations performed with both models produce a bias against the gang effect pattern. Chapter 4 further explores the models' bias away from variation using evidence from artificial grammar learning experiments, in which human participants show a bias away from variable patterns (e.g. Smith & Wonnacott 2010). This test case was chosen additionally to disambiguate between variable behavior within a lexical item (variation) and variable behavior across lexical items (exceptionality). The results of simulations performed with both learning models are consistent with the observed bias away from variable patterns in humans. The results of the iterated and interactive learning models presented in this dissertation provide support for the use of this methodology in investigating the typological predictions of linguistic theories of grammar and learning, as well as in addressing broader questions regarding the source of gradient typological trends, and whether certain properties of natural language must be innately specified, or might emerge through other means.
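
    As a rough illustration of the grammatical component described here, the sketch below evaluates candidates under a Maximum Entropy grammar, where each candidate's probability is proportional to the exponential of its negative weighted violation sum. The weights, candidate names and violation profiles are hypothetical placeholders, and the agent-based learning models themselves (iterated and interactive) are not shown.

        import math

        # Sketch of a Maximum Entropy grammar evaluation with hypothetical weights
        # and violation profiles; the agent-based learning of the weights is omitted.
        weights = [2.0, 1.5]              # one weight per constraint
        candidates = {
            "cand_a": [0, 1],             # violation counts per constraint
            "cand_b": [1, 0],
            "cand_c": [1, 1],
        }

        def maxent_probs(cands, w):
            """P(candidate) proportional to exp(-sum of weighted violations)."""
            scores = {c: math.exp(-sum(wi * vi for wi, vi in zip(w, viols)))
                      for c, viols in cands.items()}
            z = sum(scores.values())      # normalizing constant over the candidate set
            return {c: s / z for c, s in scores.items()}

        print(maxent_probs(candidates, weights))

    Because probabilities depend on summed weights, two moderately weighted constraints can jointly push a candidate's probability down further than either could alone, which is the cumulative (gang) interaction whose learnability the dissertation probes.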

    The entropy of words-learnability and expressivity across more than 1000 languages

    The choice associated with words is a fundamental property of natural languages. It lies at the heart of quantitative linguistics, computational linguistics and language sciences more generally. Information theory gives us tools at hand to measure precisely the average amount of choice associated with words: the word entropy. Here, we use three parallel corpora, encompassing ca. 450 million words in 1916 texts and 1259 languages, to tackle some of the major conceptual and practical problems of word entropy estimation: dependence on text size, register, style and estimation method, as well as non-independence of words in co-text. We present two main findings: Firstly, word entropies display relatively narrow, unimodal distributions. There is no language in our sample with a unigram entropy of less than six bits/word. We argue that this is in line with information-theoretic models of communication. Languages are held in a narrow range by two fundamental pressures: word learnability and word expressivity, with a potential bias towards expressivity. Secondly, there is a strong linear relationship between unigram entropies and entropy rates. The entropy difference between words with and without co-textual information is narrowly distributed around ca. three bits/word. In other words, knowing the preceding text reduces the uncertainty of words by roughly the same amount across languages of the world.
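
    The two quantities compared in the abstract, unigram word entropy and the entropy of words given their co-text, can be illustrated with a toy plug-in estimator. The short text and the bigram-based conditional entropy below are simplifying assumptions; the paper works with large parallel corpora and more careful estimation methods that address dependence on text size and method.

        import math
        from collections import Counter

        # Toy plug-in estimates of unigram word entropy and a bigram conditional
        # entropy (a crude stand-in for the entropy rate); illustrative only.
        text = "the cat sat on the mat and the cat slept".split()

        def unigram_entropy(words):
            """H(w) in bits/word from relative frequencies."""
            counts = Counter(words)
            n = len(words)
            return -sum((c / n) * math.log2(c / n) for c in counts.values())

        def bigram_conditional_entropy(words):
            """H(w_i | w_{i-1}) in bits/word from a plug-in bigram estimate."""
            bigrams = Counter(zip(words, words[1:]))
            contexts = Counter(words[:-1])
            n = len(words) - 1
            h = 0.0
            for (w1, w2), c in bigrams.items():
                h -= (c / n) * math.log2(c / contexts[w1])
            return h

        h_uni = unigram_entropy(text)
        h_cond = bigram_conditional_entropy(text)
        print(f"unigram: {h_uni:.2f} bits/word, conditional: {h_cond:.2f} bits/word, "
              f"reduction: {h_uni - h_cond:.2f} bits/word")

    The difference between the two estimates is the entropy reduction contributed by co-textual information, the quantity the paper finds clustered around roughly three bits/word across languages.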