51 research outputs found

    Benchmarking Compositionality with Formal Languages

    Recombining known primitive concepts into larger novel combinations is a quintessentially human cognitive capability. Whether large neural models in NLP can acquire this ability while learning from data is an open question. In this paper, we investigate this problem from the perspective of formal languages. We use deterministic finite-state transducers to make an unbounded number of datasets with controllable properties governing compositionality. By randomly sampling over many transducers, we explore which of their properties contribute to the learnability of a compositional relation by a neural network. We find that the models either learn the relations completely or not at all. The key factor is transition coverage, which sets a soft learnability limit at 400 examples per transition.
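
    As a hedged sketch of the dataset-generation idea described above (ours, not the authors' code; all names and parameters are illustrative assumptions), one can sample a random deterministic finite-state transducer, transduce random inputs through it, and track how often each transition is exercised:

        import random
        from collections import Counter

        def random_dfst(n_states, in_alpha, out_alpha, seed=0):
            # transitions[state][input symbol] = (next state, output symbol)
            rng = random.Random(seed)
            return {q: {a: (rng.randrange(n_states), rng.choice(out_alpha))
                        for a in in_alpha}
                    for q in range(n_states)}

        def transduce(fst, word):
            q, out = 0, []
            for a in word:
                q, b = fst[q][a]
                out.append(b)
            return "".join(out)

        fst = random_dfst(n_states=4, in_alpha="ab", out_alpha="xy")
        data = [(w, transduce(fst, w))
                for w in ("".join(random.choices("ab", k=8)) for _ in range(1000))]

        # transition coverage: how many examples exercise each (state, symbol)
        # edge -- the quantity the abstract ties to learnability
        coverage = Counter()
        for w, _ in data:
            q = 0
            for a in w:
                coverage[(q, a)] += 1
                q = fst[q][a][0]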

    The computational nature of stress assignment

    While computational studies of stress patterns as phonotactics have yielded restrictive characterizations of stress (Rogers et al., 2013) with provably correct learning procedures (Heinz, 2009), an outstanding question is the nature of stress assignment as a function which assigns stress to an underlying bare string of syllables. This paper fills this gap by locating stress patterns with respect to the subsequential class of functions (Mohri, 1997), which are argued to be important for phonology in that the vast majority of phonological functions fall within the subsequential boundary (Heinz & Lai, 2013; Chandlee, 2014), with the notable exceptions of tone and vowel harmony (Jardine, 2016; McCollum et al., under review). The main result is that, while most, if not all, quantity insensitive (QI) stress systems are subsequential functions, the same does not hold for quantity sensitive (QS) systems. Counter-intuitively, so-called default-to-opposite QS patterns are subsequential, but default-to-same QS patterns are provably not. This result also supports the claim of Jardine (2016) that certain tonal patterns are non-sequential because their suprasegmental nature allows for a more powerful computation. As stress assignment is also suprasegmental, the existence of non-sequential stress functions adds evidence for this conclusion.
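
    To make the default-to-opposite result concrete, here is a hedged toy (ours, not from the paper): a left-to-right computation that stresses the leftmost heavy syllable (H) and otherwise defaults to the rightmost syllable. The bounded output delay of at most one buffered syllable is what keeps such a function subsequential:

        def default_to_opposite(syllables):  # e.g. ["L", "H", "L"]
            out, buffered, stressed = [], None, False
            for syl in syllables:
                if buffered is not None:
                    out.append(buffered)
                if syl == "H" and not stressed:
                    out.append("'H")   # stress leftmost heavy, emit immediately
                    stressed, buffered = True, None
                else:
                    buffered = syl     # delay: this might be the final syllable
            if buffered is not None:
                # end of input: if nothing was stressed, default falls on the last syllable
                out.append(buffered if stressed else "'" + buffered)
            return out

        print(default_to_opposite(["L", "H", "L"]))  # ["L", "'H", "L"]
        print(default_to_opposite(["L", "L", "L"]))  # ["L", "L", "'L"]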

    Strict Locality and Phonological Maps

    Experiments using semantics for learning language comprehension and production

    Several questions in natural language learning may be addressed by studying formal language learning models. In this work we hope to contribute to a deeper understanding of the role of semantics in language acquisition. We propose a simple formal model of meaning and denotation using finite state transducers, and an algorithm that learns a meaning function from examples consisting of a situation and an utterance denoting something in the situation. We describe the results of testing this algorithm in a domain of geometric shapes and their properties and relations in several natural languages: Arabic, English, Greek, Hebrew, Hindi, Mandarin, Russian, Spanish, and Turkish. In addition, we explore how a learner who has learned to comprehend utterances might go about learning to produce them, and present experimental results for this task. One concrete goal of our formal model is to be able to give an account of interactions in which an adult provides a meaning-preserving and grammatically correct expansion of a child's incomplete utterance.
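
    The paper's meaning-function learner is not reproduced here; as a loosely related toy (an assumption on our part, not the authors' algorithm), word-to-property associations can be read off co-occurrence counts over (situation, utterance) pairs:

        from collections import Counter, defaultdict

        def learn_word_meanings(pairs):
            counts = defaultdict(Counter)  # word -> property co-occurrence counts
            for situation, utterance in pairs:
                for word in utterance.split():
                    counts[word].update(situation)
            # guess each word's denotation as its most frequent co-occurring property
            return {w: c.most_common(1)[0][0] for w, c in counts.items()}

        pairs = [
            ({"circle", "red"},  "the red circle"),
            ({"square", "red"},  "the red square"),
            ({"circle", "blue"}, "the blue circle"),
        ]
        # content words resolve correctly; function words like "the" come out
        # arbitrary in this toy
        print(learn_word_meanings(pairs))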

    Learning Local Phonological Processes

    We present a learning algorithm for local phonological processes that relies on a restriction on the expressive power needed to compute phonological patterns that apply locally. Representing phonological processes as a functional mapping from an input to an output form (an assumption compatible with either the SPE or OT formalism), the learner assumes the target process can be described with the functional counterpart to the Strictly Local formal languages (McNaughton and Papert 1971, Rogers and Pullum 2011). Given a data set of input-output string pairs, the learner applies a two-stage grammatical induction procedure: 1) constructing a prefix tree representation of the input and 2) generalizing the pattern to words not found in the data set by merging states (Garcia and Vidal 1990, Oncina et al. 1993, Heinz 2007, 2009, de la Higuera 2010). The learner's criterion for state merging enforces a locality requirement on the kind of function it can converge to and thereby directly reflects its own hypothesis space. We demonstrate the algorithm on German final devoicing, using a corpus of string pairs derived from the CELEX2 lemma corpus. The implications of our results include a proposal for how humans generalize to learn phonological patterns and a consequent explanation for why local phonological patterns have this property.
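
    A minimal sketch of the first stage only (prefix-tree construction), with hypothetical underlying-to-surface pairs standing in for the CELEX2-derived corpus; the state-merging stage and the learner's locality criterion are omitted:

        def build_prefix_tree(pairs):
            # Initial prefix-tree transducer: each state is an input prefix,
            # edges emit nothing yet, and the whole output string sits at the
            # final state; OSTIA-style learners then push common output
            # prefixes rootward before merging states.
            edges, finals = {}, {}
            for inp, outp in pairs:
                for i, a in enumerate(inp):
                    edges.setdefault(inp[:i], {})[a] = inp[:i + 1]
                finals[inp] = outp
            return edges, finals

        # hypothetical underlying -> surface pairs (final devoicing)
        pairs = [("rad", "rat"), ("rade", "rade"), ("bund", "bunt"), ("bunde", "bunde")]
        edges, finals = build_prefix_tree(pairs)
        # edges[""]["r"] == "r"; finals["rad"] == "rat"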

    An Algebraic Characterization of Total Input Strictly Local Functions

    This paper provides an algebraic characterization of the total input strictly local functions. Simultaneous, noniterative rules of the form A → B / C _ D, common in phonology, are definable as functions in this class whenever CAD represents a finite set of strings. The algebraic characterization highlights a fundamental connection between input strictly local functions and the simple class of definite string languages, as well as connections to string functions studied in the computer science literature, the definite functions and local functions. No effective decision procedure for the input strictly local maps was previously available, but one arises directly from this characterization. This work also shows that, unlike the full class, a restricted subclass is closed under composition. Additionally, some products are defined which may yield new factorization methods.
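
    As a hedged illustration (ours, not the paper's), a simultaneous, noniterative rule A → B / C _ D can be applied with regular-expression lookaround, which locates every target against the original string before any rewriting, so the rule cannot feed itself:

        import re

        def apply_simultaneous(a, b, left, right, s):
            # rewrite a -> b between left and right contexts, simultaneously
            pattern = f"(?<={re.escape(left)}){re.escape(a)}(?={re.escape(right)})"
            return re.sub(pattern, b, s)

        # toy rule: n -> m / a _ a
        print(apply_simultaneous("n", "m", "a", "a", "anana"))  # prints "amama"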

    Learning Phonological Mappings by Learning Strictly Local Functions

    In this paper we identify strict locality as a defining computational property of the input-output mapping that underlies local phonological processes. We provide an automata-theoretic characterization of the class of Strictly Local functions, which are based on the well-studied Strictly Local formal languages (McNaughton & Papert 1971; Rogers & Pullum 2011; Rogers et al. 2013), and show how they can model a range of phonological processes. We then present a learning algorithm, the SLFLA, which uses the defining property of strict locality as an inductive principle to learn these mappings from finite data. The algorithm is a modification of OSTIA, an algorithm developed by Oncina et al. (1993) for learning the class of subsequential functions, of which the SL functions are a proper subset. We provide a proof that the SLFLA learns the class of SL functions and discuss these results alongside previous studies that used OSTIA to learn phonological mappings (Gildea and Jurafsky 1996).
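
    As a hedged toy illustrating the defining property (ours, not the SLFLA itself): an input strictly local function of width k writes each output symbol by consulting only the current input symbol and the k-1 input symbols before it:

        def isl_apply(table, k, s, pad="#"):
            # pad the left edge so the first symbols have a full window
            padded = pad * (k - 1) + s
            # each output symbol depends only on the k-wide input window ending
            # at the current symbol; unlisted windows map the symbol to itself
            return "".join(table.get(padded[i:i + k], padded[i + k - 1])
                           for i in range(len(s)))

        # width-2 table for a toy postnasal voicing map: t -> d after n
        print(isl_apply({"nt": "d"}, 2, "anta"))  # prints "anda"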

    Computational Locality in Morphological Maps

    Inducing Probabilistic Grammars by Bayesian Model Merging

    We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad-hoc rules to a working grammar; subsequently, elements of the model (such as states or nonterminals) are merged to achieve generalization and a more compact representation. The choice of what to merge and when to stop is governed by the Bayesian posterior probability of the grammar given the data, which formalizes a trade-off between a close fit to the data and a default preference for simpler models ('Occam's Razor'). The general scheme is illustrated using three types of probabilistic grammars: hidden Markov models, class-based n-grams, and stochastic context-free grammars. (To appear in Grammatical Inference and Applications, Second International Colloquium on Grammatical Inference; Springer Verlag, 1994. 13 pages.)
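
    A compressed sketch of the merging loop under heavy simplifying assumptions of ours: samples have already been incorporated as fixed state paths (a Viterbi-style approximation), the likelihood comes from transition counts alone, and a flat per-state penalty stands in for the paper's structural priors:

        import math
        from collections import Counter
        from itertools import combinations

        def log_posterior(trans, penalty=3.0):
            # log prior: a flat per-state penalty ('Occam's Razor' stand-in)
            states = {q for edge in trans for q in edge}
            lp = -penalty * len(states)
            # log likelihood of the fixed paths from multinomial transition counts
            totals = Counter()
            for (src, _), c in trans.items():
                totals[src] += c
            for (src, _), c in trans.items():
                lp += c * math.log(c / totals[src])
            return lp

        def merge_states(trans, a, b):
            # relabel state b as a and pool the counts
            merged = Counter()
            for (src, dst), c in trans.items():
                merged[(a if src == b else src, a if dst == b else dst)] += c
            return merged

        def model_merge(trans):
            trans = Counter(trans)
            best = log_posterior(trans)
            while True:
                states = sorted({q for edge in trans for q in edge})
                scored = [(log_posterior(merge_states(trans, a, b)), (a, b))
                          for a, b in combinations(states, 2)]
                if not scored:
                    return trans
                lp, (a, b) = max(scored)
                if lp <= best:  # stop: no merge raises the posterior
                    return trans
                best, trans = lp, merge_states(trans, a, b)

        # toy: transition counts from two incorporated sample paths
        print(model_merge({(0, 1): 5, (1, 2): 5, (3, 4): 5, (4, 5): 5}))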