    Grammar induction for mildly context sensitive languages using variational Bayesian inference

    This technical report presents a formal approach to probabilistic minimalist grammar induction. We describe a formalization of a minimalist grammar and, based on this grammar, define a generative model for minimalist derivations. We then present a generalized algorithm for applying variational Bayesian inference to lexicalized, mildly context-sensitive grammars, which we here apply to the previously defined minimalist grammar.
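
    As a point of reference, the core step in variational Bayesian grammar induction of this kind is the mean-field update for Dirichlet-distributed rule weights. The Python sketch below shows that update in isolation; the prior and expected counts are hypothetical, and the minimalist-derivation machinery of the report is omitted.

        import numpy as np
        from scipy.special import digamma

        # Mean-field VB update for Dirichlet-distributed rule weights:
        # q(theta) = Dirichlet(prior + expected rule counts from the E-step).
        def vb_update(prior_alpha, expected_counts):
            """Return E_q[log theta] for each grammar rule."""
            posterior = prior_alpha + expected_counts
            return digamma(posterior) - digamma(posterior.sum())

        alpha = np.full(4, 0.5)                    # sparse symmetric prior over 4 rules
        counts = np.array([10.0, 3.0, 0.5, 0.1])   # hypothetical expected rule counts
        print(np.exp(vb_update(alpha, counts)))    # geometric-mean rule weights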

    Fragment Grammars: Exploring Computation and Reuse in Language

    Language relies on a division of labor between stored units and structure-building operations that combine the stored units into larger structures. This division of labor leads to a tradeoff: more structure-building means less need to store, while more storage means less need to compute structure. We develop a hierarchical Bayesian model called fragment grammar to explore the optimal balance between structure-building and reuse. The model is developed in the context of stochastic functional programming (SFP), in particular using a probabilistic variant of Lisp known as the Church programming language (Goodman, Mansinghka, Roy, Bonawitz, & Tenenbaum, 2008). We show how to formalize several probabilistic models of language structure using Church, and how fragment grammar generalizes one of them: adaptor grammars (Johnson, Griffiths, & Goldwater, 2007). We conclude with experimental data from adults and preliminary evaluations of the model on natural language corpus data.
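
    The compute/reuse tradeoff the model explores rests on stochastic memoization: a cached generator that either reuses a previously built structure or computes a fresh one. A minimal Python sketch of that mechanism (a Chinese-restaurant-process cache, as in adaptor grammars) follows; the base generator and concentration parameter are illustrative, not the paper's model.

        import random

        # Stochastic memoization with a Chinese-restaurant-process cache:
        # reuse a stored value with probability n/(n+alpha), otherwise
        # compute a fresh one from the base generator and store it.
        def crp_memoize(base_sample, alpha=1.0):
            cache = []  # one entry per past use, so reuse is count-weighted
            def sample():
                n = len(cache)
                if n > 0 and random.random() < n / (n + alpha):
                    value = random.choice(cache)  # reuse (storage)
                else:
                    value = base_sample()         # build anew (computation)
                cache.append(value)
                return value
            return sample

        # Example: a reusing generator over hypothetical noun phrases.
        np_gen = crp_memoize(lambda: "the " + random.choice(["dog", "cat", "bird"]))
        print([np_gen() for _ in range(10)])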

    The Stable Entropy Hypothesis and Entropy-Aware Decoding: An Analysis and Algorithm for Robust Natural Language Generation

    State-of-the-art language generation models can degenerate when applied to open-ended generation problems such as text completion, story generation, or dialog modeling. This degeneration usually shows up as incoherence, lack of vocabulary diversity, and self-repetition or copying from the context. In this paper, we postulate that "human-like" generations usually lie in a narrow and nearly flat entropy band, and that violations of these entropy bounds correlate with degenerate behavior. Our experiments show that this stable narrow entropy zone exists across models, tasks, and domains, and confirm the hypothesis that violations of this zone correlate with degeneration. We then use this insight to propose an entropy-aware decoding algorithm that respects these entropy bounds, resulting in less degenerate, more contextual, and more human-like language generation in open-ended text generation settings.
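
    A minimal PyTorch sketch of the decoding idea: measure the entropy of the next-token distribution and intervene only when it leaves the band. The band limits, the top-k sharpening, and the temperature flattening are illustrative choices, not the exact algorithm from the paper.

        import torch
        import torch.nn.functional as F

        def entropy(logits):
            """Shannon entropy (nats) of softmax(logits)."""
            log_p = F.log_softmax(logits, dim=-1)
            return -(log_p.exp() * log_p).sum()

        def entropy_aware_step(logits, lower=2.0, upper=4.0, k=50):
            """Sample one token, correcting entropy-band violations."""
            h = entropy(logits)
            if h < lower:
                # Too peaked (repetition risk): flatten with temperature.
                return torch.multinomial(F.softmax(logits / 1.5, dim=-1), 1)
            if h > upper:
                # Too flat (incoherence risk): sharpen by truncating to top-k.
                top = torch.topk(logits, k)
                return top.indices[torch.multinomial(F.softmax(top.values, dim=-1), 1)]
            return torch.multinomial(F.softmax(logits, dim=-1), 1)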

    Simplicity and learning to distinguish arguments from modifiers

    We present a learnability analysis of the argument-modifier distinction, asking whether there is information in the distribution of English constituents that could allow learners to identify which constituents are arguments and which are modifiers. We first develop a general description of some of the ways in which arguments and modifiers differ in distribution. We then identify two models from the literature that can capture these differences, which we call the argument-only model and the argument-modifier model. We employ these models within a common learning framework based on two simplicity biases that trade off against one another. The first bias favors a small lexicon with highly reusable lexical items; the second, opposing bias favors simple derivations of individual forms, i.e., derivations using small numbers of lexical items. Our first empirical study shows that the argument-modifier model is able to recover the argument-modifier status of many individual constituents when evaluated against a gold standard. This provides evidence in favor of our general account of the distributional differences between arguments and modifiers. It also suggests a lower bound on the amount of information that a suitably equipped learner could use to identify which phrases are arguments or modifiers. We then present a series of analyses investigating how and why the argument-modifier model is able to recover the argument-modifier status of some constituents. In particular, we show that the argument-modifier model provides a simpler description of the input corpus than the argument-only model, both in terms of lexicon size and in terms of the complexity of individual derivations. Intuitively, the argument-modifier model can do this because it is able to ignore spurious modifier structure when learning the lexicon. These analyses further support our general account of the differences between arguments and modifiers, as well as our simplicity-based approach to learning.
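
    The two opposing biases can be read as a description-length tradeoff, sketched below in Python with uniform coding costs: a lexicon of large stored phrases shortens derivations but inflates the lexicon, while a lexicon of small reusable pieces does the reverse. The coding scheme and toy lexicons are hypothetical stand-ins for the probabilistic models in the paper.

        import math

        def description_length(lexicon, derivations):
            """Bits for the lexicon plus each derivation, where a derivation
            is a sequence of pointers into the lexicon (uniform codes)."""
            lex_bits = sum(len(item) for item in lexicon) * math.log2(27)
            ptr_bits = math.log2(len(lexicon))  # cost of one lexical choice
            der_bits = sum(len(d) * ptr_bits for d in derivations)
            return lex_bits + der_bits

        # Large stored chunks: tiny derivations, bloated lexicon.
        big = description_length(
            ["ate the apple", "saw the apple"],
            [["ate the apple"], ["saw the apple"]])
        # Small reusable items: longer derivations, compact lexicon.
        small = description_length(
            ["ate", "saw", "the", "apple"],
            [["ate", "the", "apple"], ["saw", "the", "apple"]])
        print(big, small)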