
    The Effect of Non-tightness on Bayesian Estimation of PCFGs

    Probabilistic context-free grammars have the unusual property of not always defining tight distributions (i.e., the sum of the "probabilities" of the trees the grammar generates can be less than one). This paper reviews how this non-tightness can arise and discusses its impact on Bayesian estimation of PCFGs. We begin by presenting the notion of "almost everywhere tight grammars" and show that linear CFGs satisfy it. We then propose three different ways of reinterpreting non-tight PCFGs to make them tight, show that the Bayesian estimators in Johnson et al. (2007) are correct under one of them, and provide MCMC samplers for the other two. We conclude with an empirical discussion of the impact of tightness.
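    As a standard illustration of non-tightness (a textbook-style example, not one drawn from the paper), consider a hypothetical two-rule grammar with S → S S (probability p) and S → a (probability 1 − p). The total mass Z of finite trees is the least fixed point of Z = pZ² + (1 − p), which equals min(1, (1 − p)/p), so the grammar is non-tight exactly when p > 1/2. A few lines of Python confirm this by fixed-point iteration:

    ```python
    def tree_mass(p, iters=1000):
        # Total probability mass of all finite trees for the grammar:
        #   S -> S S  (prob p)
        #   S -> a    (prob 1 - p)
        # The mass Z is the least fixed point of Z = p*Z^2 + (1 - p),
        # found here by iterating from Z = 0.
        z = 0.0
        for _ in range(iters):
            z = p * z * z + (1 - p)
        return z
    ```

    For p = 0.6 the iteration converges to (1 − p)/p ≈ 0.667, so roughly a third of the probability mass leaks into infinite derivations; for p < 1/2 it converges to 1 and the distribution is tight.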

    Tubular chemical reactors: the "lumping approximation" and bifurcation of oscillatory states

    We study axial heat and mass transfer in a highly diffusive tubular chemical reactor in which a simple reaction is occurring. Singular perturbation techniques are used to derive approximate equations governing the situation. Attention is then focused on the bifurcation of oscillatory states from steady operating characteristics. By means of numerical calculations and phase plane illustrations we follow the bifurcated periodic solution branches along their complete lengths.

    Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization

    We consider the search for a maximum likelihood assignment of hidden derivations and grammar weights for a probabilistic context-free grammar, the problem approximately solved by "Viterbi training." We show that solving and even approximating Viterbi training for PCFGs is NP-hard. We motivate the use of uniform-at-random initialization for Viterbi EM as an optimal initializer in the absence of further information about the correct model parameters, providing an approximate bound on the log-likelihood.

    Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning

    Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply to both the supervised and the unsupervised setting. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we are able to derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization using this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard. We therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk. Learning from data is central to contemporary computational linguistics. It is common in such learning to estimate a model in a parametric family using the maximum likelihood principle. This principle applies in the supervised case (i.e., using annotated data).

    Empirical Risk Minimization with Approximations of Probabilistic Grammars

    Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of the parameters of a fixed probabilistic grammar using the log-loss. We derive sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting.

    Inverse problems connected with two-point boundary value problems

    For the purpose of studying those properties of a nonlinear function f(u) for which the two-point boundary value problem u'' + λf(u) = 0 (0 < x < 1) has a prescribed multiplicity of solutions, the authors construct a number of kinds of special examples. "Inverse" in the title refers to the fact that the multiplicity is specified first and then a suitable function f is constructed.

    Joint Morphological and Syntactic Disambiguation

    In morphologically rich languages, should morphological and syntactic disambiguation be treated sequentially or as a single problem? We describe several efficient, probabilistically interpretable ways to apply joint inference to morphological and syntactic disambiguation using lattice parsing. Joint inference is shown to compare favorably to pipeline parsing methods across a variety of component models. State-of-the-art performance on Hebrew Treebank parsing is demonstrated using the new method. The benefits of joint inference are modest with the current component models, but appear to increase as the components themselves improve.

    Some Positone Problems Suggested by Nonlinear Heat Generation

    There is much current interest in boundary value problems containing positive linear differential operators and monotone functions of the dependent variable; see, for example, M. A. Krasnosel'ski [1] and H. H. Schaefer [2]. We call such problems "positone" and shall examine here a particular class of them (which have been called non-linear eigenvalue problems in [2]).

    Sharp Fronts Due to Diffusion and Viscoelastic Relaxation in Polymers

    A model for sharp fronts in glassy polymers is derived and analyzed. The major effect of a diffusing penetrant on the polymer entanglement network is taken to be the inducement of a differential viscoelastic stress. This couples diffusive and mechanical processes through a viscoelastic response in which the strain depends upon the amount of penetrant present. Analytically, the major effect is to produce explicit delay terms via a relaxation parameter. This accounts for the fundamental difference between a polymer in its rubbery state and the polymer in its glassy state, namely the finite relaxation time in the glassy state due to slow response to changing conditions. Both numerical and analytical perturbation studies of a boundary value problem for a dry glassy polymer exposed to a penetrant solvent are completed. Concentration profiles in good agreement with observations are obtained.

    Feature Selection via Coalitional Game Theory

    We present and study the contribution-selection algorithm (CSA), a novel algorithm for feature selection. The algorithm is based on the multi-perturbation Shapley analysis (MSA), a framework that relies on game theory to estimate the usefulness of features. The algorithm iteratively estimates the usefulness of features and selects them accordingly, using either forward selection or backward elimination. It can optimize various performance measures over unseen data, such as accuracy, balanced error rate, and area under the receiver operating characteristic (ROC) curve. Empirical comparison with several other existing feature selection methods shows that the backward elimination variant of CSA leads to the most accurate classification results on an array of data sets.
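    The backward-elimination loop described above can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: the random-subset marginal-contribution estimate below stands in for the full multi-perturbation Shapley machinery, and all function names are hypothetical.

    ```python
    import random

    def marginal_contribution(score, features, f, samples=20, rng=None):
        # Simplified stand-in for MSA: average the gain in the score
        # from adding feature f to random subsets of the other features.
        rng = rng or random.Random(0)
        others = [g for g in features if g != f]
        total = 0.0
        for _ in range(samples):
            subset = [g for g in others if rng.random() < 0.5]
            total += score(subset + [f]) - score(subset)
        return total / samples

    def backward_elimination(score, features, k):
        # One variant of the CSA loop: repeatedly drop the feature with the
        # smallest estimated contribution until only k features remain.
        feats = list(features)
        while len(feats) > k:
            contribs = {f: marginal_contribution(score, feats, f) for f in feats}
            feats.remove(min(contribs, key=contribs.get))
        return feats
    ```

    With a toy score that rewards only features 0 and 1, the loop strips the uninformative features first; in practice the score would be a classifier's validation accuracy, balanced error rate, or AUC, as in the abstract.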