3 research outputs found
Frequency vs. Association for Constraint Selection in Usage-Based Construction Grammar
A usage-based Construction Grammar (CxG) posits that slot-constraints
generalize from common exemplar constructions. But what is the best model of
constraint generalization? This paper evaluates competing frequency-based and
association-based models across eight languages using a metric derived from the
Minimum Description Length paradigm. The experiments show that
association-based models produce better generalizations across all languages by
a significant margin
Generative Models of Monolingual and Bilingual Gappy Patterns
A growing body of machine translation research aims to exploit lexical patterns (e.g., ngrams and phrase pairs) with gaps (Simard et al., 2005; Chiang, 2005; Xiong et al., 2011). Typically, these “gappy patterns” are discovered using heuristics based on word alignments or local statistics such as mutual information. In this paper, we develop generative models of monolingual and parallel text that build sentences using gappy patterns of arbitrary length and with arbitrarily many gaps. We exploit Bayesian nonparametrics and collapsed Gibbs sampling to discover salient patterns in a corpus. We evaluate the patterns qualitatively and also add them as features to an MT system, reporting promising preliminary results.</p