A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars
The paper gives a brief review of the expectation-maximization algorithm
(Dempster 1977) in the comprehensible framework of discrete mathematics. In
Section 2, two prominent estimation methods, the relative-frequency estimation
and the maximum-likelihood estimation are presented. Section 3 is dedicated to
the expectation-maximization algorithm and a simpler variant, the generalized
expectation-maximization algorithm. In Section 4, two loaded dice are rolled. A
more interesting example is presented in Section 5: The estimation of
probabilistic context-free grammars. Comment: Presented at the 15th European Summer School in Logic, Language and
Information (ESSLLI 2003). Example 5 extended (and partially corrected
Predicting the Position of Attributive Adjectives in the French NP
Revised version of the article published in the Student Session of the European Summer School for Logic, Language and Information, Copenhagen, Denmark (2010). This article proposes a quantitative study of the placement alternation for the adjective within the noun phrase in French. Taking the hypothesis that position constraints are mostly preferential as a starting point, we develop a methodology based on statistical inference in order to provide a formal account of the relative importance of different groups of constraints. Results show the relative importance of lexical constraints and that frequency-based and length constraints are the best predictors. This suggests that the placement of adjectives not only depends on our knowledge of lexical items but also on the knowledge of the way in which we use them in discourse, i.e. on usage
An Efficient Distribution of Labor in a Two Stage Robust Interpretation Process
Although Minimum Distance Parsing (MDP) offers a theoretically attractive
solution to the problem of extragrammaticality, it is often computationally
infeasible in large scale practical applications. In this paper we present an
alternative approach where the labor is distributed between a more restrictive
partial parser and a repair module. Though two stage approaches have grown in
popularity in recent years because of their efficiency, they have done so at
the cost of requiring hand coded repair heuristics. In contrast, our two stage
approach does not require any hand coded knowledge sources dedicated to repair,
thus making it possible to achieve a similar run time advantage over MDP
without losing the quality of domain independence. Comment: 9 pages, 1 Postscript figure, uses aclap.sty and psfig.tex, In
Proceedings of EMNLP 199
Comparing and evaluating extended Lambek calculi
Lambek's Syntactic Calculus, commonly referred to as the Lambek calculus, was
innovative in many ways, notably as a precursor of linear logic. But it also
showed that we could treat our grammatical framework as a logic (as opposed to
a logical theory). However, though it was successful in giving at least a basic
treatment of many linguistic phenomena, it was also clear that a slightly more
expressive logical calculus was needed for many other cases. Therefore, many
extensions and variants of the Lambek calculus have been proposed, since the
eighties and up until the present day. As a result, there is now a large class
of calculi, each with its own empirical successes and theoretical results, but
also each with its own logical primitives. This raises the question: how do we
compare and evaluate these different logical formalisms? To answer this
question, I present two unifying frameworks for these extended Lambek calculi.
Both are proof net calculi with graph contraction criteria. The first calculus
is a very general system: you specify the structure of your sequents and it
gives you the connectives and contractions which correspond to it. The calculus
can be extended with structural rules, which translate directly into graph
rewrite rules. The second calculus is first-order (multiplicative
intuitionistic) linear logic, which turns out to have several other,
independently proposed extensions of the Lambek calculus as fragments. I will
illustrate the use of each calculus in building bridges between analyses
proposed in different frameworks, in highlighting differences and in helping to
identify problems. Comment: Empirical advances in categorial grammars, Aug 2015, Barcelona,
Spain. 201
Forgetting complex propositions
This paper uses possible-world semantics to model the changes that may occur
in an agent's knowledge as she loses information. This builds on previous work
in which the agent may forget the truth-value of an atomic proposition, to a
more general case where she may forget the truth-value of a propositional
formula. The generalization poses some challenges, since in order to forget
whether a complex proposition is the case, the agent must also lose
information about the propositional atoms that appear in it, and there is no
unambiguous way to go about this.
We resolve this situation by considering expressions of the form
, which quantify over all possible (but
minimal) ways of forgetting whether . Propositional atoms are modified
non-deterministically, although uniformly, in all possible worlds. We then
represent this within action model logic in order to give a sound and complete
axiomatization for a logic with knowledge and forgetting. Finally, some
variants are discussed, such as when an agent forgets (rather than
forgets whether ) and when the modification of atomic facts is done
non-uniformly throughout the model
Complexity of Grammar Induction for Quantum Types
Most categorical models of meaning use a functor from the syntactic category
to the semantic category. When semantic information is available, the problem
of grammar induction can therefore be defined as finding preimages of the
semantic types under this forgetful functor, lifting the information flow from
the semantic level to a valid reduction at the syntactic level. We study the
complexity of grammar induction, and show that for a variety of type systems,
including pivotal and compact closed categories, the grammar induction problem
is NP-complete. Our approach could be extended to linguistic type systems such
as autonomous or bi-closed categories. Comment: In Proceedings QPL 2014, arXiv:1412.810