Paracompositionality, MWEs and Argument Substitution
Multi-word expressions, verb-particle constructions, idiomatically combining
phrases, and phrasal idioms have something in common: not all of their elements
contribute to the argument structure of the predicate implicated by the
expression.
Radically lexicalized theories of grammar that avoid string-, term-, logical
form-, and tree-writing, and categorial grammars that avoid the wrap operation,
make predictions about the categories involved in verb-particles and phrasal
idioms. They may require singleton types, which can only substitute for one
value, not just for one kind of value. These types are asymmetric: they can be
arguments only. They also narrowly constrain the kind of semantic value that
can correspond to such syntactic categories. Idiomatically combining phrases do
not subcategorize for singleton types, and they exploit another locally
computable and compositional property of a correspondence, that every syntactic
expression can project its head word. Such MWEs can be seen as empirically
realized categorial possibilities, rather than lacunae in a theory of
lexicalizable syntactic categories.
Comment: accepted version (pre-final) for 23rd Formal Grammar Conference, August 2018, Sofi
Anchoring a Lexicalized Tree-Adjoining Grammar for Discourse
We here explore a ``fully'' lexicalized Tree-Adjoining Grammar for discourse
that takes the basic elements of a (monologic) discourse to be not simply
clauses, but larger structures that are anchored on variously realized
discourse cues. This link with intra-sentential grammar suggests an account for
different patterns of discourse cues, while the different structures and
operations suggest three separate sources for elements of discourse meaning:
(1) a compositional semantics tied to the basic trees and operations; (2) a
presuppositional semantics carried by cue phrases that freely adjoin to trees;
and (3) general inference, that draws additional, defeasible conclusions that
flesh out what is conveyed compositionally.
Comment: 7 pages, uses aclcol.st
Textual Economy through Close Coupling of Syntax and Semantics
We focus on the production of efficient descriptions of objects, actions and
events. We define a type of efficiency, textual economy, that exploits the
hearer's recognition of inferential links to material elsewhere within a
sentence. Textual economy leads to efficient descriptions because the material
that supports such inferences has been included to satisfy independent
communicative goals, and is therefore overloaded in Pollack's sense. We argue
that achieving textual economy imposes strong requirements on the
representation and reasoning used in generating sentences. The representation
must support the generator's simultaneous consideration of syntax and
semantics. Reasoning must enable the generator to assess quickly and reliably
at any stage how the hearer will interpret the current sentence, with its
(incomplete) syntax and semantics. We show that these representational and
reasoning requirements are met in the SPUD system for sentence planning and
realization.
Comment: 10 pages, uses QobiTree.te
Learning to Disambiguate Syntactic Relations
Many extensions to text-based, data-intensive knowledge management approaches, such as Information Retrieval or Data Mining, focus on integrating the impressive recent advances in language technology. For this, they need fast, robust parsers that deliver linguistic data which is meaningful for the subsequent processing stages. This paper introduces such a parsing system and discusses some of its disambiguation techniques which are based on learning from a large syntactically annotated corpus.
The paper is organized as follows. Section 2 explains the motivations for writing the parser, and why it profits from Dependency grammar assumptions. Section 3 gives a brief introduction to the parsing system and to evaluation questions. Section 4 presents the probabilistic models and the experiments conducted in detail.
A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena
Word reordering is one of the most difficult aspects of statistical machine
translation (SMT), and an important factor of its quality and efficiency.
Despite the vast amount of research published to date, the interest of the
community in this problem has not decreased, and no single method appears to be
strongly dominant across language pairs. Instead, the choice of the optimal
approach for a new translation task still seems to be mostly driven by
empirical trials. To orientate the reader in this vast and complex research
area, we present a comprehensive survey of word reordering viewed as a
statistical modeling challenge and as a natural language phenomenon. The survey
describes in detail how word reordering is modeled within different
string-based and tree-based SMT frameworks and as a stand-alone task, including
systematic overviews of the literature in advanced reordering modeling. We then
question why some approaches are more successful than others in different
language pairs. We argue that, besides measuring the amount of reordering, it
is important to understand which kinds of reordering occur in a given language
pair. To this end, we conduct a qualitative analysis of word reordering
phenomena in a diverse sample of language pairs, based on a large collection of
linguistic knowledge. Empirical results in the SMT literature are shown to
support the hypothesis that a few linguistic facts can be very useful to
anticipate the reordering characteristics of a language pair and to select the
SMT framework that best suits them.
Comment: 44 pages, to appear in Computational Linguistic
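The survey distinguishes measuring the *amount* of reordering from identifying the *kinds* of reordering in a language pair. A common proxy for the amount of reordering (an assumption here, not a method named in this abstract) is the Kendall tau distance over the permutation of source positions induced by a word alignment — a minimal sketch:

```python
def kendall_tau_distance(perm):
    """Count inverted pairs in a permutation of source positions
    (0 means the translation preserves source word order)."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if perm[i] > perm[j])

def reordering_amount(perm):
    """Normalize the inversion count to [0, 1] by the number of pairs."""
    n = len(perm)
    if n < 2:
        return 0.0
    return kendall_tau_distance(perm) / (n * (n - 1) / 2)
```

For example, a monotone alignment `[0, 1, 2, 3]` scores 0.0, while a fully inverted one `[3, 2, 1, 0]` scores 1.0 — language pairs such as English-Japanese tend toward the latter end of the scale.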
Learning to Disambiguate Syntactic Relations
Natural Language is highly ambiguous, on every level. This article describes a fast broad-coverage state-of-the-art parser that uses a carefully hand-written grammar and probability-based machine learning approaches on the syntactic level. It is shown in detail which statistical learning models based on Maximum-Likelihood Estimation (MLE) can support a highly developed linguistic grammar in the disambiguation process.
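The MLE models the abstract refers to amount to relative-frequency estimates over an annotated corpus. As an illustrative sketch only (the toy observations and the PP-attachment framing are assumptions, not the article's actual models), an attachment preference can be estimated as a conditional relative frequency:

```python
from collections import Counter

# Hypothetical treebank observations: (preposition, attachment site).
observations = [("with", "noun"), ("with", "verb"), ("with", "verb"),
                ("of", "noun"), ("of", "noun"), ("on", "verb")]

pair_counts = Counter(observations)
prep_counts = Counter(prep for prep, _ in observations)

def mle_attachment_prob(prep, site):
    """Relative-frequency (MLE) estimate of P(site | prep):
    count(prep, site) / count(prep)."""
    if prep_counts[prep] == 0:
        return 0.0
    return pair_counts[(prep, site)] / prep_counts[prep]
```

During disambiguation the parser would prefer the reading whose estimated probability is higher; unseen conditioning events are the usual weakness of plain MLE and typically call for smoothing.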
A Computational Cognitive Model of Syntactic Priming
The psycholinguistic literature has identified two syntactic adaptation effects in language production: rapidly decaying short-term priming and long-lasting adaptation. To explain both effects, we present an ACT-R model of syntactic priming based on a wide-coverage, lexicalized syntactic theory that explains priming as facilitation of lexical access. In this model, two well-established ACT-R mechanisms, base-level learning and spreading activation, account for long-term adaptation and short-term priming, respectively. Our model simulates incremental language production and in a series of modeling studies we show that it accounts for (a) the inverse frequency interaction; (b) the absence of a decay in long-term priming; and (c) the cumulativity of long-term adaptation. The model also explains the lexical boost effect and the fact that it only applies to short-term priming. We also present corpus data that verifies a prediction of the model, i.e., that the lexical boost affects all lexical material, rather than just heads.
Keywords: syntactic priming, adaptation, cognitive architectures, ACT-R, categorial grammar, incrementality
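The base-level learning mechanism the abstract invokes is ACT-R's standard activation equation, B = ln(Σⱼ tⱼ⁻ᵈ), where tⱼ is the time since the j-th presentation of a chunk and d is the decay rate (conventionally 0.5). A minimal sketch of how this yields both fast-decaying priming and cumulative long-term adaptation:

```python
import math

def base_level_activation(presentation_times, now, decay=0.5):
    """ACT-R base-level learning: B = ln(sum_j (now - t_j)^-decay).
    Recent or frequent presentations raise activation; each
    contribution decays as a power law of its age."""
    return math.log(sum((now - t) ** -decay for t in presentation_times))
```

A presentation 0.1 time units ago contributes far more activation than one 5 units ago (short-term priming), while many old presentations still sum to a higher baseline than few (long-term adaptation). The equation is standard ACT-R; its use here as a two-line function is only an illustration, not the paper's full production model.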
An Efficient Implementation of the Head-Corner Parser
This paper describes an efficient and robust implementation of a
bi-directional, head-driven parser for constraint-based grammars. This parser
is developed for the OVIS system: a Dutch spoken dialogue system in which
information about public transport can be obtained by telephone.
After a review of the motivation for head-driven parsing strategies, and
head-corner parsing in particular, a non-deterministic version of the
head-corner parser is presented. A memoization technique is applied to obtain a
fast parser. A goal-weakening technique is introduced which greatly improves
average case efficiency, both in terms of speed and space requirements.
I argue in favor of such a memoization strategy with goal-weakening in
comparison with ordinary chart-parsers because such a strategy can be applied
selectively and therefore enormously reduces the space requirements of the
parser, while no practical loss in time-efficiency is observed. On the
contrary, experiments are described in which head-corner and left-corner
parsers implemented with selective memoization and goal weakening outperform
`standard' chart parsers. The experiments include the grammar of the OVIS
system and the Alvey NL Tools grammar.
Head-corner parsing is a mix of bottom-up and top-down processing. Certain
approaches towards robust parsing require purely bottom-up processing.
Therefore, it seems that head-corner parsing is unsuitable for such robust
parsing techniques. However, it is shown how underspecification (which arises
very naturally in a logic programming environment) can be used in the
head-corner parser to allow such robust parsing techniques. A particular robust
parsing model is described which is implemented in OVIS.
Comment: 31 pages, uses cl.st
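The memoization strategy at the heart of the paper caches parse goals so that the same (category, span) question is never solved twice. The following sketch illustrates the idea with a plain memoized top-down recognizer over a hypothetical toy grammar — it is not the head-corner algorithm itself, and the selective, goal-weakened caching the paper argues for is reduced here to blanket caching via `lru_cache`:

```python
from functools import lru_cache

# Hypothetical toy CFG, not the OVIS grammar.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["ovis"]],
    "VP": [["parses"], ["parses", "NP"]],
}

def recognize(words):
    words = tuple(words)

    @lru_cache(maxsize=None)  # memoize each (symbol, i, j) parse goal
    def derives(symbol, i, j):
        if symbol not in GRAMMAR:  # terminal symbol
            return j == i + 1 and i < len(words) and words[i] == symbol
        return any(covers(tuple(rhs), i, j) for rhs in GRAMMAR[symbol])

    @lru_cache(maxsize=None)  # memoize each (rhs suffix, i, j) subgoal
    def covers(rhs, i, j):
        if not rhs:
            return i == j
        first, rest = rhs[0], rhs[1:]
        return any(derives(first, i, k) and covers(rest, k, j)
                   for k in range(i, j + 1))

    return derives("S", 0, len(words))
```

The paper's point is that caching need not be applied to every goal: memoizing only selected categories keeps the space savings while, in its experiments, losing no practical time efficiency against standard chart parsers.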
A PDTB-Styled End-to-End Discourse Parser
We have developed a full discourse parser in the Penn Discourse Treebank
(PDTB) style. Our trained parser first identifies all discourse and
non-discourse relations, locates and labels their arguments, and then
classifies their relation types. When appropriate, the attribution spans to
these relations are also determined. We present a comprehensive evaluation from
both component-wise and error-cascading perspectives.
Comment: 15 pages, 5 figures, 7 table
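The pipeline the abstract describes runs in stages: identify discourse connectives, locate and label their arguments, then classify the relation type. A minimal sketch of that staging — the connective inventory, the before/after argument heuristic, and the sense lookup are all hypothetical stand-ins, not the trained components of the actual parser:

```python
CONNECTIVES = {"because", "however", "but"}  # tiny illustrative inventory

def identify_connectives(tokens):
    """Stage 1: flag candidate explicit discourse connectives."""
    return [i for i, tok in enumerate(tokens) if tok.lower() in CONNECTIVES]

def label_arguments(tokens, conn_idx):
    """Stage 2 (toy heuristic): Arg1 precedes the connective, Arg2 follows."""
    return tokens[:conn_idx], tokens[conn_idx + 1:]

def classify_sense(connective):
    """Stage 3 (toy lookup): map a connective to a PDTB top-level sense."""
    return {"because": "Contingency", "however": "Comparison",
            "but": "Comparison"}.get(connective.lower(), "Expansion")

def parse_discourse(tokens):
    """Chain the stages; errors in early stages cascade into later ones,
    which is why the paper evaluates both per component and end to end."""
    relations = []
    for i in identify_connectives(tokens):
        arg1, arg2 = label_arguments(tokens, i)
        relations.append((tokens[i], classify_sense(tokens[i]), arg1, arg2))
    return relations
```

The cascading structure is exactly what motivates the paper's error-cascading evaluation: a missed connective in stage 1 silently removes a relation from every later stage.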