Textual Economy through Close Coupling of Syntax and Semantics
We focus on the production of efficient descriptions of objects, actions and
events. We define a type of efficiency, textual economy, that exploits the
hearer's recognition of inferential links to material elsewhere within a
sentence. Textual economy leads to efficient descriptions because the material
that supports such inferences has been included to satisfy independent
communicative goals, and is therefore overloaded in Pollack's sense. We argue
that achieving textual economy imposes strong requirements on the
representation and reasoning used in generating sentences. The representation
must support the generator's simultaneous consideration of syntax and
semantics. Reasoning must enable the generator to assess quickly and reliably
at any stage how the hearer will interpret the current sentence, with its
(incomplete) syntax and semantics. We show that these representational and
reasoning requirements are met in the SPUD system for sentence planning and
realization.
Comment: 10 pages, uses QobiTree.te
Korean Grammar Using TAGs
This paper addresses various issues related to representing the Korean language using Tree Adjoining Grammars (TAGs). Topics covered include Korean grammar using TAGs, machine translation between Korean and English using Synchronous Tree Adjoining Grammars (STAGs), handling scrambling using Multi-Component TAGs (MC-TAGs), and recovering empty arguments. The parsing data come from US military communication messages.
A psycholinguistically motivated version of TAG
We propose a psycholinguistically motivated version of TAG which is designed to model key properties of human sentence processing, viz., incrementality, connectedness, and prediction. We use findings from human experiments to motivate an incremental grammar formalism that makes it possible to build fully connected structures on a word-by-word basis. A key idea of the approach is to explicitly model the prediction of upcoming material and the subsequent verification and integration processes. We also propose a linking theory that links the predictions of our formalism to experimental data such as reading times, and illustrate how it can capture psycholinguistic results on the processing of either...or structures and relative clauses.
CCG Parsing and Multiword Expressions
This thesis presents a study about the integration of information about
Multiword Expressions (MWEs) into parsing with Combinatory Categorial Grammar
(CCG). We build on previous work which has shown the benefit of adding
information about MWEs to syntactic parsing by implementing a similar pipeline
with CCG parsing. More specifically, we collapse MWEs to one token in training
and test data in CCGbank, a corpus which contains sentences annotated with CCG
derivations. Our collapsing algorithm, however, can only deal with MWEs when they
form a constituent in the data, which is one of the limitations of our approach.
We study the effect of collapsing training and test data. A parsing effect
can be obtained if collapsed data help the parser in its decisions and a
training effect can be obtained if training on the collapsed data improves
results. We also collapse the gold standard and show that our model
significantly outperforms the baseline model on our gold standard, which
indicates that there is a training effect. We show that the baseline model
performs significantly better on our gold standard when the data are collapsed
before parsing than when the data are collapsed after parsing, which indicates
that there is a parsing effect. We show that these results can lead to improved
performance on the non-collapsed standard benchmark although we fail to show
that it does so significantly. We conclude that despite the limited settings,
there are noticeable improvements from using MWEs in parsing. We discuss ways
in which the incorporation of MWEs into parsing can be improved and hypothesize
that this will lead to more substantial results.
Finally, we show that treating the MWE recognition part of the pipeline as an
experimental variable is useful, as we obtain different results with
different recognizers.
Comment: MSc thesis, The University of Edinburgh, 2014, School of Informatics,
MSc Artificial Intelligence
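The collapsing step described in this abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: it works on a flat token list with a hand-supplied MWE list, whereas the thesis collapses MWEs inside CCGbank derivations and only when they form a constituent. The function name `collapse_mwes`, the underscore joiner, and the example sentence are all assumptions made for illustration.

```python
def collapse_mwes(tokens, mwes, joiner="_"):
    """Greedily merge known multiword expressions into single tokens.

    Hypothetical helper: operates on flat tokens only and ignores the
    constituency restriction mentioned in the abstract.
    """
    # Try longer expressions first so they win over their own prefixes;
    # drop empty entries, which would otherwise match without advancing.
    patterns = sorted((tuple(m) for m in mwes if m), key=len, reverse=True)
    out, i = [], 0
    while i < len(tokens):
        for pat in patterns:
            if tuple(tokens[i:i + len(pat)]) == pat:
                out.append(joiner.join(pat))
                i += len(pat)
                break
        else:
            # No expression starts at position i: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


sentence = ["He", "kicked", "the", "bucket", "in", "New", "York"]
known_mwes = [["kicked", "the", "bucket"], ["New", "York"]]
collapsed = collapse_mwes(sentence, known_mwes)
```

Applying the same function to both training and test data would correspond to the "collapse before parsing" setting; the abstract's "collapse after parsing" setting would instead apply it to parser output.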
A Metagrammatical Approach to Periphrasis in Gwadloupéyen
In this paper, I show that verbal and nominal functional elements of Gwadloupéyen can be described in Tree-Adjoining Grammar as pertaining to morphological periphrasis. This challenges the claim that Creoles have fully analytical morphology.
Broad-coverage model of prediction in human sentence processing
The aim of this thesis is to design and implement a cognitively plausible theory
of sentence processing which incorporates a mechanism for modeling a prediction
and verification process in human language understanding, and to evaluate the validity
of this model on specific psycholinguistic phenomena as well as on broad-coverage,
naturally occurring text.
Modeling prediction is a timely and relevant contribution to the field because recent
experimental evidence suggests that humans predict upcoming structure or lexemes
during sentence processing. However, none of the current sentence processing theories
capture prediction explicitly. This thesis proposes a novel model of incremental
sentence processing that offers an explicit prediction and verification mechanism.
In evaluating the proposed model, this thesis also makes a methodological contribution.
The design and evaluation of current sentence processing theories are usually
based exclusively on experimental results from individual psycholinguistic experiments
on specific linguistic structures. However, a theory of language processing in
humans should not only work in an experimentally designed environment, but should
also have explanatory power for naturally occurring language.
This thesis first shows that the Dundee corpus, an eye-tracking corpus of newspaper
text, constitutes a valuable additional resource for testing sentence processing theories.
I demonstrate that a benchmark processing effect (the subject/object relative clause
asymmetry) can be detected in this data set (Chapter 4). I then evaluate two existing
theories of sentence processing, Surprisal and Dependency Locality Theory (DLT),
on the full Dundee corpus. This constitutes the first broad-coverage comparison of
sentence processing theories on naturalistic text. I find that both theories can explain
some of the variance in the eye-movement data, and that they capture different aspects
of sentence processing (Chapter 5).
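As background for the comparison above, Surprisal is standardly defined as the negative log probability of a word given its preceding context. The sketch below only illustrates that definition; the probabilities are made up, and the thesis's own probability model and its DLT implementation are not reproduced here. The function name `surprisal` is an assumption.

```python
import math


def surprisal(prob):
    """Surprisal in bits for a word whose conditional probability is `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


# Hypothetical conditional probabilities P(word | preceding context).
expected_word = surprisal(0.5)
unexpected_word = surprisal(0.125)
```

Under this definition a less predictable word receives a higher value, which is the quantity that is then related to reading times.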
In Chapter 6, I propose a new theory of sentence processing, which explicitly models
prediction and verification processes, and aims to unify the complementary aspects
of Surprisal and DLT. The proposed theory implements key cognitive concepts such
as incrementality, full connectedness, and memory decay. The underlying grammar
formalism is a strictly incremental version of Tree-adjoining Grammar (TAG), Psycholinguistically
motivated TAG (PLTAG), which is introduced in Chapter 7. I then
describe how the Penn Treebank can be converted into PLTAG format and define an
incremental, fully connected broad-coverage parsing algorithm with associated probability
model for PLTAG. Evaluation of the PLTAG model shows that it achieves the broad coverage required for testing a psycholinguistic theory on naturalistic data. On
the standardized Penn Treebank test set, it approaches the performance of incremental
TAG parsers without prediction (Chapter 8).
Chapter 9 evaluates the psycholinguistic aspects of the proposed theory by testing
it both on a selection of established sentence processing phenomena and on the
Dundee eye-tracking corpus. The proposed theory can account for a larger range of
psycholinguistic case studies than previous theories, and is a significant positive predictor
of reading times on broad-coverage text. I show that it can explain a larger
proportion of the variance in reading times than either DLT integration cost or Surprisal
...