2,060 research outputs found
A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena
Word reordering is one of the most difficult aspects of statistical machine
translation (SMT), and an important factor of its quality and efficiency.
Despite the vast amount of research published to date, the interest of the
community in this problem has not decreased, and no single method appears to be
strongly dominant across language pairs. Instead, the choice of the optimal
approach for a new translation task still seems to be mostly driven by
empirical trials. To orientate the reader in this vast and complex research
area, we present a comprehensive survey of word reordering viewed as a
statistical modeling challenge and as a natural language phenomenon. The survey
describes in detail how word reordering is modeled within different
string-based and tree-based SMT frameworks and as a stand-alone task, including
systematic overviews of the literature in advanced reordering modeling. We then
question why some approaches are more successful than others in different
language pairs. We argue that, besides measuring the amount of reordering, it
is important to understand which kinds of reordering occur in a given language
pair. To this end, we conduct a qualitative analysis of word reordering
phenomena in a diverse sample of language pairs, based on a large collection of
linguistic knowledge. Empirical results in the SMT literature are shown to
support the hypothesis that a few linguistic facts can be very useful to
anticipate the reordering characteristics of a language pair and to select the
SMT framework that best suits them.Comment: 44 pages, to appear in Computational Linguistic
Recommended from our members
Large-scale connectionist natural language parsing using lexical semantic and syntactic knowledge
Syntactic parsing plays a pivotal role in most automatic natural language processing systems. The research project presented in this dissertation has focused on two main characteristics of connectionist models for natural language processing: their adaptability to different tagging conventions, and their ability to use multiple linguistic constraints in parallel during sentence processing. In focusing on these key characteristics, an existing hybrid connectionist, shift-reduce corpus-based parsing model has been modified. This parser, which had earlier been trained to acquire linguistic knowledge from the Lancaster Parsed Corpus, has been adapted to learn linguistic knowledge from the Wall Street Journal Corpus. This adaptation is a novel demonstration that this connectionist parser, and by extension, other similar connectionist models, is able to adapt to more than one syntactic tagging convention; this implies their ability to adapt to the underlying linguistic theories used to annotate these corpora
Neural Combinatory Constituency Parsing
東京都立大学Tokyo Metropolitan University博士(情報科学)doctoral thesi
Maximum Entropy Models For Natural Language Ambiguity Resolution
This thesis demonstrates that several important kinds of natural language ambiguities can be resolved to state-of-the-art accuracies using a single statistical modeling technique based on the principle of maximum entropy.
We discuss the problems of sentence boundary detection, part-of-speech tagging, prepositional phrase attachment, natural language parsing, and text categorization under the maximum entropy framework. In practice, we have found that maximum entropy models offer the following advantages:
State-of-the-art Accuracy: The probability models for all of the tasks discussed perform at or near state-of-the-art accuracies, or outperform competing learning algorithms when trained and tested under similar conditions. Methods which outperform those presented here require much more supervision in the form of additional human involvement or additional supporting resources.
Knowledge-Poor Features: The facts used to model the data, or features, are linguistically very simple, or knowledge-poor but yet succeed in approximating complex linguistic relationships.
Reusable Software Technology: The mathematics of the maximum entropy framework are essentially independent of any particular task, and a single software implementation can be used for all of the probability models in this thesis.
The experiments in this thesis suggest that experimenters can obtain state-of-the-art accuracies on a wide range of natural language tasks, with little task-specific effort, by using maximum entropy probability models
Evaluating Parsers with Dependency Constraints
Many syntactic parsers now score over 90% on English in-domain evaluation, but the remaining errors have been challenging to address and difficult to quantify. Standard parsing metrics provide a consistent basis for comparison between parsers, but do not illuminate what errors remain to be addressed. This thesis develops a constraint-based evaluation for dependency and Combinatory Categorial Grammar (CCG) parsers to address this deficiency. We examine the constrained and cascading impact, representing the direct and indirect effects of errors on parsing accuracy. This identifies errors that are the underlying source of problems in parses, compared to those which are a consequence of those problems. Kummerfeld et al. (2012) propose a static post-parsing analysis to categorise groups of errors into abstract classes, but this cannot account for cascading changes resulting from repairing errors, or limitations which may prevent the parser from applying a repair. In contrast, our technique is based on enforcing the presence of certain dependencies during parsing, whilst allowing the parser to choose the remainder of the analysis according to its grammar and model. We draw constraints for this process from gold-standard annotated corpora, grouping them into abstract error classes such as NP attachment, PP attachment, and clause attachment. By applying constraints from each error class in turn, we can examine how parsers respond when forced to correctly analyse each class. We show how to apply dependency constraints in three parsers: the graph-based MSTParser (McDonald and Pereira, 2006) and the transition-based ZPar (Zhang and Clark, 2011b) dependency parsers, and the C&C CCG parser (Clark and Curran, 2007b). Each is widely-used and influential in the field, and each generates some form of predicate-argument dependencies. We compare the parsers, identifying common sources of error, and differences in the distribution of errors between constrained and cascaded impact. Our work allows us to contrast the implementations of each parser, and how they respond to constraint application. Using our analysis, we experiment with new features for dependency parsing, which encode the frequency of proposed arcs in large-scale corpora derived from scanned books. These features are inspired by and extend on the work of Bansal and Klein (2011). We target these features at the most notable errors, and show how they address some, but not all of the difficult attachments across newswire and web text. CCG parsing is particularly challenging, as different derivations do not always generate different dependencies. We develop dependency hashing to address semantically redundant parses in n-best CCG parsing, and demonstrate its necessity and effectiveness. Dependency hashing substantially improves the diversity of n-best CCG parses, and improves a CCG reranker when used for creating training and test data. We show the intricacies of applying constraints to C&C, and describe instances where applying constraints causes the parser to produce a worse analysis. These results illustrate how algorithms which are relatively straightforward for constituency and dependency parsers are non-trivial to implement in CCG. This work has explored dependencies as constraints in dependency and CCG parsing. We have shown how dependency hashing can efficiently eliminate semantically redundant CCG n-best parses, and presented a new evaluation framework based on enforcing the presence of dependencies in the output of the parser. By otherwise allowing the parser to proceed as it would have, we avoid the assumptions inherent in other work. We hope this work will provide insights into the remaining errors in parsing, and target efforts to address those errors, creating better syntactic analysis for downstream applications
A Computational Model of Syntactic Processing: Ambiguity Resolution from Interpretation
Syntactic ambiguity abounds in natural language, yet humans have no
difficulty coping with it. In fact, the process of ambiguity resolution is
almost always unconscious. But it is not infallible, however, as example 1
demonstrates.
1. The horse raced past the barn fell.
This sentence is perfectly grammatical, as is evident when it appears in the
following context:
2. Two horses were being shown off to a prospective buyer. One was raced past
a meadow. and the other was raced past a barn. ...
Grammatical yet unprocessable sentences such as 1 are called `garden-path
sentences.' Their existence provides an opportunity to investigate the human
sentence processing mechanism by studying how and when it fails. The aim of
this thesis is to construct a computational model of language understanding
which can predict processing difficulty. The data to be modeled are known
examples of garden path and non-garden path sentences, and other results from
psycholinguistics.
It is widely believed that there are two distinct loci of computation in
sentence processing: syntactic parsing and semantic interpretation. One
longstanding controversy is which of these two modules bears responsibility for
the immediate resolution of ambiguity. My claim is that it is the latter, and
that the syntactic processing module is a very simple device which blindly and
faithfully constructs all possible analyses for the sentence up to the current
point of processing. The interpretive module serves as a filter, occasionally
discarding certain of these analyses which it deems less appropriate for the
ongoing discourse than their competitors.
This document is divided into three parts. The first is introductory, and
reviews a selection of proposals from the sentence processing literature. The
second part explores a body of data which has been adduced in support of a
theory of structural preferences --- one that is inconsistent with the present
claim. I show how the current proposal can be specified to account for the
available data, and moreover to predict where structural preference theories
will go wrong. The third part is a theoretical investigation of how well the
proposed architecture can be realized using current conceptions of linguistic
competence. In it, I present a parsing algorithm and a meaning-based ambiguity
resolution method.Comment: 128 pages, LaTeX source compressed and uuencoded, figures separate
macros: rotate.sty, lingmacros.sty, psfig.tex. Dissertation, Computer and
Information Science Dept., October 199
- …