Learning to Disambiguate Syntactic Relations
Natural language is highly ambiguous at every level. This article describes a fast, broad-coverage, state-of-the-art parser that combines a carefully hand-written grammar with probability-based machine learning approaches on the syntactic level. It shows in detail which statistical learning models based on Maximum-Likelihood Estimation (MLE) can support a highly developed linguistic grammar in the disambiguation process.
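As a rough illustration of what an MLE-based disambiguation model estimates, the sketch below computes relative-frequency probabilities of an attachment decision given a lexical context. It is not the article's model: the function name `mle_attachment_probs`, the (verb, preposition) context, and the toy observations are all assumptions made for this example.

```python
from collections import Counter


def mle_attachment_probs(observations):
    """Relative-frequency (MLE) estimate of P(attachment | context).

    `observations` is a list of (context, attachment) pairs.
    """
    pair_counts = Counter(observations)
    context_counts = Counter(context for context, _ in observations)
    return {
        (context, attachment): count / context_counts[context]
        for (context, attachment), count in pair_counts.items()
    }


# Hypothetical toy data: a (verb, preposition) context paired with the
# attachment site observed for the prepositional phrase.
observations = [
    (("eat", "with"), "verb"),
    (("eat", "with"), "verb"),
    (("eat", "with"), "noun"),
    (("see", "with"), "noun"),
]

probs = mle_attachment_probs(observations)
```

With these counts, a parser that must choose between analyses licensed by the grammar would prefer verb attachment for the ("eat", "with") context, since that reading was observed in two of three cases. Contexts that never occur in the data receive no estimate at all, which is why practical systems add smoothing or back-off on top of plain MLE.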
Structure Sharing and Parallelization in a GB Parser
By utilizing structure sharing among its parse trees, a GB parser can increase its efficiency dramatically. Using a GB parser which has as its phrase structure recovery component an implementation of Tomita's algorithm (as described in [Tom86]), we investigate how a GB parser can preserve the structure sharing output by Tomita's algorithm. In this report, we discuss the implications of using Tomita's algorithm in GB parsing, and we give some details of the structure-sharing parser currently under construction. We also discuss a method of parallelizing a GB parser, and relate it to the existing literature on parallel GB parsing. Our approach to preserving sharing within a shared-packed forest is applicable not only to GB parsing, but whenever structure sharing must be preserved in a parse forest in the presence of features.
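To make the efficiency argument concrete, here is a minimal sketch of a shared-packed forest, assuming a simple representation in which each node stores its alternative analyses and subtrees are shared by reference. The `Node` class, `count_trees`, and the example sentence are hypothetical and are not taken from the report; features, which the report is concerned with, are left out.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Node:
    label: str
    start: int
    end: int
    # Each alternative is a tuple of child nodes; a leaf has none.
    alternatives: tuple = ()


@lru_cache(maxsize=None)
def count_trees(node):
    """Count the parse trees packed under `node` without enumerating them."""
    if not node.alternatives:
        return 1
    total = 0
    for children in node.alternatives:
        product = 1
        for child in children:
            product *= count_trees(child)
        total += product
    return total


# "saw the man with the telescope": the PP attaches to the NP or to the VP.
v = Node("V", 0, 1)
np_man = Node("NP", 1, 3)
pp = Node("PP", 3, 6)
np_big = Node("NP", 1, 6, ((np_man, pp),))
# One VP node packs both analyses; np_man and pp are shared, not copied.
vp = Node("VP", 0, 6, ((v, np_big), (v, np_man, pp)))

n_parses = count_trees(vp)
```

Both readings of the verb phrase live under a single `VP` node, and the memoized count visits each shared subtree once, so the cost grows with the size of the forest and not with the number of trees it encodes.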
Verb Classes and Alternations in Bangla, German, English, and Korean
In this report, we investigate the relationship between the semantic and syntactic properties of verbs. Our work is based on the English verb classes and alternations of Levin (1993). We explore how these classes are manifested in other languages, in particular, in Bangla, German, and Korean. Our report includes a survey and classification of several hundred verbs from these languages into the cross-linguistic equivalents of Levin's classes. We also explore two ways in which our findings may be used to enhance WordNet: making the English syntactic information of WordNet more fine-grained, and making WordNet multilingual.
Methods for Parallelizing Search Paths in Phrasing
Many search problems are commonly solved with combinatoric algorithms that unnecessarily duplicate and serialize work at considerable computational expense. There are techniques available that can eliminate redundant computations and perform remaining operations concurrently, effectively reducing the branching factors of these algorithms. This thesis applies these techniques to the problem of parsing natural language. The result is an efficient programming language that can reduce some of the expense associated with principle-based parsing and other search problems. The language is used to implement various natural language parsers, and the improvements are compared to those that result from implementing more deterministic theories of language processing.
A Competitive Attachment Model for Resolving Syntactic Ambiguities in Natural Language Parsing
Linguistic ambiguity is the greatest obstacle to achieving practical computational systems for natural language understanding. By contrast, people experience surprisingly little difficulty in interpreting ambiguous linguistic input. This dissertation explores distributed computational techniques for mimicking the human ability to resolve syntactic ambiguities efficiently and effectively. The competitive attachment theory of parsing formulates the processing of an ambiguity as a competition for activation within a hybrid connectionist network. Determining the grammaticality of an input relies on a new approach to distributed communication that integrates numeric and symbolic constraints on passing features through the parsing network. The method establishes syntactic relations both incrementally and efficiently, and underlies the ability of the model to establish long-distance syntactic relations using only local communication within a network. The competitive distribution of numeric evidence focuses the activation of the network onto a particular structural interpretation of the input, resolving ambiguities. In contrast to previous approaches to ambiguity resolution, the model makes no use of explicit preference heuristics or revision strategies. Crucially, the structural decisions of the model conform with human preferences, without those preferences having been incorporated explicitly into the parser. Furthermore, the competitive dynamics of the parsing network account for additional on-line processing data that other models of syntactic preferences have left unaddressed.
(Also cross-referenced as UMIACS-TR-95-55)