A Data-Oriented Approach to Semantic Interpretation
In Data-Oriented Parsing (DOP), an annotated language corpus is used as a
stochastic grammar. The most probable analysis of a new input sentence is
constructed by combining sub-analyses from the corpus in the most probable way.
This approach has been successfully used for syntactic analysis, using corpora
with syntactic annotations such as the Penn Treebank. If a corpus with
semantically annotated sentences is used, the same approach can also generate
the most probable semantic interpretation of an input sentence. The present
paper explains this semantic interpretation method, and summarizes the results
of a preliminary experiment. Semantic annotations were added to the syntactic
annotations of most of the sentences of the ATIS corpus. A data-oriented
semantic interpretation algorithm was successfully tested on this semantically
enriched corpus.
Comment: 10 pages, Postscript; to appear in Proceedings Workshop on
Corpus-Oriented Semantic Analysis, ECAI-96, Budapes
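The idea of combining corpus sub-analyses "in the most probable way" can be illustrated with a small sketch. It assumes the relative-frequency model commonly associated with DOP, which the abstract itself does not spell out: a fragment's probability is its corpus count divided by the count of all fragments with the same root label, and a derivation's probability is the product of its fragments' probabilities. The function names and the toy fragment identifiers are hypothetical.

```python
from collections import Counter
from math import prod


def fragment_probabilities(fragments):
    """Estimate fragment probabilities by relative frequency.

    `fragments` is a list of (root_label, fragment_id) pairs, one entry per
    occurrence in the corpus. This estimator is an assumption made for
    illustration, not a detail taken from the abstract.
    """
    frag_counts = Counter(fragments)
    root_counts = Counter(root for root, _ in fragments)
    return {frag: count / root_counts[frag[0]]
            for frag, count in frag_counts.items()}


def derivation_probability(derivation, probs):
    """Probability of one derivation: the product of its fragment probabilities."""
    return prod(probs[frag] for frag in derivation)


# Toy corpus of fragment occurrences (hypothetical identifiers).
corpus_fragments = [
    ("S", "s1"), ("S", "s1"), ("S", "s2"),
    ("NP", "np1"), ("NP", "np2"),
]
probs = fragment_probabilities(corpus_fragments)
p = derivation_probability([("S", "s1"), ("NP", "np1")], probs)
```

Under these assumptions, the most probable analysis would be found by comparing such scores across the candidate derivations of a sentence; a semantically annotated corpus would simply supply fragments that carry semantic labels as well.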
Can Subcategorisation Probabilities Help a Statistical Parser?
Research into the automatic acquisition of lexical information from corpora
is starting to produce large-scale computational lexicons containing data on
the relative frequencies of subcategorisation alternatives for individual
verbal predicates. However, the empirical question of whether this type of
frequency information can in practice improve the accuracy of a statistical
parser has not yet been answered. In this paper we describe an experiment with
a wide-coverage statistical grammar and parser for English and
subcategorisation frequencies acquired from ten million words of text which
shows that this information can significantly improve parse accuracy.
Comment: 9 pages, uses colacl.st
Crossings as a side effect of dependency lengths
The syntactic structure of sentences exhibits a striking regularity:
dependencies tend not to cross when drawn above the sentence. We investigate
two competing explanations. The traditional hypothesis is that this trend
arises from an independent principle of syntax that reduces crossings
practically to zero. An alternative to this view is the hypothesis that
crossings are a side effect of dependency lengths, i.e. sentences with shorter
dependency lengths should tend to have fewer crossings. We are able to reject
the traditional view in the majority of languages considered. The alternative
hypothesis can lead to a more parsimonious theory of language.
Comment: the discussion section has been expanded significantly; in press in
Complexity (Wiley
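The two quantities compared in this abstract, dependency lengths and crossings, can be computed from a head-index representation of a sentence. The sketch below is only an illustration under stated assumptions: the encoding (`heads[i]` is the 1-based position of the head of word `i + 1`, with 0 marking the root) and the function names are hypothetical and not taken from the paper. Two dependencies are counted as crossing when exactly one endpoint of one lies strictly between the endpoints of the other.

```python
from itertools import combinations


def dependency_arcs(heads):
    """Return each dependency as a (left, right) pair of word positions."""
    return [(min(dep, head), max(dep, head))
            for dep, head in enumerate(heads, start=1)
            if head != 0]  # the root has no incoming dependency


def dependency_lengths(heads):
    """Distance in words between each dependent and its head."""
    return [right - left for left, right in dependency_arcs(heads)]


def count_crossings(heads):
    """Number of pairs of dependencies that cross when drawn above the sentence."""
    return sum(
        1
        for (a, b), (c, d) in combinations(dependency_arcs(heads), 2)
        if a < c < b < d or c < a < d < b
    )


# Hypothetical 4-word sentence: word 3 is the root.
heads = [3, 4, 0, 3]
total_length = sum(dependency_lengths(heads))
crossings = count_crossings(heads)
```

With such functions, one could compare the number of crossings in attested sentences against reorderings with longer or shorter total dependency length, which is the kind of relationship the alternative hypothesis concerns.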