1,525 research outputs found
Forgetting Exceptions is Harmful in Language Learning
We show that in language learning, contrary to received wisdom, keeping
exceptional training instances in memory can be beneficial for generalization
accuracy. We investigate this phenomenon empirically on a selection of
benchmark natural language processing tasks: grapheme-to-phoneme conversion,
part-of-speech tagging, prepositional-phrase attachment, and base noun phrase
chunking. In a first series of experiments we combine memory-based learning
with training set editing techniques, in which instances are edited based on
their typicality and class prediction strength. Results show that editing
exceptional instances (with low typicality or low class prediction strength)
tends to harm generalization accuracy. In a second series of experiments we
compare memory-based learning and decision-tree learning methods on the same
selection of tasks, and find that decision-tree learning often performs worse
than memory-based learning. Moreover, the decrease in performance can be linked
to the degree of abstraction from exceptions (i.e., pruning or eagerness). We
provide explanations for both results in terms of the properties of the natural
language processing tasks and the learning algorithms.Comment: 31 pages, 7 figures, 10 tables. uses 11pt, fullname, a4wide tex
styles. Pre-print version of article to appear in Machine Learning 11:1-3,
Special Issue on Natural Language Learning. Figures on page 22 slightly
compressed to avoid page overloa
A Robust Parsing Algorithm For Link Grammars
In this paper we present a robust parsing algorithm based on the link grammar
formalism for parsing natural languages. Our algorithm is a natural extension
of the original dynamic programming recognition algorithm which recursively
counts the number of linkages between two words in the input sentence. The
modified algorithm uses the notion of a null link in order to allow a
connection between any pair of adjacent words, regardless of their dictionary
definitions. The algorithm proceeds by making three dynamic programming passes.
In the first pass, the input is parsed using the original algorithm which
enforces the constraints on links to ensure grammaticality. In the second pass,
the total cost of each substring of words is computed, where cost is determined
by the number of null links necessary to parse the substring. The final pass
counts the total number of parses with minimal cost. All of the original
pruning techniques have natural counterparts in the robust algorithm. When used
together with memoization, these techniques enable the algorithm to run
efficiently with cubic worst-case complexity. We have implemented these ideas
and tested them by parsing the Switchboard corpus of conversational English.
This corpus is comprised of approximately three million words of text,
corresponding to more than 150 hours of transcribed speech collected from
telephone conversations restricted to 70 different topics. Although only a
small fraction of the sentences in this corpus are "grammatical" by standard
criteria, the robust link grammar parser is able to extract relevant structure
for a large portion of the sentences. We present the results of our experiments
using this system, including the analyses of selected and random sentences from
the corpus.Comment: 17 pages, compressed postscrip
Platform for Full-Syntax Grammar Development Using Meta-grammar Constructs
PACLIC 20 / Wuhan, China / 1-3 November, 200
F-structure transfer-based statistical machine translation
In this paper, we describe a statistical deep syntactic transfer decoder that is trained fully automatically on parsed bilingual corpora. Deep syntactic transfer rules are induced automatically from the f-structures of a LFG parsed bitext corpus by automatically aligning local f-structures, and inducing all rules consistent with the node alignment. The transfer decoder outputs the n-best TL f-structures given a SL f-structure as input by applying large numbers of transfer rules and searching for the best output using a
log-linear model to combine feature scores. The decoder includes a fully integrated dependency-based tri-gram language model. We include an experimental evaluation of the decoder using different parsing disambiguation
resources for the German data to provide a comparison of how the system performs with different German training and test parses
- …