15,741 research outputs found
Partial Learning Using Link Grammars Data
International audienceKanazawa has shown that several non-trivial classes of cate- gorial grammars are learnable in Gold's model. We propose in this article to adapt this kind of symbolic learning to natural languages. In order to compensate the combinatorial explosion of the learning algorithm, we suppose that a small part of the grammar to be learned is given as in- put. That is why we need some initial data to test the feasibility of the approach: link grammars are closely related to categorial grammars, and we use the English lexicon which exists in this formalism
Discovery of Linguistic Relations Using Lexical Attraction
This work has been motivated by two long term goals: to understand how humans
learn language and to build programs that can understand language. Using a
representation that makes the relevant features explicit is a prerequisite for
successful learning and understanding. Therefore, I chose to represent
relations between individual words explicitly in my model. Lexical attraction
is defined as the likelihood of such relations. I introduce a new class of
probabilistic language models named lexical attraction models which can
represent long distance relations between words and I formalize this new class
of models using information theory.
Within the framework of lexical attraction, I developed an unsupervised
language acquisition program that learns to identify linguistic relations in a
given sentence. The only explicitly represented linguistic knowledge in the
program is lexical attraction. There is no initial grammar or lexicon built in
and the only input is raw text. Learning and processing are interdigitated. The
processor uses the regularities detected by the learner to impose structure on
the input. This structure enables the learner to detect higher level
regularities. Using this bootstrapping procedure, the program was trained on
100 million words of Associated Press material and was able to achieve 60%
precision and 50% recall in finding relations between content-words. Using
knowledge of lexical attraction, the program can identify the correct relations
in syntactically ambiguous sentences such as ``I saw the Statue of Liberty
flying over New York.''Comment: dissertation, 56 page
Learning Language from a Large (Unannotated) Corpus
A novel approach to the fully automated, unsupervised extraction of
dependency grammars and associated syntax-to-semantic-relationship mappings
from large text corpora is described. The suggested approach builds on the
authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well
as on a number of prior papers and approaches from the statistical language
learning literature. If successful, this approach would enable the mining of
all the information needed to power a natural language comprehension and
generation system, directly from a large, unannotated corpus.Comment: 29 pages, 5 figures, research proposa
Elimination of Spurious Ambiguity in Transition-Based Dependency Parsing
We present a novel technique to remove spurious ambiguity from transition
systems for dependency parsing. Our technique chooses a canonical sequence of
transition operations (computation) for a given dependency tree. Our technique
can be applied to a large class of bottom-up transition systems, including for
instance Nivre (2004) and Attardi (2006)
Some Novel Applications of Explanation-Based Learning to Parsing Lexicalized Tree-Adjoining Grammars
In this paper we present some novel applications of Explanation-Based
Learning (EBL) technique to parsing Lexicalized Tree-Adjoining grammars. The
novel aspects are (a) immediate generalization of parses in the training set,
(b) generalization over recursive structures and (c) representation of
generalized parses as Finite State Transducers. A highly impoverished parser
called a ``stapler'' has also been introduced. We present experimental results
using EBL for different corpora and architectures to show the effectiveness of
our approach.Comment: uuencoded postscript fil
- …