8,493 research outputs found
Learning Language from a Large (Unannotated) Corpus
A novel approach to the fully automated, unsupervised extraction of
dependency grammars and associated syntax-to-semantic-relationship mappings
from large text corpora is described. The suggested approach builds on the
authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well
as on a number of prior papers and approaches from the statistical language
learning literature. If successful, this approach would enable the mining of
all the information needed to power a natural language comprehension and
generation system, directly from a large, unannotated corpus.Comment: 29 pages, 5 figures, research proposa
TRX: A Formally Verified Parser Interpreter
Parsing is an important problem in computer science and yet surprisingly
little attention has been devoted to its formal verification. In this paper, we
present TRX: a parser interpreter formally developed in the proof assistant
Coq, capable of producing formally correct parsers. We are using parsing
expression grammars (PEGs), a formalism essentially representing recursive
descent parsing, which we consider an attractive alternative to context-free
grammars (CFGs). From this formalization we can extract a parser for an
arbitrary PEG grammar with the warranty of total correctness, i.e., the
resulting parser is terminating and correct with respect to its grammar and the
semantics of PEGs; both properties formally proven in Coq.Comment: 26 pages, LMC
- …