1,765 research outputs found
Learning Language from a Large (Unannotated) Corpus
A novel approach to the fully automated, unsupervised extraction of
dependency grammars and associated syntax-to-semantic-relationship mappings
from large text corpora is described. The suggested approach builds on the
authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well
as on a number of prior papers and approaches from the statistical language
learning literature. If successful, this approach would enable the mining of
all the information needed to power a natural language comprehension and
generation system, directly from a large, unannotated corpus.Comment: 29 pages, 5 figures, research proposa
Automated generation of program translation and verification tools using annotated grammars
Automatically generating program translators from source and target language specifications is a non-trivial problem. In this paper we focus on the problem of automating the process of building translators between operations languages, a family of DSLs used to program satellite operations procedures. We exploit their similarities to semi-automatically build transformation tools between these DSLs. The input to our method is a collection of annotated context-free grammars. To simplify the overall translation process even more, we also propose an intermediate representation common to all operations languages. Finally, we discuss how to enrich our annotated grammars model with more advanced semantic annotations to provide a verification system for the translation process. We validate our approach by semi-automatically deriving translators between some real world operations languages, using the prototype tool which we implemented for that purpose
Research in the Language, Information and Computation Laboratory of the University of Pennsylvania
This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania.
It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as: Combinatorial Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition.
Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue itâs easier than ever to do so: this document is accessible on the âinformation superhighwayâ. Just call up http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html
In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authorsâ abstracts in the web version of this report.
The abstracts describe the researchersâ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn
A Falsification View of Success Typing
Dynamic languages are praised for their flexibility and expressiveness, but
static analysis often yields many false positives and verification is
cumbersome for lack of structure. Hence, unit testing is the prevalent
incomplete method for validating programs in such languages.
Falsification is an alternative approach that uncovers definite errors in
programs. A falsifier computes a set of inputs that definitely crash a program.
Success typing is a type-based approach to document programs in dynamic
languages. We demonstrate that success typing is, in fact, an instance of
falsification by mapping success (input) types into suitable logic formulae.
Output types are represented by recursive types. We prove the correctness of
our mapping (which establishes that success typing is falsification) and we
report some experiences with a prototype implementation.Comment: extended versio
- âŠ