Fast semantic parsing with well-typedness guarantees
AM dependency parsing is a linguistically principled method for neural
semantic parsing with high accuracy across multiple graphbanks. It relies on a
type system that models semantic valency but makes existing parsers slow. We
describe an A* parser and a transition-based parser for AM dependency parsing
which guarantee well-typedness and improve parsing speed by up to 3 orders of
magnitude, while maintaining or improving accuracy.
Comment: Accepted at EMNLP 2020, camera-ready version
Relational Constraint Driven Test Case Synthesis for Web Applications
This paper proposes a relational constraint driven technique that synthesizes
test cases automatically for web applications. Using a static analysis,
servlets can be modeled as relational transducers, which manipulate backend
databases. We present a synthesis algorithm that generates a sequence of HTTP
requests for simulating a user session. The algorithm relies on backward
symbolic image computation for reaching a certain database state, given a code
coverage objective. With a slight adaptation, the technique can be used for
discovering workflow attacks on web applications.
Comment: In Proceedings TAV-WEB 2010, arXiv:1009.330
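The backward search described in this abstract can be sketched on a toy model. This is only an explicit-state illustration, not the paper's symbolic algorithm: the state names, request strings, and the `synthesize_requests` helper are all hypothetical, and a real relational transducer would manipulate database relations rather than named states.

```python
def synthesize_requests(initial, target, transitions):
    """Return a shortest request sequence leading from initial to target.

    transitions maps (state, request) -> next_state (a hypothetical,
    explicit stand-in for a relational transducer).
    """
    # successor[s] = (request, next_state) on a shortest path from s to target
    successor = {target: None}
    frontier = {target}
    # Repeatedly compute the pre-image of the current frontier.
    while frontier and initial not in successor:
        pre_image = set()
        for (state, request), nxt in transitions.items():
            if nxt in frontier and state not in successor:
                successor[state] = (request, nxt)
                pre_image.add(state)
        frontier = pre_image
    if initial not in successor:
        return None  # target state is unreachable from the initial state
    # Walk forward along the recorded successors to read off the session.
    requests, state = [], initial
    while successor[state] is not None:
        request, state = successor[state]
        requests.append(request)
    return requests


# Hypothetical toy application: database states abstracted to names.
toy_transitions = {
    ("empty", "POST /register"): "registered",
    ("registered", "POST /login"): "logged_in",
    ("logged_in", "POST /logout"): "registered",
    ("logged_in", "POST /order"): "ordered",
}
session = synthesize_requests("empty", "ordered", toy_transitions)
```

Searching backward from the target keeps the exploration focused on states that can actually reach the coverage objective, which is the intuition behind using pre-images here.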
Stream Processing using Grammars and Regular Expressions
In this dissertation we study regular expression based parsing and the use of
grammatical specifications for the synthesis of fast, streaming
string-processing programs.
In the first part we develop two linear-time algorithms for regular
expression based parsing with Perl-style greedy disambiguation. The first
algorithm operates in two passes in a semi-streaming fashion, using a constant
amount of working memory and an auxiliary tape storage which is written in the
first pass and consumed by the second. The second algorithm is a single-pass
and optimally streaming algorithm which outputs as much of the parse tree as is
semantically possible based on the input prefix read so far, and resorts to
buffering as many symbols as is required to resolve the next choice. Optimality
is obtained by performing a PSPACE-complete pre-analysis on the regular
expression.
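To make "Perl-style greedy disambiguation" concrete, the sketch below enumerates parse trees in greedy priority order and returns the first one that consumes the whole input. It is a naive backtracking parser, not one of the linear-time algorithms of the dissertation; the tuple encoding of regular expressions and the tree labels `inl`/`inr` are assumptions made for this illustration.

```python
# Hypothetical regex encoding:
#   ("eps",)  ("chr", c)  ("seq", r1, r2)  ("alt", r1, r2)  ("star", r)

def parses(r, s, i):
    """Yield (tree, next_index) pairs for r on s[i:], best parse first."""
    kind = r[0]
    if kind == "eps":
        yield ("eps",), i
    elif kind == "chr":
        if i < len(s) and s[i] == r[1]:
            yield ("chr", r[1]), i + 1
    elif kind == "seq":
        for t1, j in parses(r[1], s, i):
            for t2, k in parses(r[2], s, j):
                yield ("seq", t1, t2), k
    elif kind == "alt":
        # Greedy rule: every parse of the left branch beats the right branch.
        for t, j in parses(r[1], s, i):
            yield ("inl", t), j
        for t, j in parses(r[2], s, i):
            yield ("inr", t), j
    elif kind == "star":
        # Greedy rule: prefer one more iteration over stopping.
        for t, j in parses(r[1], s, i):
            if j > i:  # require progress so empty iterations cannot loop
                for rest, k in parses(r, s, j):
                    yield ("star", [t] + rest[1]), k
        yield ("star", []), i


def greedy_parse(r, s):
    """Return the greedy parse tree of the whole string s, or None."""
    for tree, j in parses(r, s, 0):
        if j == len(s):
            return tree
    return None


A, B = ("chr", "a"), ("chr", "b")
# (a|ab)(b|eps) is ambiguous on "ab"; greedy picks 'a' then 'b'.
ambiguous = ("seq", ("alt", A, ("seq", A, B)), ("alt", B, ("eps",)))
tree = greedy_parse(ambiguous, "ab")
```

On the input "ab" the other possible parse (left branch "ab", right branch empty) is never returned, because the left alternative of the first choice already leads to a complete match.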
In the second part we present Kleenex, a language for expressing
high-performance streaming string processing programs as regular grammars with
embedded semantic actions, and its compilation to streaming string transducers
with worst-case linear-time performance. Its underlying theory is based on
transducer decomposition into oracle and action machines, and a finite-state
specialization of the streaming parsing algorithm presented in the first part.
In the second part we also develop a new linear-time streaming parsing
algorithm for parsing expression grammars (PEG) which generalizes the regular
grammars of Kleenex. The algorithm is based on a bottom-up tabulation algorithm
reformulated using least fixed points and evaluated using an instance of the
chaotic iteration scheme by Cousot and Cousot.
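The idea of tabulating PEG results as a least fixed point can be illustrated with a small, non-streaming sketch. It is not the dissertation's algorithm: the grammar encoding, the `UNKNOWN`/`FAIL` markers, and the round-robin update order are assumptions chosen for brevity. Each table entry starts as unknown and is filled in once the entries it depends on are known; any fair update order reaches the same table.

```python
UNKNOWN = None   # entry not yet determined (bottom of the information order)
FAIL = -1        # expression does not match at this position

def evaluate(e, i, table, text):
    """Evaluate expression e at position i using the current table."""
    kind = e[0]
    if kind == "eps":
        return i
    if kind == "lit":
        return i + len(e[1]) if text.startswith(e[1], i) else FAIL
    if kind == "nt":
        return table[(e[1], i)]
    if kind == "seq":
        r1 = evaluate(e[1], i, table, text)
        if r1 is UNKNOWN or r1 == FAIL:
            return r1
        return evaluate(e[2], r1, table, text)
    if kind == "choice":
        r1 = evaluate(e[1], i, table, text)
        if r1 is UNKNOWN:
            return UNKNOWN
        # Ordered choice: the second branch is tried only if the first fails.
        return r1 if r1 != FAIL else evaluate(e[2], i, table, text)
    raise ValueError(f"unknown expression kind: {kind}")


def tabulate(grammar, text):
    """Fill table[(nonterminal, position)] by iterating until nothing changes."""
    table = {(a, i): UNKNOWN for a in grammar for i in range(len(text) + 1)}
    changed = True
    while changed:
        changed = False
        for (a, i), value in table.items():
            if value is UNKNOWN:
                result = evaluate(grammar[a], i, table, text)
                if result is not UNKNOWN:
                    table[(a, i)] = result
                    changed = True
    return table


# S <- "a" S "b" / ""   (matches the longest balanced prefix a^n b^n)
toy_grammar = {
    "S": ("choice",
          ("seq", ("lit", "a"), ("seq", ("nt", "S"), ("lit", "b"))),
          ("eps",)),
}
end_balanced = tabulate(toy_grammar, "aabb")[("S", 0)]
end_unbalanced = tabulate(toy_grammar, "aab")[("S", 0)]
```

Entries that depend on themselves, as in a left-recursive rule, simply stay `UNKNOWN` in this sketch.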
What to Read: A Biased Guide to AI Literacy for the Beginner
Acknowledgements. It was Ken Forbus' idea, and he, Howie Shrobe, Dan Weld, and John Batali read various drafts. Dan Huttenlocher and Tom Knight helped with the speech recognition section. The science fiction section was prepared with the aid of my SF/AI editorial board, consisting of Carl Feynman and David Wallace, and of the ArpaNet SF-Lovers community. Even so, all responsibility rests with me.
This note tries to provide a quick guide to AI literacy for the beginning AI hacker and for the experienced AI hacker or two whose scholarship isn't what it should be. Most will recognize it as the same old list of classic papers, give or take a few that I feel to be under- or over-rated. It is not guaranteed to be thorough or balanced or anything like that.
MIT Artificial Intelligence Laboratory
Contributions to the Construction of Extensible Semantic Editors
This dissertation addresses the need for easier construction and extension of language tools. Specifically, the construction and extension of so-called semantic editors is considered, that is, editors providing semantic services for code comprehension and manipulation. Editors like these are typically found in state-of-the-art development environments, where they have been developed by hand. The list of programming languages available today is extensive and, with the lively creation of new programming languages and the evolution of old languages, it keeps growing. Many of these languages would benefit from proper tool support. Unfortunately, the development of a semantic editor can be a time-consuming and error-prone endeavor, and too large an effort for most language communities. Given the complex nature of programming, and the huge benefits of good tool support, this lack of tools is problematic. In this dissertation, an attempt is made at narrowing the gap between generative solutions and how state-of-the-art editors are constructed today. A generative alternative for construction of textual semantic editors is explored with focus on how to specify extensible semantic editor services. Specifically, this dissertation shows how semantic services can be specified using a semantic formalism called reference attribute grammars (RAGs), and how these services can be made responsive enough for editing, and be provided also when the text in an editor is erroneous. Results presented in this dissertation have been found useful, both in industry and in academia, suggesting that the explored approach may help to reduce the effort of editor construction.
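The flavor of a reference attribute can be sketched in a few lines. This is a hand-written Python analogy, not the RAG formalism or any tool from the dissertation: the `Block`, `Decl`, and `Use` classes are hypothetical, and `cached_property` stands in for on-demand, cached attribute evaluation. The `decl` attribute of a use site is a reference to another node in the tree, which is what a "go to declaration" editor service needs.

```python
from functools import cached_property


class Decl:
    """A declaration of a name."""
    def __init__(self, name):
        self.name = name
        self.block = None


class Use:
    """A use of a name; its decl attribute refers to a Decl node."""
    def __init__(self, name):
        self.name = name
        self.block = None

    @cached_property
    def decl(self):
        # Computed on first access, then cached on the node.
        return self.block.lookup(self.name)


class Block:
    """A scope holding declarations and uses, optionally nested in a parent."""
    def __init__(self, decls, uses, parent=None):
        self.decls, self.uses, self.parent = decls, uses, parent
        for node in decls + uses:
            node.block = self

    def lookup(self, name):
        # Search the local scope first, then delegate to the enclosing scope.
        for d in self.decls:
            if d.name == name:
                return d
        return self.parent.lookup(name) if self.parent else None


outer_x = Decl("x")
outer = Block([outer_x], [])
inner_y = Decl("y")
use_x, use_y, use_z = Use("x"), Use("y"), Use("z")
inner = Block([inner_y], [use_x, use_y, use_z], parent=outer)
```

An unresolved name such as `z` yields `None` instead of raising an error, which loosely mirrors the requirement that services remain available when the edited text is erroneous.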
Learning Symbolic Operators for Task and Motion Planning
Robotic planning problems in hybrid state and action spaces can be solved by
integrated task and motion planners (TAMP) that handle the complex interaction
between motion-level decisions and task-level plan feasibility. TAMP approaches
rely on domain-specific symbolic operators to guide the task-level search,
making planning efficient. In this work, we formalize and study the problem of
operator learning for TAMP. Central to this study is the view that operators
define a lossy abstraction of the transition model of a domain. We then propose
a bottom-up relational learning method for operator learning and show how the
learned operators can be used for planning in a TAMP system. Experimentally, we
provide results in three domains, including long-horizon robotic planning
tasks. We find our approach to substantially outperform several baselines,
including three graph neural network-based model-free approaches from the
recent literature. Video: https://youtu.be/iVfpX9BpBRo Code:
https://git.io/JCT0g
Comment: IROS 202
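For readers unfamiliar with symbolic operators, the sketch below shows the kind of object a task-level search works with. It assumes a STRIPS-style representation (preconditions, add effects, delete effects) and a plain breadth-first search; the abstract does not specify the operator format, and the `Operator` class, the `plan` function, and the pick/place domain are hypothetical. No learning or motion planning is shown.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A STRIPS-style symbolic operator over sets of ground facts."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset

    def applicable(self, state):
        return self.preconditions <= state

    def apply(self, state):
        return (state - self.delete_effects) | self.add_effects


def plan(init, goal, operators):
    """Breadth-first search for a shortest operator sequence reaching goal."""
    start, goal = frozenset(init), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:
            return steps
        for op in operators:
            if op.applicable(state):
                nxt = op.apply(state)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [op.name]))
    return None  # goal unreachable with these operators


# Hypothetical pick-and-place domain.
pick = Operator("pick",
                frozenset({"hand_empty", "on_table"}),
                frozenset({"holding"}),
                frozenset({"hand_empty", "on_table"}))
place = Operator("place",
                 frozenset({"holding"}),
                 frozenset({"on_shelf", "hand_empty"}),
                 frozenset({"holding"}))
steps = plan({"hand_empty", "on_table"}, {"on_shelf"}, [pick, place])
```

The abstraction is lossy in the sense the abstract describes: the facts say nothing about whether a collision-free motion for `pick` exists, which is left to the motion-level planner.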
Natural Language Processing
The subject of Natural Language Processing can be considered in both broad and narrow senses. In the broad sense, it covers processing issues at all levels of natural language understanding, including speech recognition, syntactic and semantic analysis of sentences, reference to the discourse context (including anaphora, inference of referents, and more extended relations of discourse coherence and narrative structure), conversational inference and implicature, and discourse planning and generation. In the narrower sense, it covers the syntactic and semantic processing of sentences to deliver semantic objects suitable for referring, inferring, and the like. Of course, the results of inference and reference may under some circumstances play a part in processing in the narrow sense. But the processes that are characteristic of these other modules are not the primary concern