3,210 research outputs found
Synthesizing Program Input Grammars
We present an algorithm for synthesizing a context-free grammar encoding the
language of valid program inputs from a set of input examples and blackbox
access to the program. Our algorithm addresses shortcomings of existing grammar
inference algorithms, which both severely overgeneralize and are prohibitively
slow. Our implementation, GLADE, leverages the grammar synthesized by our
algorithm to fuzz test programs with structured inputs. We show that GLADE
substantially increases the incremental coverage on valid inputs compared to
two baseline fuzzers
SAGA: A project to automate the management of software production systems
The Software Automation, Generation and Administration (SAGA) project is investigating the design and construction of practical software engineering environments for developing and maintaining aerospace systems and applications software. The research includes the practical organization of the software lifecycle, configuration management, software requirements specifications, executable specifications, design methodologies, programming, verification, validation and testing, version control, maintenance, the reuse of software, software libraries, documentation, and automated management
Recommended from our members
Arcadia, a software development environment research project
The research objectives of the Arcadia project are two-fold: discovery and development of environment architecture principles and creation of novel software development tools, particularly powerful analysis tools, which will function within an environment built upon these architectural principles.Work in the architecture area is concerned with providing the framework to support integration while also supporting the often conflicting goal of extensibility. Thus, this area of research is directed toward achieving external integration by providing a consistent, uniform user interface, while still admitting customization and addition of new tools and interface functions. In an effort to also attain internal integration, research is aimed at developing mechanisms for structuring and managing the tools and data objects that populate a software development environment, while facilitating the insertion of new kinds of tools and new classes of objects.The unifying theme of work in the tools area is support for effective analysis at every stage of a software development project. Research is directed toward tools suitable for analyzing pre-implementation descriptions of software, software itself, and towards the production of testing and debugging tools. In many cases, these tools are specifically tailored for applicability to concurrent, distributed, or real-time software systems.The initial focus of Arcadia research is on creating a prototype environment, embodying the architectural principles, which supports Ada1 software development. This prototype environment is itself being developed in Ada.Arcadia is being developed by a consortium of researchers from the University of California at Irvine, the University of Colorado at Boulder, the University of Massachusetts at Amherst, TRW, Incremental Systems Corporation, and The Aerospace Corporation. This paper delineates the research objectives and describes the approaches being taken, the organization of the research endeavor, and current status of the work
Crowdsourcing Without a Crowd: Reliable Online Species Identification Using Bayesian Models to Minimize Crowd Size
We present an incremental Bayesian model that resolves key issues of crowd size and data quality for consensus labeling. We evaluate our method using data collected from a real-world citizen science program, BeeWatch, which invites members of the public in the United Kingdom to classify (label) photographs of bumblebees as one of 22 possible species. The biological recording domain poses two key and hitherto unaddressed challenges for consensus models of crowdsourcing: (1) the large number of potential species makes classification difficult, and (2) this is compounded by limited crowd availability, stemming from both the inherent difficulty of the task and the lack of relevant skills among the general public. We demonstrate that consensus labels can be reliably found in such circumstances with very small crowd sizes of around three to five users (i.e., through group sourcing). Our incremental Bayesian model, which minimizes crowd size by re-evaluating the quality of the consensus label following each species identification solicited from the crowd, is competitive with a Bayesian approach that uses a larger but fixed crowd size and outperforms majority voting. These results have important ecological applicability: biological recording programs such as BeeWatch can sustain themselves when resources such as taxonomic experts to confirm identifications by photo submitters are scarce (as is typically the case), and feedback can be provided to submitters in a timely fashion. More generally, our model provides benefits to any crowdsourced consensus labeling task where there is a cost (financial or otherwise) associated with soliciting a label
(Co-)Inductive semantics for Constraint Handling Rules
In this paper, we address the problem of defining a fixpoint semantics for
Constraint Handling Rules (CHR) that captures the behavior of both
simplification and propagation rules in a sound and complete way with respect
to their declarative semantics. Firstly, we show that the logical reading of
states with respect to a set of simplification rules can be characterized by a
least fixpoint over the transition system generated by the abstract operational
semantics of CHR. Similarly, we demonstrate that the logical reading of states
with respect to a set of propagation rules can be characterized by a greatest
fixpoint. Then, in order to take advantage of both types of rules without
losing fixpoint characterization, we present an operational semantics with
persistent. We finally establish that this semantics can be characterized by
two nested fixpoints, and we show the resulting language is an elegant
framework to program using coinductive reasoning.Comment: 17 page
Determining the Limits of Automated Program Recognition
This working paper was submitted as a Ph.D. thesis proposal.Program recognition is a program understanding technique in which stereotypic computational structures are identified in a program. From this identification and the known relationships between the structures, a hierarchical description of the program's design is recovered. The feasibility of this technique for small programs has been shown by several researchers. However, it seems unlikely that the existing program recognition systems will scale up to realistic, full-sized programs without some guidance (e.g., from a person using the recognition system as an assistant). One reason is that there are limits to what can be recovered by a purely code-driven approach. Some of the information about the program that is useful to know for common software engineering tasks, particularly maintenance, is missing from the code. Another reason guidance must be provided is to reduce the cost of recognition. To determine what guidance is appropriate, therefore, we must know what information is recoverable from the code and where the complexity of program recognition lies. I propose to study the limits of program recognition, both empirically and analytically. First, I will build an experimental system that performs recognition on realistic programs on the order of thousands of lines. This will allow me to characterize the information that can be recovered by this code-driven technique. Second, I will formally analyze the complexity of the recognition process. This will help determine how guidance can be applied most profitably to improve the efficiency of program recognition.MIT Artificial Intelligence Laborator
Non-Termination Inference of Logic Programs
We present a static analysis technique for non-termination inference of logic
programs. Our framework relies on an extension of the subsumption test, where
some specific argument positions can be instantiated while others are
generalized. We give syntactic criteria to statically identify such argument
positions from the text of a program. Atomic left looping queries are generated
bottom-up from selected subsets of the binary unfoldings of the program of
interest. We propose a set of correct algorithms for automating the approach.
Then, non-termination inference is tailored to attempt proofs of optimality of
left termination conditions computed by a termination inference tool. An
experimental evaluation is reported. When termination and non-termination
analysis produce complementary results for a logic procedure, then with respect
to the leftmost selection rule and the language used to describe sets of atomic
queries, each analysis is optimal and together, they induce a characterization
of the operational behavior of the logic procedure.Comment: Long version (algorithms and proofs included) of a paper submitted to
TOPLA
Research on knowledge representation, machine learning, and knowledge acquisition
Research in knowledge representation, machine learning, and knowledge acquisition performed at Knowledge Systems Lab. is summarized. The major goal of the research was to develop flexible, effective methods for representing the qualitative knowledge necessary for solving large problems that require symbolic reasoning as well as numerical computation. The research focused on integrating different representation methods to describe different kinds of knowledge more effectively than any one method can alone. In particular, emphasis was placed on representing and using spatial information about three dimensional objects and constraints on the arrangement of these objects in space. Another major theme is the development of robust machine learning programs that can be integrated with a variety of intelligent systems. To achieve this goal, learning methods were designed, implemented and experimented within several different problem solving environments
Automated transformation of BMF programs
Transformation is key to any program improvement process and a key to successful transformation is the medium in which it takes place. BMF (Bird-Meertens Formalism)is a medium specialised for program improvement via incremental ransformation. While much theoretical work exists, there has, to date, been a paucity of actual implementations employing BMF in an automated process of program improvement. In this paper we describe such an implementation, targeted to distributed architectures
- …