Automated Fixing of Programs with Contracts
This paper describes AutoFix, an automatic debugging technique that can fix
faults in general-purpose software. To provide high-quality fix suggestions and
to enable automation of the whole debugging process, AutoFix relies on the
presence of simple specification elements in the form of contracts (such as
pre- and postconditions). Using contracts enhances the precision of dynamic
analysis techniques for fault detection and localization, and for validating
fixes. The only required user input to the AutoFix supporting tool is then a
faulty program annotated with contracts; the tool produces a collection of
validated fixes for the fault ranked according to an estimate of their
suitability.
In an extensive experimental evaluation, we applied AutoFix to over 200
faults in four code bases of different maturity and quality (of implementation
and of contracts). AutoFix successfully fixed 42% of the faults, producing, in
the majority of cases, corrections of quality comparable to those competent
programmers would write; the computational resources used were modest, with an
average time per fix below 20 minutes on commodity hardware. These figures
compare favorably to the state of the art in automated program fixing, and
demonstrate that the AutoFix approach is successfully applicable to reduce the
debugging burden in real-world scenarios.
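
A minimal sketch of the idea, assuming nothing about AutoFix's actual implementation: AutoFix works on Eiffel classes with native contracts, so the Python routine, contracts, and validation harness below are invented analogues that only illustrate how pre- and postconditions expose a fault and validate a candidate fix.

# Hypothetical analogue of a contract-annotated routine (not AutoFix's input language).
def remove_first(xs):
    """Remove and return the first element of a non-empty list."""
    # Precondition (contract): the list must not be empty.
    assert len(xs) > 0, "precondition violated: empty list"
    old_len = len(xs)
    first = xs.pop(0)
    # Postcondition (contract): exactly one element was removed.
    assert len(xs) == old_len - 1, "postcondition violated"
    return first

def validate_fix(candidate, test_inputs):
    """Keep a fix candidate only if no contract is violated on any test case."""
    for xs in test_inputs:
        try:
            candidate(list(xs))
        except AssertionError:
            return False
    return True

print(validate_fix(remove_first, [[1, 2], [7]]))  # True: contracts hold on these tests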
Bioinformatics service reconciliation by heterogeneous schema transformation
This paper focuses on the problem of bioinformatics service reconciliation in a generic and scalable manner so as to enhance interoperability in a highly evolving field. Using XML as a common representation format, but also supporting existing flat-file representation formats, we propose an approach for the scalable semi-automatic reconciliation of services, possibly invoked from within a scientific workflows tool. Service reconciliation may use the AutoMed heterogeneous data integration system as an intermediary service, or may use AutoMed to produce services that mediate between services. We discuss the application of our approach for the reconciliation of services in an example bioinformatics workflow. The main contribution of this research is an architecture for the scalable reconciliation of bioinformatics services
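
A hypothetical sketch, with invented record and field names, of the kind of transformation such reconciliation relies on: converting a flat-file bioinformatics record into a common XML representation that a downstream service expects.

import xml.etree.ElementTree as ET

def flat_record_to_xml(flat_record: str) -> str:
    """Transform a '>id description' header plus sequence lines into XML."""
    header, sequence = flat_record.strip().split("\n", 1)
    seq_id, _, description = header.lstrip(">").partition(" ")
    root = ET.Element("sequence", id=seq_id)
    ET.SubElement(root, "description").text = description
    ET.SubElement(root, "residues").text = sequence.replace("\n", "")
    return ET.tostring(root, encoding="unicode")

print(flat_record_to_xml(">P12345 example protein\nMKTAYIAKQR"))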
AutoBayes: A System for Generating Data Analysis Programs from Statistical Models
Data analysis is an important scientific task which is required whenever information needs to be extracted from raw data. Statistical approaches to data analysis, which use methods from probability theory and numerical analysis, are well-founded but difficult to implement: the development of a statistical data analysis program for any given application is time-consuming and requires substantial knowledge and experience in several areas. In this paper, we describe AutoBayes, a program synthesis system for the generation of data analysis programs from statistical models. A statistical model specifies the properties for each problem variable (i.e., observation or parameter) and its dependencies in the form of a probability distribution. It is a fully declarative problem description, similar in spirit to a set of differential equations. From such a model, AutoBayes generates optimized and fully commented C/C++ code which can be linked dynamically into the Matlab and Octave environments. Code is produced by a schema-guided deductive synthesis process. A schema consists of a code template and applicability constraints which are checked against the model during synthesis using theorem proving technology. AutoBayes augments schema-guided synthesis by symbolic-algebraic computation and can thus derive closed-form solutions for many problems. It is well-suited for tasks like estimating best-fitting model parameters for the given data. Here, we describe AutoBayes's system architecture, in particular the schema-guided synthesis kernel. Its capabilities are illustrated by a number of advanced textbook examples and benchmarks
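
An illustrative analogue only: AutoBayes takes models written in its own specification language and emits C/C++, so the toy model and estimator below (names invented) merely mirror the idea that a declarative distributional assumption can be turned symbolically into a closed-form estimator.

from statistics import mean

# Declarative "model": x_i ~ N(mu, sigma^2) with sigma known, mu to be estimated.
model = {
    "constants": ["sigma"],
    "parameters": ["mu"],
    "observations": {"x": "N(mu, sigma**2)"},
}

def synthesized_estimator(x):
    """What schema-guided synthesis could derive here: the ML estimate mu_hat = mean(x)."""
    return {"mu": mean(x)}

print(synthesized_estimator([1.9, 2.1, 2.0, 2.2]))  # {'mu': 2.05}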
Automatic generation of audio content for open learning resources
This paper describes how digital talking books (DTBs) with embedded functionality for learners can be generated from content structured according to the OU OpenLearn schema. It includes examples showing how a software transformation developed from open source components can be used to remix OpenLearn content, and discusses issues concerning the generation of synthesised speech for educational purposes. Factors which may affect the quality of a learner's experience with open educational audio resources are identified, and in conclusion plans for testing the effect of these factors are outlined
HUDDL for description and archive of hydrographic binary data
Many of the attempts to introduce a universal hydrographic binary data format have failed or have been only partially successful. In essence, this is because such formats either have to simplify the data to such an extent that they only support the lowest common subset of all the formats covered, or they attempt to be a superset of all formats and quickly become cumbersome. Neither choice works well in practice. This paper presents a different approach: a standardized description of (past, present, and future) data formats using the Hydrographic Universal Data Description Language (HUDDL), a descriptive language implemented using the Extensible Markup Language (XML). That is, XML is used to provide a structural and physical description of a data format, rather than the content of a particular file. Done correctly, this opens the possibility of automatically generating both multi-language data parsers and documentation for format specification based on their HUDDL descriptions, as well as providing easy version control of them. This solution also provides a powerful approach for archiving a structural description of data along with the data, so that binary data will be easy to access in the future. Intending to provide a relatively low-effort solution to index the wide range of existing formats, we suggest the creation of a catalogue of format descriptions, each of them capturing the logical and physical specifications for a given data format (with its subsequent upgrades). A C/C++ parser code generator is used as an example prototype of one of the possible advantages of the adoption of such a hydrographic data format catalogue
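
Not actual HUDDL syntax, but a small analogue of the same idea: describe a binary record layout declaratively (the record and field names below are invented) and generate a parser from the description rather than hand-coding one per format.

import struct

# Hypothetical record description: field name plus struct format code.
ping_record = [
    ("timestamp", "d"),   # 8-byte double, seconds since epoch
    ("beam_count", "H"),  # 2-byte unsigned integer
    ("depth_m", "f"),     # 4-byte float
]

def make_parser(description, byte_order="<"):
    """Generate a parse function from a declarative field description."""
    fmt = byte_order + "".join(code for _, code in description)
    names = [name for name, _ in description]
    size = struct.calcsize(fmt)
    def parse(buffer, offset=0):
        values = struct.unpack_from(fmt, buffer, offset)
        return dict(zip(names, values)), offset + size
    return parse

parse_ping = make_parser(ping_record)
record, _ = parse_ping(struct.pack("<dHf", 1.7e9, 256, 123.4))
print(record)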
Synthesizing Certified Code
Code certification is a lightweight approach for formally demonstrating software quality. Its basic idea is to require code producers to provide formal proofs that their code satisfies certain quality properties. These proofs serve as certificates that can be checked independently. Since code certification uses the same underlying technology as program verification, it requires detailed annotations (e.g., loop invariants) to make the proofs possible. However, manually adding annotations to the code is time-consuming and error-prone. We address this problem by combining code certification with automatic program synthesis. Given a high-level specification, our approach simultaneously generates code and all annotations required to certify the generated code. We describe a certification extension of AutoBayes, a synthesis tool for automatically generating data analysis programs. Based on built-in domain knowledge, proof annotations are added and used to generate proof obligations that are discharged by the automated theorem prover E-SETHEO. We demonstrate our approach by certifying operator- and memory-safety on a data-classification program. For this program, our approach was faster and more precise than PolySpace, a commercial static analysis tool
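
A minimal sketch with invented code and property names: the paper certifies synthesized C/C++ with proof obligations discharged by E-SETHEO, whereas the Python below only makes the flavour of such annotations concrete by writing a loop invariant and the operator- and memory-safety obligations as runtime assertions.

def classify(scores, threshold):
    # Operator-safety obligation: no division by zero below.
    assert threshold != 0
    labels = []
    i = 0
    while i < len(scores):
        # Loop invariant: 0 <= i < len(scores) and labels classifies scores[0..i-1];
        # the memory-safety obligation "scores[i] is in bounds" follows from it.
        assert 0 <= i < len(scores)
        labels.append(1 if scores[i] / threshold >= 1.0 else 0)
        i += 1
    return labels

print(classify([0.2, 1.5, 0.9], 1.0))  # [0, 1, 0]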
Transitioning Applications to Semantic Web Services: An Automated Formal Approach
Semantic Web Services have been recognized as a promising technology that exhibits huge commercial potential and attracts significant attention from both industry and the research community. Despite high expectations, the industrial take-up of Semantic Web Service technologies has been slower than expected. One of the main reasons is that many systems have been developed without considering the potential of the web in integrating services and sharing resources. Without a systematic methodology and proper tool support, the migration from legacy systems to Semantic Web Service-based systems can be a very tedious and expensive process, which carries a definite risk of failure. There is an urgent need to provide strategies which allow the migration of legacy systems to Semantic Web Services platforms, and also tools to support such a strategy. In this paper we propose a methodology for transitioning these applications to Semantic Web Services by taking advantage of rigorous mathematical methods. Our methodology allows users to migrate their applications to a Semantic Web Services platform automatically or semi-automatically.
The Automatic Inference of State Invariants in TIM
As planning is applied to larger and richer domains the effort involved in
constructing domain descriptions increases and becomes a significant burden on
the human application designer. If general planners are to be applied
successfully to large and complex domains it is necessary to provide the domain
designer with some assistance in building correctly encoded domains. One way of
doing this is to provide domain-independent techniques for extracting, from a
domain description, knowledge that is implicit in that description and that can
assist domain designers in debugging domain descriptions. This knowledge can
also be exploited to improve the performance of planners: several researchers
have explored the potential of state invariants in speeding up the performance
of domain-independent planners. In this paper we describe a process by which
state invariants can be extracted from the automatically inferred type
structure of a domain. These techniques are being developed for exploitation by
STAN, a Graphplan-based planner that employs state analysis techniques to
enhance its performance.
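
An invented, simplified example of the kind of invariant involved (the domain and predicate names are not from the paper): in a logistics-style domain, type analysis can yield the invariant that every package is at exactly one location or in exactly one vehicle, which a planner can use to prune unreachable states.

def satisfies_invariant(state, packages):
    """state is a set of ground atoms such as ('at', 'pkg1', 'depot')."""
    for p in packages:
        at_facts = [a for a in state if a[0] == "at" and a[1] == p]
        in_facts = [a for a in state if a[0] == "in" and a[1] == p]
        if len(at_facts) + len(in_facts) != 1:
            return False  # state violates the inferred invariant and can be pruned
    return True

state = {("at", "pkg1", "depot"), ("in", "pkg2", "truck1")}
print(satisfies_invariant(state, ["pkg1", "pkg2"]))  # True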
An assertion language for constraint logic programs
In an advanced program development environment, such as that discussed in the introduction of this book, several tools may coexist which handle both the program and information on the program in different ways. Also, these tools may interact among themselves and with the user. Thus, the different tools and the user need some way to communicate. It is our design principle that such communication be performed in terms of assertions. Assertions are syntactic objects which allow expressing properties of programs. Several assertion languages have been used in the past in different contexts, mainly related to program debugging. In this chapter we propose a general language of assertions which is used in different tools for validation and debugging of constraint logic programs in the context of the DiSCiPl project. The assertion language proposed is parametric w.r.t. the particular constraint domain and properties of interest being used in each different tool. The language proposed is quite general in that it poses few restrictions on the kind of properties which may be expressed. We believe the assertion language we propose is of practical relevance and appropriate for the different uses required in the tools considered.
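
The assertion language itself is for constraint logic programs and its own syntax is not reproduced here; as a loose analogue only, the invented Python decorator below shows the kind of call ("on entry") and success ("on exit") properties such assertions let validation and debugging tools check.

def assertion(calls, success):
    """Attach checkable call/success properties to a predicate-like function."""
    def wrap(pred):
        def checked(*args):
            assert calls(*args), f"call assertion violated for {pred.__name__}"
            result = pred(*args)
            assert success(*args, result), f"success assertion violated for {pred.__name__}"
            return result
        return checked
    return wrap

@assertion(calls=lambda xs: isinstance(xs, list),
           success=lambda xs, ys: ys == sorted(xs))
def qsort(xs):
    return sorted(xs)

print(qsort([3, 1, 2]))  # [1, 2, 3]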