
    Proof-Pattern Recognition and Lemma Discovery in ACL2

    We present a novel technique for combining statistical machine learning for proof-pattern recognition with symbolic methods for lemma discovery. The resulting tool, ACL2(ml), gathers proof statistics, uses statistical pattern recognition to pre-process data from libraries, and then suggests auxiliary lemmas in new proofs by analogy with already seen examples. This paper presents the implementation of ACL2(ml) alongside theoretical descriptions of the proof-pattern recognition and lemma discovery methods involved in it.
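
    A minimal sketch of the general idea, not the ACL2(ml) implementation: proofs are summarised as numeric feature vectors of proof statistics, and auxiliary lemmas are suggested by analogy with the most similar already-seen proofs. The feature vectors, proof names and lemma names below are hypothetical.

# Sketch: suggest auxiliary lemmas for a new proof by analogy with
# previously seen proofs, compared via their proof-statistic vectors.
# All proofs, features and lemmas below are hypothetical examples.
import math

# library proofs: proof name -> (feature vector, auxiliary lemmas it used)
LIBRARY = {
    "rev-rev":    ([3, 1, 2], ["rev-append"]),
    "len-append": ([2, 0, 1], ["len-cons"]),
    "sort-perm":  ([5, 2, 4], ["perm-transitive", "perm-insert"]),
}

def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def suggest_lemmas(new_features, k=1):
    """Return the lemmas used by the k most similar library proofs."""
    ranked = sorted(LIBRARY.items(),
                    key=lambda item: distance(new_features, item[1][0]))
    suggestions = []
    for _, (_, lemmas) in ranked[:k]:
        suggestions.extend(lemmas)
    return suggestions

if __name__ == "__main__":
    # statistics of a new, unfinished proof (hypothetical)
    print(suggest_lemmas([3, 1, 1]))   # -> ['rev-append']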

    LiFtEr: Language to Encode Induction Heuristics for Isabelle/HOL

    Proof assistants, such as Isabelle/HOL, offer tools to facilitate inductive theorem proving. Isabelle experts know how to use these tools effectively; however, there is little tool support for transferring this expert knowledge to a wider user audience. To address this problem, we present our domain-specific language, LiFtEr. LiFtEr allows experienced Isabelle users to encode their induction heuristics in a style independent of any problem domain. LiFtEr's interpreter mechanically checks whether a given application of the induction tool matches the heuristics, thus automating the knowledge transfer loop. Comment: This is the pre-print of our paper of the same title accepted at APLAS 2019 (https://doi.org/10.1007/978-3-030-34175-6_14). We updated the draft after fixing the errata found by Kenji Miyamoto.
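
    A toy illustration of the underlying idea, not LiFtEr's syntax or semantics: each induction heuristic is encoded as a predicate over a proposed application of the induction tool, and a checker reports whether the application matches every encoded heuristic. The representation of goals and applications is invented.

# Toy illustration (not LiFtEr itself): heuristics are predicates over a
# proposed induction application; matches() checks them all mechanically.
# The goal/application representation below is invented.
HEURISTICS = []

def heuristic(fn):
    HEURISTICS.append(fn)
    return fn

@heuristic
def induct_on_recursed_argument(goal, application):
    """Induct on a variable on which some function in the goal recurses."""
    recursed = {arg for f in goal["functions"] for arg in f["recursive_args"]}
    return bool(set(application["induction_vars"]) & recursed)

@heuristic
def induction_vars_are_free(goal, application):
    """Induction variables must be free variables of the goal."""
    return set(application["induction_vars"]) <= set(goal["free_vars"])

def matches(goal, application):
    return all(h(goal, application) for h in HEURISTICS)

if __name__ == "__main__":
    goal = {
        "free_vars": ["xs", "ys"],
        "functions": [{"name": "append", "recursive_args": ["xs"]}],
    }
    print(matches(goal, {"induction_vars": ["xs"]}))   # True
    print(matches(goal, {"induction_vars": ["ys"]}))   # False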

    A heuristic-based approach to code-smell detection

    Encapsulation and data hiding are central tenets of the object-oriented paradigm. Deciding what data and behaviour to form into a class, and where to draw the line between its public and private details, can make the difference between a class that is an understandable, flexible and reusable abstraction and one which is not. This decision is a difficult one and may easily result in poor encapsulation, which can then have serious implications for a number of system qualities. It is often hard to identify such encapsulation problems within large software systems until they cause a maintenance problem (which is usually too late), and attempting to perform such analysis manually can also be tedious and error prone. Two of the common encapsulation problems that can arise as a consequence of this decomposition process are data classes and god classes. Typically, these two problems occur together: data classes are lacking in functionality that has typically been sucked into an over-complicated and domineering god class. This paper describes the architecture of a tool, developed as a plug-in for the Eclipse IDE, which automatically detects data and god classes. The technique has been evaluated in a controlled study on two large open source systems which compares the tool's results to similar work by Marinescu, who employs a metrics-based approach to detecting such features. The study provides some valuable insights into the strengths and weaknesses of the two approaches.
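
    A minimal sketch of heuristic data-class and god-class detection over simple per-class metrics; the metric names and thresholds are illustrative assumptions, not those of the tool described above or of Marinescu's detection strategies.

# Sketch of heuristic data-class / god-class detection from per-class
# metrics. Metrics and thresholds are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class ClassMetrics:
    name: str
    methods: int                  # number of methods
    accessors: int                # getters and setters
    foreign_data_accesses: int    # accesses to other classes' attributes
    complexity: int               # summed cyclomatic complexity

def is_data_class(m: ClassMetrics) -> bool:
    # mostly accessors, little behaviour of its own
    return m.methods > 0 and m.accessors / m.methods > 0.7 and m.complexity < 10

def is_god_class(m: ClassMetrics) -> bool:
    # large, complex, and heavily reliant on other classes' data
    return m.complexity > 75 and m.foreign_data_accesses > 5 and m.methods > 20

if __name__ == "__main__":
    c = ClassMetrics("OrderManager", methods=42, accessors=4,
                     foreign_data_accesses=12, complexity=180)
    print(is_god_class(c), is_data_class(c))   # True False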

    Component library retrieval using property models

    The re-use of products such as code, specifications, design decisions and documentation has been proposed as a method for increasing software productivity and reliability. A major problem that has still to be adequately solved is the storage and retrieval of re-usable 'components'. Current methods, such as keyword retrieval and catalogues, rely on the use of names to describe components or categories. This is inadequate for all but a few well established components and categories; in the majority of cases names do not convey sufficient information on which to base a decision to retrieve. One approach to this problem is to describe components using a formal specification. However, this is impractical for two reasons: firstly, the limitations of theorem proving would severely restrict the complexity of components that could be retrieved, and secondly, the retrieval mechanism would need a method of retrieving components with 'similar' specifications. This thesis proposes the use of formal 'property' models to represent the key functionality of components. Retrieval of components can then take place on the basis of a property model produced by the library's users. These models only describe the key properties of a component, thereby making the task of comparing properties feasible. Views are introduced as a method of relating similar but non-identical property models, and the use of these views facilitates the re-use of components with similar properties. The language Miramod has been developed for the purpose of describing components, and a Miramod compiler and property prover, which allow Miramod models to be compared for similarity, have been designed and implemented. These tools have indicated that model-based component library retrieval is feasible at relatively low levels of the programming process, and future work is suggested to extend the method to encompass earlier stages in the development of large systems.
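
    A minimal sketch of the retrieval idea, assuming each component is described by a small set of key properties and a query model retrieves the components that cover it best; this crude overlap measure stands in for the Miramod property prover, and the property vocabulary is invented.

# Sketch of property-model-based retrieval: components are described by key
# properties and ranked by how much of a query model they satisfy. The
# property vocabulary is invented; a real property prover would compare
# formal models rather than property sets.
LIBRARY = {
    "stack":          {"lifo", "push", "pop", "bounded"},
    "queue":          {"fifo", "enqueue", "dequeue"},
    "priority_queue": {"ordered", "enqueue", "dequeue_min"},
}

def retrieve(query_properties, min_overlap=0.5):
    """Return component names ranked by the fraction of the query they cover."""
    hits = []
    for name, props in LIBRARY.items():
        overlap = len(query_properties & props) / len(query_properties)
        if overlap >= min_overlap:
            hits.append((overlap, name))
    return [name for _, name in sorted(hits, reverse=True)]

if __name__ == "__main__":
    print(retrieve({"fifo", "enqueue", "dequeue"}))   # ['queue']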

    Using features for automated problem solving

    We motivate and present an architecture for problem solving where an abstraction layer of "features" plays the key role in determining methods to apply. The system is presented in the context of theorem proving with Isabelle, and we demonstrate how this approach to encoding control knowledge is expressively different to other common techniques. We look closely at two areas where the feature layer may offer benefits to theorem proving — semi-automation and learning — and find strong evidence that in these particular domains, the approach shows compelling promise. The system includes a graphical theorem-proving user interface for Eclipse ProofGeneral and is available from the project web page, http://feasch.heneveld.org
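
    A minimal sketch of a feature layer for method selection, assuming features are predicates over a goal and control knowledge maps feature combinations to candidate methods; the features, rules and method names are invented, not those of the Isabelle-based system above.

# Sketch of a feature layer: features are predicates over a goal, and control
# knowledge maps required feature sets to candidate proof methods. Features,
# rules and method names are invented for illustration.
FEATURES = {
    "has_arithmetic": lambda goal: any(op in goal for op in "+-*"),
    "has_quantifier": lambda goal: "ALL" in goal or "EX" in goal,
    "is_equation":    lambda goal: "=" in goal,
}

# control knowledge: required features -> method to try (first match wins)
RULES = [
    ({"has_arithmetic", "is_equation"}, "arith"),
    ({"has_quantifier"},                "blast"),
    (set(),                             "simp"),    # fallback
]

def select_method(goal):
    active = {name for name, test in FEATURES.items() if test(goal)}
    for required, method in RULES:
        if required <= active:
            return method
    return None

if __name__ == "__main__":
    print(select_method("ALL x. x + 0 = x"))   # 'arith'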

    A Survey of Paraphrasing and Textual Entailment Methods

    Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment, and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources. Comment: Technical Report, Natural Language Processing Group, Department of Informatics, Athens University of Economics and Business, Greece, 201
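
    A minimal sketch of the observation that paraphrasing can be seen as bidirectional textual entailment; the entailment recognizer here is a crude word-overlap stand-in, purely illustrative rather than any method surveyed in the paper.

# Paraphrase recognition as bidirectional entailment: a pair is a paraphrase
# when each side entails the other. The entailment test is a crude
# word-overlap proxy, for illustration only.
def entails(text, hypothesis, threshold=0.8):
    """Crude proxy: text entails hypothesis if most hypothesis words occur in text."""
    t, h = set(text.lower().split()), set(hypothesis.lower().split())
    return len(h & t) / len(h) >= threshold

def is_paraphrase(a, b):
    return entails(a, b) and entails(b, a)

if __name__ == "__main__":
    print(is_paraphrase("the cat sat on the mat", "on the mat the cat sat"))  # True
    print(is_paraphrase("the cat sat on the mat", "the cat ate the fish"))    # False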

    Applying Formal Methods to Networking: Theory, Techniques and Applications

    Despite its great importance, modern network infrastructure is remarkable for the lack of rigor in its engineering. The Internet, which began as a research experiment, was never designed to handle the users and applications it hosts today. The lack of formalization of the Internet architecture meant limited abstractions and modularity, especially for the control and management planes, thus requiring a new protocol to be built from scratch for every new need. This led to an unwieldy, ossified Internet architecture that is resistant to any attempts at formal verification, and an Internet culture where expediency and pragmatism are favored over formal correctness. Fortunately, recent work in the space of clean-slate Internet design, especially the software-defined networking (SDN) paradigm, offers the Internet community another chance to develop the right kind of architecture and abstractions. This has also led to a great resurgence of interest in applying formal methods to the specification, verification, and synthesis of networking protocols and applications. In this paper, we present a self-contained tutorial of the formidable amount of work that has been done in formal methods, and present a survey of its applications to networking. Comment: 30 pages, submitted to IEEE Communications Surveys and Tutorials.
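
    A minimal sketch of the kind of data-plane property such formal approaches verify, assuming static forwarding tables: exhaustively following forwarding entries to check reachability and loop freedom. The topology and rules are invented examples, not drawn from the survey.

# Sketch: verify reachability and loop freedom over static forwarding tables.
# The topology and forwarding rules are invented examples.
# forwarding table: switch -> {destination host: next hop}
FORWARDING = {
    "s1": {"h2": "s2"},
    "s2": {"h2": "h2"},
}

def path(src_switch, dst, max_hops=10):
    """Follow forwarding entries toward dst; return the path, or None on failure."""
    hops, node = [src_switch], src_switch
    for _ in range(max_hops):
        nxt = FORWARDING.get(node, {}).get(dst)
        if nxt is None:
            return None            # black hole: no forwarding entry
        hops.append(nxt)
        if nxt == dst:
            return hops            # packet delivered
        if nxt in hops[:-1]:
            return None            # forwarding loop detected
        node = nxt
    return None

if __name__ == "__main__":
    assert path("s1", "h2") == ["s1", "s2", "h2"]
    print("reachability and loop freedom hold for h2 from s1")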

    Structural usability techniques for dependable HCI.

    Since their invention in the middle of the twentieth century, interactive computerised systems have become more and more common to the point of ubiquity. While formal techniques have developed as tools for understanding and proving things about the behaviour of computerised systems, those that involve interaction with human users present some particular challenges which are less well addressed by traditional formal methods. There is an under-explored space where interaction and the high assurances provided by formal approaches meet. This thesis presents two techniques which fit into this space, and which can be used to automatically build and analyse formal models of the interaction behaviour of existing systems. Model discovery is a technique for building a state space-based formal model of the interaction behaviour of a running system. The approach systematically and exhaustively simulates the actions of a user of the system; this is a dynamic analysis technique which requires tight integration with the running system and (in practice) its codebase but which, when set up, can proceed entirely automatically. Theorem discovery is a technique for analysing a state space-based formal model of the interaction behaviour of a system, looking for strings of user actions that have equivalent effects across all states of the system. The approach systematically computes and compares the effects of ever-longer strings of actions, though insights can also arise from strings that are almost equivalent, and also from considering the meaning of sets of such equivalences. The thesis introduces and exemplifies each technique, considers how they may be used together, and demonstrates their utility and novelty, with case studies.
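
    A minimal sketch of both techniques on a toy interactive device (invented, not one of the thesis's case studies): model discovery exhaustively simulates user actions to build the reachable state space, and theorem discovery then searches for strings of actions with the same effect on every state.

# Sketch: model discovery (exhaustive simulation of user actions) followed by
# theorem discovery (finding action strings with identical effects on all
# states). The toy device (an on/off switch with a level 0..3) is invented.
from functools import reduce
from itertools import product

ACTIONS = ["toggle", "up", "down"]

def step(state, action):
    """Simulate one user action on the toy device."""
    on, level = state
    if action == "toggle":
        return (not on, level)
    if not on:
        return state                         # up/down ignored while off
    delta = 1 if action == "up" else -1
    return (on, (level + delta) % 4)

def discover_model(initial):
    """Exhaustively simulate actions to find every reachable state."""
    frontier, states = [initial], {initial}
    while frontier:
        s = frontier.pop()
        for a in ACTIONS:
            t = step(s, a)
            if t not in states:
                states.add(t)
                frontier.append(t)
    return states

def effect(states, actions):
    """The state-to-state function denoted by a string of actions."""
    return tuple(reduce(step, actions, s) for s in sorted(states))

def discover_theorems(states, max_len=2):
    """Pairs of action strings with identical effect on every state."""
    strings = [p for n in range(1, max_len + 1) for p in product(ACTIONS, repeat=n)]
    return [(a, b)
            for i, a in enumerate(strings)
            for b in strings[i + 1:]
            if effect(states, a) == effect(states, b)]

if __name__ == "__main__":
    states = discover_model((False, 0))
    for a, b in discover_theorems(states):
        print(a, "==", b)    # e.g. ('up', 'down') == ('down', 'up')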

    Democratizing Self-Service Data Preparation through Example Guided Program Synthesis

    The majority of real-world data we can access today have one thing in common: they are not immediately usable in their original state. Trapped in a swamp of data usability issues like non-standard data formats and heterogeneous data sources, most data analysts and machine learning practitioners have to burden themselves with "data janitor" work, writing ad-hoc Python, Perl or SQL scripts, which is tedious and inefficient. It is estimated that data scientists or analysts typically spend 80% of their time preparing data, a significant amount of human effort that could be redirected to better goals. In this dissertation, we aim to reduce this burden by harnessing knowledge such as examples and other useful hints from the end user. We develop program synthesis techniques guided by heuristics and machine learning, which effectively make data preparation less painful and more efficient for data users, particularly those with little to no programming experience. Data transformation, also called data wrangling or data munging, is an important task in data preparation, seeking to convert data from one format to a different (often more structured) format. Our system Foofah shows that allowing end users to describe their desired transformation by providing small input-output transformation examples can significantly reduce the overall user effort. The underlying program synthesizer can often find meaningful data transformation programs within a reasonably short amount of time. Our second system, CLX, demonstrates that sometimes the user does not even need to provide complete input-output examples, but only to label values that are desirable, if they exist in the original dataset. The system is still capable of suggesting reasonable and explainable transformation operations to fix the non-standard data format issue in a dataset full of heterogeneous data with varied formats. PRISM, our third system, targets the data preparation task of data integration, i.e., combining multiple relations to formulate a desired schema. PRISM allows the user to describe the target schema using not only high-resolution (precise) constraints in the form of complete example data records in the target schema, but also (imprecise) constraints of varied resolutions, such as incomplete example records with missing values, value ranges, or multiple possible values in each element (cell), so as to require less familiarity with the database contents from the end user. PhD dissertation, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163059/1/markjin_1.pd
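
    A tiny programming-by-example sketch in the spirit of the systems described above, not the Foofah, CLX or PRISM implementations: enumerate compositions of a few string operators until one maps every example input to its example output. The operators and examples are invented.

# Programming-by-example sketch: search over compositions of string operators
# for the shortest program consistent with the user's examples. Operators and
# examples are invented for illustration.
from itertools import product

OPERATORS = {
    "strip":      str.strip,
    "lower":      str.lower,
    "swap_comma": lambda s: " ".join(reversed(s.split(", "))) if ", " in s else s,
    "title":      str.title,
}

def synthesize(examples, max_depth=3):
    """Return the shortest operator sequence mapping each input to its output."""
    for depth in range(1, max_depth + 1):
        for names in product(OPERATORS, repeat=depth):
            def run(s, names=names):
                for n in names:
                    s = OPERATORS[n](s)
                return s
            if all(run(i) == o for i, o in examples):
                return names
    return None

if __name__ == "__main__":
    examples = [("  DOE, JOHN ", "John Doe"), ("smith, anna", "Anna Smith")]
    print(synthesize(examples))   # -> ('strip', 'swap_comma', 'title')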