730 research outputs found

    Robust Computer Algebra, Theorem Proving, and Oracle AI

    Get PDF
    In the context of superintelligent AI systems, the term "oracle" has two meanings. One refers to modular systems queried for domain-specific tasks. Another usage, referring to a class of systems which may be useful for addressing the value alignment and AI control problems, is a superintelligent AI system that only answers questions. The aim of this manuscript is to survey contemporary research problems related to oracles which align with long-term research goals of AI safety. We examine existing question answering systems and argue that their high degree of architectural heterogeneity makes them poor candidates for rigorous analysis as oracles. On the other hand, we identify computer algebra systems (CASs) as being primitive examples of domain-specific oracles for mathematics and argue that efforts to integrate computer algebra systems with theorem provers, systems which have largely been developed independent of one another, provide a concrete set of problems related to the notion of provable safety that has emerged in the AI safety community. We review approaches to interfacing CASs with theorem provers, describe well-defined architectural deficiencies that have been identified with CASs, and suggest possible lines of research and practical software projects for scientists interested in AI safety.Comment: 15 pages, 3 figure

    A Divergence Critic for Inductive Proof

    Full text link
    Inductive theorem provers often diverge. This paper describes a simple critic, a computer program which monitors the construction of inductive proofs attempting to identify diverging proof attempts. Divergence is recognized by means of a ``difference matching'' procedure. The critic then proposes lemmas and generalizations which ``ripple'' these differences away so that the proof can go through without divergence. The critic enables the theorem prover Spike to prove many theorems completely automatically from the definitions alone.Comment: See http://www.jair.org/ for any accompanying file

    HOL(y)Hammer: Online ATP Service for HOL Light

    Full text link
    HOL(y)Hammer is an online AI/ATP service for formal (computer-understandable) mathematics encoded in the HOL Light system. The service allows its users to upload and automatically process an arbitrary formal development (project) based on HOL Light, and to attack arbitrary conjectures that use the concepts defined in some of the uploaded projects. For that, the service uses several automated reasoning systems combined with several premise selection methods trained on all the project proofs. The projects that are readily available on the server for such query answering include the recent versions of the Flyspeck, Multivariate Analysis and Complex Analysis libraries. The service runs on a 48-CPU server, currently employing in parallel for each task 7 AI/ATP combinations and 4 decision procedures that contribute to its overall performance. The system is also available for local installation by interested users, who can customize it for their own proof development. An Emacs interface allowing parallel asynchronous queries to the service is also provided. The overall structure of the service is outlined, problems that arise and their solutions are discussed, and an initial account of using the system is given

    Domain-specific functional software testing: A progress report

    Get PDF
    Software Engineering is a knowledge intensive activity that involves defining, designing, developing, and maintaining software systems. In order to build effective systems to support Software Engineering activities, Artificial Intelligence techniques are needed. The application of Artificial Intelligence technology to Software Engineering is called Knowledge-based Software Engineering (KBSE). The goal of KBSE is to change the software life cycle such that software maintenance and evolution occur by modifying the specifications and then rederiving the implementation rather than by directly modifying the implementation. The use of domain knowledge in developing KBSE systems is crucial. Our work is mainly related to one area of KBSE that is called automatic specification acquisition. One example is the WATSON prototype on which our current work is based. WATSON is an automatic programming system for formalizing specifications for telephone switching software mainly restricted to POTS, i.e., plain old telephone service. Our current approach differentiates itself from other approaches in two antagonistic ways. On the one hand, we address a large and complex real-world problem instead of a 'toy domain' as in many research prototypes. On the other hand, to allow such scaling, we had to relax the ambitious goal of complete automatic programming, to the easier task of automatic testing

    Building an IDE for the Calculational Derivation of Imperative Programs

    Full text link
    In this paper, we describe an IDE called CAPS (Calculational Assistant for Programming from Specifications) for the interactive, calculational derivation of imperative programs. In building CAPS, our aim has been to make the IDE accessible to non-experts while retaining the overall flavor of the pen-and-paper calculational style. We discuss the overall architecture of the CAPS system, the main features of the IDE, the GUI design, and the trade-offs involved.Comment: In Proceedings F-IDE 2015, arXiv:1508.0338

    MaxSAT Resolution and Subcube Sums

    Full text link
    We study the MaxRes rule in the context of certifying unsatisfiability. We show that it can be exponentially more powerful than tree-like resolution, and when augmented with weakening (the system MaxResW), p-simulates tree-like resolution. In devising a lower bound technique specific to MaxRes (and not merely inheriting lower bounds from Res), we define a new proof system called the SubCubeSums proof system. This system, which p-simulates MaxResW, can be viewed as a special case of the semialgebraic Sherali-Adams proof system. In expressivity, it is the integral restriction of conical juntas studied in the contexts of communication complexity and extension complexity. We show that it is not simulated by Res. Using a proof technique qualitatively different from the lower bounds that MaxResW inherits from Res, we show that Tseitin contradictions on expander graphs are hard to refute in SubCubeSums. We also establish a lower bound technique via lifting: for formulas requiring large degree in SubCubeSums, their XOR-ification requires large size in SubCubeSums

    A New View on Worst-Case to Average-Case Reductions for NP Problems

    Full text link
    We study the result by Bogdanov and Trevisan (FOCS, 2003), who show that under reasonable assumptions, there is no non-adaptive worst-case to average-case reduction that bases the average-case hardness of an NP-problem on the worst-case complexity of an NP-complete problem. We replace the hiding and the heavy samples protocol in [BT03] by employing the histogram verification protocol of Haitner, Mahmoody and Xiao (CCC, 2010), which proves to be very useful in this context. Once the histogram is verified, our hiding protocol is directly public-coin, whereas the intuition behind the original protocol inherently relies on private coins
    corecore