Semantic Foundations of Higher-Order Probabilistic Programs in Isabelle/HOL
Higher-order probabilistic programs are used to describe statistical models and machine-learning mechanisms. The programming languages for them are equipped with three features: higher-order functions, sampling, and conditioning. In this paper, we propose an Isabelle/HOL library for probabilistic programs supporting all three of those features. We extend our previous quasi-Borel theory library in Isabelle/HOL. As a basis of the theory, we formalize s-finite kernels, which are considered a theoretical foundation of first-order probabilistic programs and a key to supporting conditioning of probabilistic programs. We also formalize the Borel isomorphism theorem, which plays an important role in the quasi-Borel theory. Using them, we develop the s-finite measure monad on quasi-Borel spaces. Our extension enables us to describe higher-order probabilistic programs with conditioning directly as an Isabelle/HOL term whose type is that of morphisms between quasi-Borel spaces. We also implement the qbs prover for checking well-typedness of an Isabelle/HOL term as a morphism between quasi-Borel spaces. We demonstrate several verification examples of higher-order probabilistic programs with conditioning.
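The three features named in this abstract (higher-order functions, sampling, and conditioning) can be illustrated informally outside Isabelle/HOL. The following is a minimal Python sketch, not the paper's library or its semantics: it assumes a toy linear model with made-up data, a Gaussian prior on the slope, and likelihood weighting as the inference method. All names (`model`, `observe`, `posterior_mean_at`) are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical observations (x, y), invented for this illustration only.
DATA = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]


def model(rng, observe):
    """One weighted run of the program; returns (function, weight)."""
    slope = rng.gauss(0.0, 3.0)            # sampling: draw the slope from a prior
    weight = 1.0
    for x, y in DATA:
        weight *= observe(y, slope * x)     # conditioning: score each observation
    return (lambda x: slope * x), weight    # higher-order: the result is a function


def posterior_mean_at(x, runs=20000, seed=0):
    """Estimate the posterior mean of f(x) by likelihood weighting."""
    rng = random.Random(seed)
    observe = lambda y, mean: normal_pdf(y, mean, 0.5)  # assumed noise level 0.5
    total = 0.0
    norm = 0.0
    for _ in range(runs):
        f, w = model(rng, observe)
        total += w * f(x)
        norm += w
    return total / norm
```

Calling `posterior_mean_at(1.0)` estimates the posterior mean of the slope itself. The program returns a random function rather than a number, which is the situation where ordinary measurable-space semantics runs into difficulty and quasi-Borel spaces are used instead.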
Formal verification of higher-order probabilistic programs
Probabilistic programming provides a convenient lingua franca for writing
succinct and rigorous descriptions of probabilistic models and inference tasks.
Several probabilistic programming languages, including Anglican, Church or
Hakaru, derive their expressiveness from a powerful combination of continuous
distributions, conditioning, and higher-order functions. Although very
important for practical applications, these combined features raise fundamental
challenges for program semantics and verification. Several recent works offer
promising answers to these challenges, but their primary focus is on semantical
issues.
In this paper, we take a step further and we develop a set of program logics,
named PPV, for proving properties of programs written in an expressive
probabilistic higher-order language with continuous distributions and operators
for conditioning distributions by real-valued functions. Pleasingly, our
program logics retain the comfortable reasoning style of informal proofs thanks
to carefully selected axiomatizations of key results from probability theory.
The versatility of our logics is illustrated through the formal verification of
several intricate examples from statistics, probabilistic inference, and
machine learning. We further show the expressiveness of our logics by giving
sound embeddings of existing logics. In particular, we do this in a parametric
way by showing how the semantic idea of (unary and relational) ⊤⊤-lifting can
be internalized in our logics. The soundness of PPV follows by interpreting
programs and assertions in quasi-Borel spaces (QBS), a recently proposed
variant of Borel spaces with a good structure for interpreting higher-order
probabilistic programs.
Learning to Prove Theorems via Interacting with Proof Assistants
Humans prove theorems by relying on substantial high-level reasoning and
problem-specific insights. Proof assistants offer a formalism that resembles
human mathematical reasoning, representing theorems in higher-order logic and
proofs as high-level tactics. However, human experts have to construct proofs
manually by entering tactics into the proof assistant. In this paper, we study
the problem of using machine learning to automate the interaction with proof
assistants. We construct CoqGym, a large-scale dataset and learning environment
containing 71K human-written proofs from 123 projects developed with the Coq
proof assistant. We develop ASTactic, a deep learning-based model that
generates tactics as programs in the form of abstract syntax trees (ASTs).
Experiments show that ASTactic trained on CoqGym can generate effective tactics
and can be used to prove new theorems not previously provable by automated
methods. Code is available at https://github.com/princeton-vl/CoqGym.
Comment: Accepted to ICML 201
Logic Programs as Declarative and Procedural Bias in Inductive Logic Programming
Machine Learning is necessary for the development of Artificial Intelligence, as pointed out by Turing in his 1950 article "Computing Machinery and Intelligence". It is in the same article that Turing suggested the use of computational logic and background knowledge for learning. This thesis follows a logic-based machine learning approach called Inductive Logic Programming (ILP), which has advantages over other machine learning approaches in relational learning and in utilising background knowledge. ILP uses logic programs as a uniform representation for hypotheses, background knowledge and examples, but its declarative bias is usually encoded using metalogical statements. This thesis advocates the use of logic programs to represent declarative and procedural bias, which results in a framework of single-language representation.
We show in this thesis that using a logic program called the top theory as declarative bias leads to a sound and complete multi-clause learning system, MC-TopLog. It overcomes the entailment-incompleteness of Progol, and thus outperforms Progol in terms of predictive accuracy on learning grammars and strategies for playing the game of Nim. MC-TopLog has been applied to two real-world applications funded by Syngenta, an agriculture company.
A higher-order extension of top theories results in meta-interpreters, which allow the introduction of new predicate symbols. Thus the resulting ILP system Metagol can do predicate invention, which is an intrinsically higher-order logic operation. Metagol also leverages the procedural semantics of Prolog to encode procedural bias, so that it can outperform both its ASP version and ILP systems without an equivalent procedural bias in terms of efficiency and accuracy. This is demonstrated by the experiments on learning Regular, Context-free and Natural grammars. Metagol is also applied to non-grammar learning tasks involving recursion and predicate invention, such as learning a definition of staircases and robot strategy learning. Both MC-TopLog and Metagol are based on a ⊤-directed framework, which is different from other multi-clause learning systems based on Inverse Entailment, such as CF-Induction, XHAIL and IMPARO. Compared to another ⊤-directed multi-clause learning system, TAL, Metagol allows the explicit form of higher-order assumption to be encoded in the form of meta-rules.
Constructive approaches to Program Induction
Search is a key technique in artificial intelligence, machine learning and Program Induction. No
matter how efficient a search procedure, there exist spaces that are too large to search effectively
and they include the search space of programs. In this dissertation we show that in the context
of logic-program induction (Inductive Logic Programming, or ILP) it is not necessary to search
for a correct program, because if one exists, there also exists a unique object that is the most
general correct program, and that can be constructed directly, without a search, in polynomial
time and from a polynomial number of examples. The existence of this unique object, that we
term the Top Program because of its maximal generality, does not so much solve the problem
of searching a large program search space, as it completely sidesteps it, thus improving the
efficiency of the learning task by orders of magnitude commensurate with the complexity of a
program space search.
The existence of a unique Top Program and the ability to construct it given finite resources
relies on the imposition, on the language of hypotheses, from which programs are constructed,
of a strong inductive bias with relevance to the learning task. In common practice, in machine
learning, Program Induction and ILP, such relevant inductive bias is selected, or created,
manually, by the human user of a learning system, with intuition or knowledge of the problem
domain, and in the form of various kinds of program templates. In this dissertation we show
that by abandoning the reliance on such extra-logical devices as program templates, and instead
defining inductive bias exclusively as First- and Higher-Order Logic formulae, it is possible to
learn inductive bias itself from examples, automatically, and efficiently, by Higher-Order Top
Program construction.
In Chapter 4 we describe the Top Program in the context of the Meta-Interpretive Learning
approach to ILP (MIL) and describe an algorithm for its construction, the Top Program
Construction algorithm (TPC). We prove the efficiency and accuracy of TPC and describe
its implementation in a new MIL system called Louise. We support theoretical results with
experiments comparing Louise to the state-of-the-art, search-based MIL system, Metagol, and
find that Louise improves on Metagol’s efficiency and accuracy. In Chapter 5 we re-frame MIL as
specialisation of metarules, Second-Order clauses used as inductive bias in MIL, and prove that
problem-specific metarules can be derived by specialisation of maximally general metarules, by
MIL. We describe a sub-system of Louise, called TOIL, that learns new metarules by MIL and
demonstrate empirically that the metarules learned by TOIL match those selected manually,
while maintaining the accuracy and efficiency of learning.
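The idea of constructing a most general correct program directly, rather than searching for one, can be sketched on a toy problem. The following Python sketch is not Louise or the TPC algorithm as defined in the dissertation. It assumes a simplified two-step reading of Top Program construction: collect every metarule instance that covers some positive example, then discard every instance that covers a negative example. It also assumes only two metarules, no recursion, and invented background knowledge; all names are hypothetical.

```python
from itertools import product

# Hypothetical toy background knowledge: predicate name -> set of (x, y) facts.
BK = {
    "parent": {("ann", "bob"), ("bob", "cat")},
    "ancestor": {("ann", "bob"), ("bob", "cat"), ("ann", "cat")},
}
CONSTANTS = {c for facts in BK.values() for pair in facts for c in pair}


def instances():
    """Enumerate every instance of two metarules over the BK predicates.

    identity: P(x, y) <- Q(x, y)
    chain:    P(x, y) <- Q(x, z), R(z, y)
    """
    for q in sorted(BK):
        yield ("identity", q)
    for q, r in product(sorted(BK), repeat=2):
        yield ("chain", q, r)


def covers(clause, example):
    """True if the clause body is satisfied by the BK for this example."""
    x, y = example
    if clause[0] == "identity":
        return (x, y) in BK[clause[1]]
    _, q, r = clause
    return any((x, z) in BK[q] and (z, y) in BK[r] for z in CONSTANTS)


def top_program(positives, negatives):
    """Generalise to all covering clauses, then remove inconsistent ones."""
    generalised = [c for c in instances() if any(covers(c, e) for e in positives)]
    return [c for c in generalised if not any(covers(c, e) for e in negatives)]


# Target concept (assumed): grandparent, with one positive and one negative example.
top = top_program(positives=[("ann", "cat")], negatives=[("ann", "bob")])
```

On this data the clause `("identity", "ancestor")` covers the positive example but also the negative one, so it is dropped, while the four `chain` instances remain. No hypothesis space is searched: each candidate clause is tested once against the examples.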
Relay: A New IR for Machine Learning Frameworks
Machine learning powers diverse services in industry including search,
translation, recommendation systems, and security. The scale and importance of
these models require that they be efficient, expressive, and portable across an
array of heterogeneous hardware devices. These constraints are often at odds;
in order to better accommodate them we propose a new high-level intermediate
representation (IR) called Relay. Relay is being designed as a
purely-functional, statically-typed language with the goal of balancing
efficient compilation, expressiveness, and portability. We discuss the goals of
Relay and highlight its important design constraints. Our prototype is part of
the open source NNVM compiler framework, which powers Amazon's deep learning
framework MxNet.