kLog: A Language for Logical and Relational Learning with Kernels
We introduce kLog, a novel approach to statistical relational learning.
Unlike standard approaches, kLog does not represent a probability distribution
directly. Rather, it is a language for performing kernel-based learning on
expressive logical and relational representations. kLog allows users to specify
learning problems declaratively. It builds on simple but powerful concepts:
learning from interpretations, entity/relationship data modeling, logic
programming, and deductive databases. Access by the kernel to the rich
representation is mediated by a technique we call graphicalization: the
relational representation is first transformed into a graph --- in particular,
a grounded entity/relationship diagram. Subsequently, a choice of graph kernel
defines the feature space. kLog supports mixed numerical and symbolic data, as
well as background knowledge in the form of Prolog or Datalog programs as in
inductive logic programming systems. The kLog framework can be applied to
tackle the same range of tasks that has made statistical relational learning so
popular, including classification, regression, multitask learning, and
collective classification. We also report on empirical comparisons showing
that kLog can be either more accurate, or much faster at the same level of
accuracy, than Tilde and Alchemy. kLog is GPLv3-licensed and is available at
http://klog.dinfo.unifi.it along with tutorials.
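The graphicalization step described above can be sketched in a few lines of Python. This is an illustrative toy, not kLog's actual Prolog-based interface: each ground relationship tuple becomes a node linked to the entities it relates (a grounded entity/relationship diagram), and a deliberately trivial graph kernel (a dot product of node-label histograms) stands in for the richer graph kernels kLog supports.

```python
# Toy sketch of graphicalization + a minimal graph kernel.
# Not kLog's API: names and representations here are illustrative.
from collections import Counter

def graphicalize(entities, relations):
    """entities: {id: label}; relations: iterable of (label, id1, id2).
    Returns a labeled graph as (node_labels, edges)."""
    node_labels = dict(entities)
    edges = []
    for rel_label, a, b in relations:
        # Each ground relationship tuple becomes its own node,
        # linked to the entities it relates.
        rel_node = (rel_label, a, b)
        node_labels[rel_node] = rel_label
        edges += [(rel_node, a), (rel_node, b)]
    return node_labels, edges

def label_histogram_kernel(g1, g2):
    """A minimal graph kernel: dot product of node-label histograms."""
    h1, h2 = Counter(g1[0].values()), Counter(g2[0].values())
    return sum(h1[label] * h2[label] for label in h1)

# Two tiny "interpretations" (learning from interpretations).
g1 = graphicalize({"a": "atom", "b": "atom"}, [("bond", "a", "b")])
g2 = graphicalize({"x": "atom", "y": "atom", "z": "atom"},
                  [("bond", "x", "y"), ("bond", "y", "z")])
print(label_histogram_kernel(g1, g2))  # 2*3 (atom) + 1*2 (bond) = 8
```

In kLog proper, the choice of graph kernel (e.g. a neighborhood subgraph pairwise distance kernel) determines the feature space over these grounded diagrams.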
On Automating the Doctrine of Double Effect
The doctrine of double effect (DDE) is a long-studied ethical
principle that governs when actions that have both positive and negative
effects are to be allowed. The goal in this paper is to automate
DDE. We briefly present DDE, and use a first-order
modal logic, the deontic cognitive event calculus, as our framework to
formalize the doctrine. We present formalizations of increasingly stronger
versions of the principle, including what is known as the doctrine of triple
effect. We then use our framework to simulate successfully scenarios that have
been used to test for the presence of the principle in human subjects. Our
framework can be used in two different modes: one can use it to build
DDE-compliant autonomous systems from scratch, or one can use it to
verify that a given AI system is DDE-compliant, by applying a DDE
layer on top of an existing system or model. For the latter mode, the
underlying AI system can be built using any architecture (planners, deep neural
networks, Bayesian networks, knowledge-representation systems, or a hybrid); as
long as the system exposes a few parameters in its model, such verification is
possible. The role of the DDE layer here is akin to a (dynamic or
static) software verifier that examines existing software modules. Finally, we
present initial work on how one can apply our DDE layer
to the STRIPS-style planning model, and to a modified POMDP model. This is
preliminary work to illustrate the feasibility of the second mode, and we hope
that our initial sketches can be useful for other researchers in incorporating
DDE in their own frameworks.
Comment: 26th International Joint Conference on Artificial Intelligence 2017; Special Track on AI & Autonom
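The "verification layer" mode can be illustrated with a small Python predicate over a few parameters an underlying system might expose about a candidate action. The field names below are hypothetical and much coarser than the paper's formalization in the deontic cognitive event calculus; the sketch only mirrors the informal DDE conditions (permissible act, good intended, harm merely foreseen, harm not a means, proportionality).

```python
# Hypothetical DDE-layer check; field names are illustrative, not the
# paper's deontic cognitive event calculus formalization.
from dataclasses import dataclass

@dataclass
class Action:
    act_is_permissible: bool    # the action itself is not forbidden
    good_is_intended: bool      # the agent intends the good effect
    bad_is_intended: bool       # harm is merely foreseen, not intended
    bad_is_means_to_good: bool  # the harm must not bring about the good
    good_utility: float
    bad_utility: float          # magnitude of the harm (>= 0)

def dde_compliant(a: Action) -> bool:
    return (a.act_is_permissible
            and a.good_is_intended
            and not a.bad_is_intended
            and not a.bad_is_means_to_good
            and a.good_utility > a.bad_utility)  # proportionality

# Classic switch-style scenario: harm is a foreseen side effect.
switch = Action(True, True, False, False, good_utility=5, bad_utility=1)
print(dde_compliant(switch))  # True
```

A system that exposes such parameters could be screened by this kind of predicate without modifying its internal architecture, which is the point of the paper's second mode.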
Kolmogorov Complexity in perspective. Part II: Classification, Information Processing and Duality
We survey diverse approaches to the notion of information: from Shannon
entropy to Kolmogorov complexity. Two of the main applications of Kolmogorov
complexity are presented: randomness and classification. The survey is divided
into two parts, published in the same volume. Part II is dedicated to the
relation between logic and information systems, within the scope of Kolmogorov
algorithmic information theory. We present a recent application of Kolmogorov
complexity: classification using compression, an idea with provocative
implementation by authors such as Bennett, Vitanyi and Cilibrasi. This stresses
how Kolmogorov complexity, besides being a foundation to randomness, is also
related to classification. Another approach to classification is also
considered: the so-called "Google classification". It uses another original and
attractive idea which is connected to the classification using compression and
to Kolmogorov complexity from a conceptual point of view. We present and unify
these different approaches to classification in terms of Bottom-Up versus
Top-Down operational modes, whose fundamental principles and underlying
duality we point out. We look at the way these two dual modes are used in
different approaches to information systems, particularly the relational model
for databases introduced by Codd in the 1970s. This allows us to point out
diverse forms of a fundamental duality. These operational modes are also
reinterpreted in the context of the comprehension schema of axiomatic set
theory ZF. This leads us to develop how Kolmogorov complexity is linked to
intensionality, abstraction, classification and information systems.
Comment: 43 pages
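The classification-by-compression idea mentioned above is concretely realized by the normalized compression distance (NCD) of Cilibrasi and Vitanyi, which approximates the uncomputable Kolmogorov-complexity distance with a real compressor. A minimal sketch using Python's zlib (a standard compressor standing in for the compressors used in their experiments):

```python
# Normalized compression distance, approximated with zlib.
# Objects from the same source compress better together, so NCD acts
# as a parameter-free similarity measure for classification/clustering.
import zlib

def C(data: bytes) -> int:
    """Length of the compressed representation (proxy for K(data))."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    cx, cy, cxy = C(x), C(y), C(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"the quick brown fox jumps over the lazy cat " * 20
c = b"import numpy as np; x = np.arange(10) ** 2 " * 20
# Two similar English-like texts are closer than English vs. code:
print(ncd(a, b), ncd(a, c))
```

Smaller NCD means more shared structure; classification proceeds by assigning an object to the class whose members it is closest to under this distance.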
Running Probabilistic Programs Backwards
Many probabilistic programming languages allow programs to be run under
constraints in order to carry out Bayesian inference. Running programs under
constraints could enable other uses such as rare event simulation and
probabilistic verification---except that all such probabilistic languages are
necessarily limited because they are defined or implemented in terms of an
impoverished theory of probability. Measure-theoretic probability provides a
more general foundation, but its generality makes finding computational content
difficult.
We develop a measure-theoretic semantics for a first-order probabilistic
language with recursion, which interprets programs as functions that compute
preimages. Preimage functions are generally uncomputable, so we derive an
abstract semantics. We implement the abstract semantics and use the
implementation to carry out Bayesian inference, stochastic ray tracing (a rare
event simulation), and probabilistic verification of floating-point error
bounds.
Comment: 26 pages, ESOP 2015 (to appear)
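The paper's abstract semantics is far more general, but the flavor of computing preimages under interval abstraction can be sketched in Python for a single primitive: refine a cover of f^{-1}([ylo, yhi]) by recursively bisecting the domain and keeping only boxes whose interval image can meet the constraint. Everything below (the function, the tolerance, the domain) is an illustrative toy.

```python
# Toy interval-abstraction preimage computation for f(x) = x**2,
# illustrating the "running programs backwards" idea on one primitive.
def sq_interval(lo, hi):
    """Exact interval image of x**2 over [lo, hi]."""
    cands = [lo * lo, hi * hi]
    m = 0.0 if lo <= 0.0 <= hi else min(cands)
    return (m, max(cands))

def preimage(lo, hi, ylo, yhi, eps=1e-3):
    """Boxes covering {x in [lo, hi] : x**2 in [ylo, yhi]}."""
    ilo, ihi = sq_interval(lo, hi)
    if ihi < ylo or ilo > yhi:        # image misses the constraint
        return []
    if (ylo <= ilo and ihi <= yhi) or hi - lo < eps:
        return [(lo, hi)]             # fully inside, or precise enough
    mid = (lo + hi) / 2               # otherwise bisect and recurse
    return (preimage(lo, mid, ylo, yhi, eps)
            + preimage(mid, hi, ylo, yhi, eps))

boxes = preimage(-3.0, 3.0, 1.0, 4.0)
# The union of boxes approximates [-2, -1] U [1, 2].
print(len(boxes), "boxes")
```

Conditioning a probabilistic program on a constraint then amounts to restricting its input measure to such a preimage, which is what makes Bayesian inference, rare-event simulation, and verification expressible in one framework.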
Structural foundations for differentiable programming
This dissertation supports the broader thesis that categorical semantics is a powerful tool to study and design programming languages.
It focuses on the foundational aspects of differentiable programming in a simply typed functional setting.
Although most of the category theory used can be boiled down to a more elementary presentation, its influence was certainly key in obtaining the results presented in this dissertation. The conciseness of certain proofs and the compactness of certain definitions and insights were facilitated by my background in category theory.
Backpropagation is the key algorithm that allows fast learning on neural networks. It enabled some of the impressive recent advancements in machine learning.
As models grow in complexity, equally complex data structures are required, which calls for the ability to go beyond standard differentiability.
This emerging generalization has been coined differentiable programming.
The idea is to allow users to write expressive programs representing (a generalization of) differentiable functions,
whose gradient computation can be automated using automatic differentiation.
In this dissertation, I lay some foundations for differentiable programming.
This is done in three ways.
Firstly, I present a simple higher-order functional language
and define automatic differentiation as a structure-preserving program transformation.
The language is given a denotational semantics using diffeological spaces,
and it is shown that the transformation is correct, i.e. that AD produces programs that do compute gradients of the original programs, using a logical relations argument.
Secondly, I extend the language from the previously described chapter to introduce new expressive program constructs such as conditionals and recursion.
In such a setting, even first-order programs may represent functions that need not be differentiable. I introduce a better-suited denotational semantics for such a language, show how to extend AD to this setting, and establish which guarantees about AD still hold.
This extended language models the more realistic needs in expressiveness that can be found in the literature, e.g. in modern probabilistic programming languages.
Thirdly, I present detailed applications of the developed theory. I first show a general recipe for extending AD to non-trivial new types and new primitives. I then show how the guarantees about AD are sufficient for usage in certain applications, such as the change of variable formula of stochastic gradient descent, but how it may not be sufficient, for instance, in simple gradient descent. Finally, more applications in the specific context of probabilistic programming are explored. First, a denotational proof that the trace semantics of a probabilistic program is almost everywhere differentiable is given. Second, a characterization of posterior distributions of probabilistic programs valued in Euclidean spaces is obtained: they have densities with respect to (w.r.t.) some sum-of-Hausdorff measure on a countable union of smooth manifolds.
Overall, these contributions give us better insights into differentiable programming.
They form a foundational setting to study the differentiability-like properties of realistic complex programs, beyond usual settings such as differentiability or convexity. They give general recipes to prove some properties of such programs and to modularly extend automatic differentiation to richer contexts with new types and primitives.
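The view of AD as a structure-preserving program transformation can be illustrated with forward-mode dual numbers: each primitive operation is paired with its derivative, and composite programs are differentiated by interpreting them over (value, tangent) pairs. This classic sketch is only the simplest instance of the transformations studied in the dissertation.

```python
# Forward-mode AD via dual numbers: a minimal, standard illustration
# of AD as a structure-preserving interpretation of programs.
from dataclasses import dataclass

@dataclass
class Dual:
    val: float   # primal value
    tan: float   # tangent (derivative w.r.t. the chosen input)

    def __add__(self, other):
        # sum rule: (f + g)' = f' + g'
        return Dual(self.val + other.val, self.tan + other.tan)

    def __mul__(self, other):
        # product rule: (f * g)' = f'g + fg'
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

def derivative(f, x):
    """Differentiate f at x by seeding the tangent with 1.0."""
    return f(Dual(x, 1.0)).tan

# d/dx (x*x + x) at x = 3 is 2*3 + 1 = 7
print(derivative(lambda x: x * x + x, 3.0))  # 7.0
```

Each operator overload preserves the program's structure while carrying derivative information alongside values, which is exactly the property the dissertation's logical-relations correctness proofs establish for richer languages.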