ZOOpt: Toolbox for Derivative-Free Optimization
Recent advances in derivative-free optimization allow efficient approximation of
the global optimal solutions of sophisticated functions, such as functions with
many local optima, non-differentiable functions, and non-continuous functions. This
article describes the ZOOpt (https://github.com/eyounx/ZOOpt) toolbox, which
provides efficient derivative-free solvers and is designed to be easy to use. ZOOpt
provides a Python package for single-thread optimization, and a lightweight
distributed version, built with the help of the Julia language, for functions
described in Python. The ZOOpt toolbox focuses in particular on optimization problems in
machine learning, addressing high-dimensional, noisy, and large-scale problems.
The toolbox is being maintained with the aim of becoming a ready-to-use tool for
real-world machine learning tasks.
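To make "derivative-free" concrete, here is a minimal sketch of a solver that uses only function values, never gradients. It is a plain random-perturbation search written for illustration; it is not ZOOpt's algorithm and does not use the ZOOpt API. The names `random_search_minimize` and `bumpy` are hypothetical.

```python
import random


def random_search_minimize(objective, lower, upper, dim, budget, seed=0):
    """Minimize `objective` over the box [lower, upper]^dim using function values only."""
    rng = random.Random(seed)
    best_x = [rng.uniform(lower, upper) for _ in range(dim)]
    best_value = objective(best_x)
    step = (upper - lower) / 4.0
    for _ in range(budget - 1):
        # Propose a Gaussian perturbation of the incumbent, clipped to the box.
        candidate = [min(upper, max(lower, xi + rng.gauss(0.0, step))) for xi in best_x]
        value = objective(candidate)
        if value < best_value:
            best_x, best_value = candidate, value
        else:
            # Shrink the step slowly after a failed proposal.
            step = max(step * 0.98, 1e-6)
    return best_x, best_value


def bumpy(x):
    # The abs term is non-differentiable at 0.2; the parity term is
    # piecewise constant and therefore discontinuous.
    return sum(abs(xi - 0.2) + 0.1 * (int(10 * xi) % 2) for xi in x)


best_x, best_value = random_search_minimize(bumpy, -1.0, 1.0, dim=5, budget=2000)
print(best_x, best_value)
```

The search never asks for a derivative, so it runs unchanged on objectives like `bumpy` where gradient descent has nothing to work with. Practical toolboxes use far more sample-efficient strategies than this one.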
Differentiable Logics for Neural Network Training and Verification
The rising popularity of neural networks (NNs) in recent years and their
increasing prevalence in real-world applications have drawn attention to the
importance of their verification. While verification is known to be
computationally difficult theoretically, many techniques have been proposed for
solving it in practice. It has been observed in the literature that by default
neural networks rarely satisfy logical constraints that we want to verify. A
good course of action is to train the given NN to satisfy said constraints prior
to verifying them. This idea is sometimes referred to as continuous
verification, referring to the loop between training and verification. Usually
training with constraints is implemented by specifying a translation for a
given formal logic language into loss functions. These loss functions are then
used to train neural networks. Because for training purposes these functions
need to be differentiable, these translations are called differentiable logics
(DL). This raises several research questions. What kind of differentiable
logics are possible? What difference does a specific choice of DL make in the
context of continuous verification? What are the desirable criteria for a DL
viewed from the point of view of the resulting loss function? In this extended
abstract we will discuss and answer these questions. Comment: FOMLAS'22 pape
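As an illustration of translating a logical constraint into a loss, the sketch below maps an inequality to the size of its violation, a conjunction to a sum, and a disjunction to a product. This is one possible translation, similar in spirit to the DL2 mapping, and is an assumption made here for illustration; the abstract does not say which logics the paper studies. The function names are hypothetical.

```python
def leq(a, b):
    """Loss for the atomic constraint a <= b: zero when it holds, else the violation size."""
    return max(a - b, 0.0)


def conj(*losses):
    """Loss for a conjunction: zero only when every conjunct has zero loss."""
    return sum(losses)


def disj(*losses):
    """Loss for a disjunction: zero as soon as one disjunct has zero loss."""
    product = 1.0
    for loss in losses:
        product *= loss
    return product


def output_in_range(y, low, high):
    """Loss for the constraint low <= y <= high on a network output y."""
    return conj(leq(low, y), leq(y, high))


print(output_in_range(0.5, 0.0, 1.0))  # constraint holds, loss is 0.0
print(output_in_range(1.5, 0.0, 1.0))  # violated by 0.5, loss is 0.5
```

Each loss is non-negative and reaches zero exactly when the constraint is satisfied, and `max` is differentiable almost everywhere, so a term like this can be added to an ordinary training loss. Other translations trade off properties such as smoothness and the shape of the gradient differently.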
On Differentiable Interpreters
Neural networks have transformed the fields of Machine Learning and Artificial Intelligence with the ability to model complex features and behaviours from raw data. They quickly became instrumental models, achieving numerous state-of-the-art performances across many tasks and domains. Yet the successes of these models often rely on large amounts of data. When data is scarce, resourceful ways of using background knowledge often help. However, though different types of background knowledge can be used to bias the model, it is not clear how one can use algorithmic knowledge to that extent. In this thesis, we present differentiable interpreters as an effective framework for utilising algorithmic background knowledge as architectural inductive biases of neural networks. By continuously approximating discrete elements of traditional program interpreters, we create differentiable interpreters that, due to the continuous nature of their execution, are amenable to optimisation with gradient descent methods. This enables us to write code mixed with parametric functions, where the code strongly biases the behaviour of the model while enabling the training of parameters and/or input representations from data. We investigate two such differentiable interpreters and their use cases in this thesis. First, we present a detailed construction of ∂4, a differentiable interpreter for the programming language FORTH. We demonstrate the ability of ∂4 to strongly bias neural models with incomplete programs of variable complexity while learning missing pieces of the program with parametrised neural networks. Such models can learn to solve tasks and strongly generalise to out-of-distribution data from small datasets. Second, we present greedy Neural Theorem Provers (gNTPs), a significant improvement of a differentiable Datalog interpreter NTP. 
gNTPs ameliorate the large computational cost of recursive differentiable interpretation, achieving drastic time and memory savings while introducing soft reasoning over logic knowledge and natural language.
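The idea of continuously approximating a discrete interpreter element can be sketched with a memory read. Instead of indexing with a single integer address, the read takes a weighted average of all cells under a softmax distribution over addresses. This is a generic illustration under that assumption, not the construction used in ∂4 or gNTPs; `soft_read` is a hypothetical name.

```python
import math


def softmax(scores):
    """Turn real-valued scores into weights that are positive and sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_read(memory, address_scores):
    """Read from memory with a distribution over addresses instead of one integer index."""
    weights = softmax(address_scores)
    return sum(w * cell for w, cell in zip(weights, memory))


memory = [10.0, 20.0, 30.0]
sharp = soft_read(memory, [0.0, 0.0, 50.0])   # nearly the discrete read memory[2]
blurred = soft_read(memory, [0.0, 0.0, 0.0])  # uniform weights: the mean of the cells
print(sharp, blurred)
```

Because the result varies smoothly with `address_scores`, a gradient-based optimiser can adjust the scores, which is not possible with a hard integer index.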
Higher-order chain rules for tensor fields, generalized Bell polynomials, and estimates in Orlicz-Sobolev-Slobodeckij and bounded variation spaces
We describe higher-order chain rules for multivariate functions and tensor
fields. We estimate Sobolev-Slobodeckij norms, Musielak-Orlicz norms, and the
total variation seminorms of the higher derivatives of tensor fields after a
change of variables and determine sufficient regularity conditions for the
coordinate change. We also introduce a novel higher-order chain rule for
composition chains of multivariate functions that is described via nested set
partitions and generalized Bell polynomials; it is a natural extension of the
Faà di Bruno formula. Our discussion uses the coordinate-free language of
tensor calculus and includes Fréchet-differentiable mappings between Banach
spaces. Comment: Submitte
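For orientation, the classical univariate Faà di Bruno formula that the abstract says it extends can be written with partial Bell polynomials $B_{n,k}$ as below. This is the standard textbook statement for sufficiently smooth scalar functions $f$ and $g$; the paper's generalization to tensor fields and composition chains is not reproduced here.

```latex
\frac{d^n}{dx^n} f\bigl(g(x)\bigr)
  = \sum_{k=1}^{n} f^{(k)}\bigl(g(x)\bigr)\,
    B_{n,k}\bigl(g'(x),\, g''(x),\, \dots,\, g^{(n-k+1)}(x)\bigr)

% Special case n = 2, using B_{2,1}(x_1, x_2) = x_2 and B_{2,2}(x_1) = x_1^2:
\frac{d^2}{dx^2} f\bigl(g(x)\bigr)
  = f'\bigl(g(x)\bigr)\, g''(x) + f''\bigl(g(x)\bigr)\, \bigl(g'(x)\bigr)^2
```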