d(Tree)-by-dx : automatic and exact differentiation of genetic programming trees
Genetic programming (GP) has developed to the point where it is a credible candidate for the "black box" modeling of real systems. Wider application, however, could greatly benefit from its seamless embedding in conventional optimization schemes, which are most efficiently carried out using gradient-based methods. This paper describes the development of a method to automatically differentiate GP trees using a series of tree transformation rules; the resulting method can be applied an unlimited number of times to obtain higher derivatives of the function approximated by the original, trained GP tree. We demonstrate the utility of our method using a number of illustrative gradient-based optimizations that embed GP models.
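The idea of differentiating a tree by rewriting it can be illustrated with a minimal sketch. The tuple encoding, the operator set, and the names `differentiate` and `evaluate` are assumptions made for this example and are not taken from the paper; the rules shown are the ordinary sum, product, and chain rules.

```python
import math

# Assumed encoding: a tree is a nested tuple such as
# ("x",), ("const", 2.0), ("add", a, b), ("mul", a, b), ("sin", a), ("cos", a).

def differentiate(tree):
    """Return a new tree representing d(tree)/dx via rewrite rules."""
    op = tree[0]
    if op == "x":
        return ("const", 1.0)
    if op == "const":
        return ("const", 0.0)
    if op == "add":  # sum rule
        return ("add", differentiate(tree[1]), differentiate(tree[2]))
    if op == "mul":  # product rule
        return ("add",
                ("mul", differentiate(tree[1]), tree[2]),
                ("mul", tree[1], differentiate(tree[2])))
    if op == "sin":  # chain rule
        return ("mul", ("cos", tree[1]), differentiate(tree[1]))
    if op == "cos":  # chain rule
        return ("mul",
                ("mul", ("const", -1.0), ("sin", tree[1])),
                differentiate(tree[1]))
    raise ValueError(f"unknown operator: {op}")

def evaluate(tree, x):
    """Evaluate a tree at the point x."""
    op = tree[0]
    if op == "x":
        return x
    if op == "const":
        return tree[1]
    if op == "add":
        return evaluate(tree[1], x) + evaluate(tree[2], x)
    if op == "mul":
        return evaluate(tree[1], x) * evaluate(tree[2], x)
    if op == "sin":
        return math.sin(evaluate(tree[1], x))
    if op == "cos":
        return math.cos(evaluate(tree[1], x))
    raise ValueError(f"unknown operator: {op}")

# f(x) = x*x + sin(x); the output of differentiate is itself a tree,
# so the same function can be applied again for higher derivatives.
f = ("add", ("mul", ("x",), ("x",)), ("sin", ("x",)))
df = differentiate(f)
d2f = differentiate(df)
```

Because each rule returns a tree in the same encoding, repeated application yields derivatives of any order, at the cost of tree growth unless a simplification pass is added.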
Alpha MAML: Adaptive model-agnostic meta-learning
Model-agnostic meta-learning (MAML) is a meta-learning technique that trains a model on a multitude of learning tasks in a way that primes the model for few-shot learning of new tasks. The MAML algorithm performs well on few-shot learning problems in classification, regression, and fine-tuning of policy gradients in reinforcement learning, but requires costly hyperparameter tuning for training stability. We address this shortcoming by introducing an extension to MAML, called Alpha MAML, that incorporates an online hyperparameter adaptation scheme and eliminates the need to tune the meta-learning and learning rates. Our results with the Omniglot database demonstrate a substantial reduction in the need to tune MAML training hyperparameters and improved training stability, with less sensitivity to hyperparameter choice.
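The abstract does not spell out the adaptation scheme. As a rough illustration of what online learning-rate adaptation can look like, the sketch below applies a hypergradient-style update to plain gradient descent on a toy quadratic. The function name, the toy objective, and the constants are all assumptions for this example; this is not the Alpha MAML algorithm itself.

```python
def adaptive_gradient_descent(x0, lr0, hyper_lr, steps):
    """Minimize f(x) = x**2 while adapting the learning rate online.

    The learning rate is nudged by the product of the current and the
    previous gradient: it grows while successive gradients agree in sign
    and shrinks when they disagree (a sign of overshooting).
    """
    x, lr = x0, lr0
    prev_grad = 0.0
    for _ in range(steps):
        grad = 2.0 * x                    # gradient of x**2
        lr += hyper_lr * grad * prev_grad  # online learning-rate update
        x -= lr * grad                    # ordinary gradient step
        prev_grad = grad
    return x, lr

# Start from a deliberately small learning rate and let it adapt.
x_final, lr_final = adaptive_gradient_descent(x0=5.0, lr0=0.01, hyper_lr=0.001, steps=50)
```

In this toy run the learning rate grows from its small initial value during the first few steps and then settles, so the initial choice matters much less than it would with a fixed rate.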
Inference compilation and universal probabilistic programming
We introduce a method for using deep neural networks to amortize the cost of inference in models from the family induced by universal probabilistic programming languages, establishing a framework that combines the strengths of probabilistic programming and deep learning methods. We call what we do “compilation of inference” because our method transforms a denotational specification of an inference problem, in the form of a probabilistic program written in a universal programming language, into a trained neural network denoted in a neural network specification language. When this neural network is fed observational data and executed at test time, it performs approximate inference in the original model specified by the probabilistic program. Our training objective and learning procedure are designed to allow the trained neural network to be used as a proposal distribution in a sequential importance sampling inference engine. We illustrate our method on mixture models and Captcha solving and show significant speedups in the efficiency of inference.
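The role of a proposal distribution can be sketched on a toy conjugate model, using single-step self-normalized importance sampling rather than the sequential engine the abstract describes. Everything here is an assumption for illustration: the model (mu ~ N(0, 1), y ~ N(mu, 1)), the function names, and the hand-written `proposal_params`, which stands in for the output of a trained network.

```python
import math
import random

def log_normal_pdf(value, mean, std):
    """Log density of a univariate normal distribution."""
    return (-0.5 * ((value - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2.0 * math.pi))

def proposal_params(y):
    """Hypothetical stand-in for a trained network: maps the observation
    to the mean and standard deviation of a Gaussian proposal."""
    return 0.4 * y, 1.0

def posterior_mean_estimate(y, num_samples, rng):
    """Self-normalized importance sampling estimate of E[mu | y]."""
    q_mean, q_std = proposal_params(y)
    samples, log_weights = [], []
    for _ in range(num_samples):
        mu = rng.gauss(q_mean, q_std)
        # weight = prior * likelihood / proposal, computed in log space
        log_w = (log_normal_pdf(mu, 0.0, 1.0)
                 + log_normal_pdf(y, mu, 1.0)
                 - log_normal_pdf(mu, q_mean, q_std))
        samples.append(mu)
        log_weights.append(log_w)
    top = max(log_weights)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - top) for lw in log_weights]
    return sum(w * s for w, s in zip(weights, samples)) / sum(weights)

# For this model the exact posterior is N(y / 2, 1 / 2), so the estimate
# for y = 2 should land near 1.
estimate = posterior_mean_estimate(2.0, 5000, random.Random(0))
```

The closer the proposal is to the true posterior, the more even the weights and the fewer samples are needed, which is the efficiency gain that a learned proposal aims for.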
KL guided domain adaptation
Domain adaptation is an important problem and often needed for real-world applications. In this problem, instead of i.i.d. training and testing datapoints, we assume that the source (training) data and the target (testing) data have different distributions. In that setting, the empirical risk minimization training procedure often does not perform well, since it does not account for the change in the distribution. A common approach in the domain adaptation literature is to learn a representation of the input that has the same (marginal) distribution over the source and the target domain. However, these approaches often require additional networks and/or optimizing an adversarial (minimax) objective, which can be very expensive or unstable in practice. To improve upon these marginal alignment techniques, in this paper, we first derive a generalization bound for the target loss based on the training loss and the reverse Kullback-Leibler (KL) divergence between the source and the target representation distributions. Based on this bound, we derive an algorithm that minimizes the KL term to obtain better generalization to the target domain. We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples without any additional network or a minimax objective. This leads to a theoretically sound alignment method which is also very efficient and stable in practice. Experimental results also suggest that our method outperforms other representation-alignment approaches.
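One plausible reading of the minibatch estimate can be sketched as follows. The sketch assumes that the representation network outputs a diagonal Gaussian per input, that each domain's marginal is approximated by an equal-weight mixture over the minibatch, and that the KL direction is KL(target || source); these choices and the function names are assumptions for illustration, not details confirmed by the abstract.

```python
import math
import random

def log_mixture_density(z, means, stds):
    """Log density of z under an equal-weight mixture of diagonal Gaussians,
    one component per minibatch element."""
    log_terms = []
    for mu, sigma in zip(means, stds):
        lp = 0.0
        for zi, m, s in zip(z, mu, sigma):
            lp += (-0.5 * ((zi - m) / s) ** 2
                   - math.log(s) - 0.5 * math.log(2.0 * math.pi))
        log_terms.append(lp)
    top = max(log_terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in log_terms) / len(log_terms))

def minibatch_kl(tgt_means, tgt_stds, src_means, src_stds, rng):
    """Monte Carlo estimate of KL(p_target(z) || p_source(z)) using one
    sampled representation per target input."""
    total = 0.0
    for mu, sigma in zip(tgt_means, tgt_stds):
        z = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        total += (log_mixture_density(z, tgt_means, tgt_stds)
                  - log_mixture_density(z, src_means, src_stds))
    return total / len(tgt_means)

# Toy encoder outputs: 32 inputs per domain, 2-dimensional representations.
batch = 32
tgt_means = [[0.0, 0.0] for _ in range(batch)]
src_means = [[3.0, 3.0] for _ in range(batch)]
unit_stds = [[1.0, 1.0] for _ in range(batch)]

kl_shifted = minibatch_kl(tgt_means, unit_stds, src_means, unit_stds, random.Random(0))
kl_aligned = minibatch_kl(tgt_means, unit_stds, tgt_means, unit_stds, random.Random(0))
```

The estimate needs only the encoder outputs for the two minibatches, which is why no discriminator or minimax objective is involved; in training it would be added to the task loss as a penalty.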
Identification of lack of knowledge using analytical redundancy applied to structural dynamic systems
Reliability of sensor information in today’s highly automated systems is crucial. Neglected and unquantifiable uncertainties lead to a lack of knowledge, which results in erroneous interpretation of sensor data. Physical redundancy is an often-used approach to reduce the impact of lack of knowledge, but in many cases it is infeasible and gives no absolute certainty about which sensors and models to trust. However, structural models can link spatially distributed sensors to create analytical redundancy. Because it uses existing sensor data and models, analytical redundancy leaves the structural behavior unchanged and is cost-efficient. The detection of conflicting data using analytical redundancy reveals lack of knowledge, e.g. in sensors or models, and supports the inference from conflict to cause. We present an approach to enforce analytical redundancy by using an information model of the technical system that formalizes sensors, physical models and the corresponding uncertainty in a unified framework. This allows for continuous validation of models and verification of sensor data. The approach is applied to a structural dynamic system with various sensors, based on an aircraft landing gear system.
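A consistency check based on analytical redundancy can be sketched with two sensors linked by a simple model. The linear spring model F = k * x, the worst-case uncertainty propagation, the function name, and all numbers are assumptions chosen for illustration; the paper's information model is not reproduced here.

```python
def is_consistent(force, displacement, stiffness,
                  force_unc, displacement_unc, stiffness_unc):
    """Check whether a force sensor and a displacement sensor agree,
    given a model that links them and the stated uncertainties.

    The model predicts the force as stiffness * displacement. The two
    data sources are treated as consistent if the residual lies within
    the combined (first-order, worst-case) uncertainty.
    """
    predicted = stiffness * displacement
    predicted_unc = (abs(stiffness) * displacement_unc
                     + abs(displacement) * stiffness_unc)
    residual = abs(force - predicted)
    return residual <= force_unc + predicted_unc

# Model prediction: 1e5 N/m * 0.01 m = 1000 N, with a combined bound of 30 N.
agreeing = is_consistent(1000.0, 0.01, 1e5, 10.0, 1e-4, 1e3)
conflicting = is_consistent(1200.0, 0.01, 1e5, 10.0, 1e-4, 1e3)
```

A failed check indicates a conflict but not its cause: the force sensor, the displacement sensor, or the stiffness model may be at fault, which is where further inference from conflict to cause comes in.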