38,183 research outputs found

?????? ?????? ??????????????? ?????? ????????????

Author: Kim Sol A
Publication venue: Graduate School of UNIST
Publication date: 01/02/2020
Field of study

Department of Computer Science and Engineering

Recently, deep reinforcement learning (DRL) algorithms have shown superhuman performance in simulated game domains. In practice, sample efficiency is also one of the most important measures of a model's performance. Especially in environments with large search spaces (e.g., continuous action spaces), it is critical for achieving state-of-the-art performance. In this thesis, we design a model that is applicable to multi-end games in continuous space and has high sample efficiency. A multi-end game has several sub-games that are independent of each other but affect the result of the game through rules of its domain. We verify the algorithm in a simulated curling environment.

ScholarWorks@UNIST

On the Design of LQR Kernels for Efficient Controller Learning

Author: Hennig Philipp
Marco Alonso
Schaal Stefan
Trimpe Sebastian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance. Comment: 8 pages, 5 figures, to appear in 56th IEEE Conference on Decision and Control (CDC 2017)

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Sequential Design for Ranking Response Surfaces

Author: Hu Ruimeng
Ludkovski Mike
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 12/07/2016
Field of study

We propose and analyze sequential design methods for the problem of ranking several response surfaces. Namely, given $L \ge 2$ response surfaces over a continuous input space $\mathcal{X}$, the aim is to efficiently find the index of the minimal response across the entire $\mathcal{X}$. The response surfaces are not known and have to be noisily sampled one-at-a-time. This setting is motivated by stochastic control applications and requires joint experimental design both in space and response-index dimensions. To generate sequential design heuristics we investigate stepwise uncertainty reduction approaches, as well as sampling based on posterior classification complexity. We also make connections between our continuous-input formulation and the discrete framework of pure regret in multi-armed bandits. To model the response surfaces we utilize kriging surrogates. Several numerical examples using both synthetic data and an epidemics control problem are provided to illustrate our approach and the efficacy of respective adaptive designs. Comment: 26 pages, 7 figures (updated several sections and figures)

arXiv.org e-Print Archive

eScholarship - University of California

Learning from Distributions via Support Measure Machines

Author: Dinuzzo Francesco
Fukumizu Kenji
Muandet Krikamol
Schölkopf Bernhard
Publication venue
Publication date: 01/01/2012
Field of study

This paper presents a kernel-based discriminative learning framework on probability measures. Rather than relying on large collections of vectorial training examples, our framework learns using a collection of probability distributions that have been constructed to meaningfully represent training data. By representing these probability distributions as mean embeddings in the reproducing kernel Hilbert space (RKHS), we are able to apply many standard kernel-based learning techniques in a straightforward fashion. To accomplish this, we construct a generalization of the support vector machine (SVM) called a support measure machine (SMM). Our analyses of SMMs provide several insights into their relationship to traditional SVMs. Based on such insights, we propose a flexible SVM (Flex-SVM) that places different kernel functions on each training example. Experimental results on both synthetic and real-world data demonstrate the effectiveness of our proposed framework. Comment: Advances in Neural Information Processing Systems 2

arXiv.org e-Print Archive

CiteSeerX

MPG.PuRe

Adaptive Multiple Importance Sampling for Gaussian Processes

Author: Filippone Maurizio
Xiong Xiaoyu
Šmídl Václav
Publication venue
Publication date: 31/03/2016
Field of study

In applications of Gaussian processes where quantification of uncertainty is a strict requirement, it is necessary to accurately characterize the posterior distribution over Gaussian process covariance parameters. Normally, this is done by means of standard Markov chain Monte Carlo (MCMC) algorithms. Motivated by the issues related to the complexity of calculating the marginal likelihood that can make MCMC algorithms inefficient, this paper develops an alternative inference framework based on Adaptive Multiple Importance Sampling (AMIS). This paper studies the application of AMIS in the case of a Gaussian likelihood, and proposes the Pseudo-Marginal AMIS for non-Gaussian likelihoods, where the marginal likelihood is unbiasedly estimated. The results suggest that the proposed framework outperforms MCMC-based inference of covariance parameters in a wide range of scenarios and remains competitive for moderately large dimensional parameter spaces. Comment: 27 page

arXiv.org e-Print Archive

Enlighten