Off-policy reinforcement learning with Gaussian processes
An off-policy Bayesian nonparametric approximate reinforcement learning framework, termed GPQ, that employs a Gaussian process (GP) model of the value (Q) function is presented in both the batch and online settings. Sufficient conditions on GP hyperparameter selection are established to guarantee convergence of off-policy GPQ in the batch setting, and theoretical and practical extensions are provided for the online case. Empirical results demonstrate that GPQ has competitive learning speed in addition to its convergence guarantees and its ability to automatically choose its own basis locations. United States. Office of Naval Research (Autonomy Program N000140910625)
IEEE Transactions on Neural Networks and Learning Systems: Vol. 24, No. 12, December 2013
Canonical Correlation Analysis on Data With Censoring and Error Information - J. Sun and S. Keates
Highly Accurate Moving Object Detection in Variable Bit Rate Video-Based Traffic Monitoring Systems - S.-C. Huang and B.-H. Chen
Recurrent Neural Collective Classification - D. D. Monner and J. A. Reggia
Online Selective Kernel-Based Temporal Difference Learning - X. Chen, Y. Gao, and R. Wang
Stability and Synchronization of Discrete-Time Neural Networks With Switching Parameters and Time-Varying Delays - L. Wu, Z. Feng, and J. Lam
Artificial Endocrine Controller for Power Management in Robotic Systems - C. Sauze and M. Neal
Operator Control of Interneural Computing Machines - M.-H. Shih and F.-S. Tsai
Multiple Graph Label Propagation by Sparse Integration - M. Karasuyama and H. Mamitsuka
Universal Blind Image Quality Assessment Metrics Via Natural Scene Statistics and Multiple Kernel Learning - X. Gao, F. Gao, D. Tao, and X. Li
H∞ State Estimation for Complex Networks With Uncertain Inner Coupling and Incomplete Measurements - B. Shen, Z. Wang, D. Ding, and H. Shu
Goal Representation Heuristic Dynamic Programming on Maze Navigation - Z. Ni, H. He, J. Wen, and X. Xu
Accelerated Canonical Polyadic Decomposition Using Mode Reduction - G. Zhou, A. Cichocki, and S. Xie
Hardware Friendly Probabilistic Spiking Neural Network With Long-Term and Short-Term Plasticity - H.-Y. Hsieh and K.-T. Tang
Neural Network Architecture for Cognitive Navigation in Dynamic Environments - J. A. Villacorta-Atienza and V. A. Makarov
An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time - M. Fairbank, E. Alonso, and D. Prokhorov
Semisupervised Multitask Learning With Gaussian Processes - G. Skolidis and G. Sanguinetti
BRIEF PAPERS
Nonlinear Projection Trick in Kernel Methods: An Alternative to the Kernel Trick - N. Kwak
ANNOUNCEMENTS
IEEE WCCI 2014
Etc