Numerical comparison of least square-based finite-difference (LSFD) and radial basis function-based finite-difference (RBFFD) methods
DOI: 10.1016/j.camwa.2006.04.015. Computers and Mathematics with Applications, 51(8) SPEC. ISS., 1297-1310.
A Survey on Policy Search for Robotics
Policy search is a subfield of reinforcement learning that focuses on finding good parameters for a given policy parametrization. It is well suited to robotics because it can cope with high-dimensional state and action spaces, one of the main challenges in robot learning. We review recent successes of both model-free and model-based policy search in robot learning.
Model-free policy search is a general approach to learning policies from sampled trajectories. We classify model-free methods by their policy evaluation strategy, policy update strategy, and exploration strategy, and present a unified view of existing algorithms. Learning a policy is often easier than learning an accurate forward model, and, hence, model-free methods are more frequently used in practice. However, each sampled trajectory requires interacting with the robot, which can be time-consuming and challenging in practice. Model-based policy search addresses this problem by first learning a simulator of the robot's dynamics from data. Subsequently, the simulator generates trajectories that are used for policy learning. For both model-free and model-based policy search methods, we review their respective properties and their applicability to robotic systems.
* Both authors contributed equally.
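To make the model-free setting concrete, the sketch below (not taken from the survey) runs a REINFORCE-style likelihood-ratio policy gradient on a toy one-dimensional task; the Gaussian policy, the reward function, and all parameter values are illustrative assumptions, not the survey's algorithms.

```python
import numpy as np

def reinforce_sketch(n_iters=200, n_samples=20, seed=0):
    """Toy model-free policy search: REINFORCE on a 1-D bandit-style task.

    The policy is a Gaussian N(theta, sigma^2) over a scalar action; the
    reward, unknown to the learner, is -(a - 3)^2, so the optimal policy
    mean is 3. Each iteration samples actions (stand-ins for trajectories),
    evaluates their reward, and updates theta along the likelihood-ratio
    policy gradient estimated from the samples.
    """
    rng = np.random.default_rng(seed)
    theta, sigma, lr = 0.0, 1.0, 0.05
    for _ in range(n_iters):
        actions = theta + sigma * rng.standard_normal(n_samples)
        rewards = -(actions - 3.0) ** 2
        baseline = rewards.mean()  # simple variance-reduction baseline
        # Likelihood-ratio trick: grad log pi(a) = (a - theta) / sigma^2
        grad = np.mean((rewards - baseline) * (actions - theta) / sigma**2)
        theta += lr * grad
    return theta
```

The interaction cost mentioned above shows up directly here: every gradient step consumes `n_samples` fresh rollouts, which on a physical robot would each require running the hardware.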
High-Dimensional Reinforcement Learning with Human Feedback
State-of-the-art personal robots must perform complex manipulation tasks to be viable in real-world scenarios. However, many of these robots, like the PR2, use manipulators with many degrees of freedom. High degrees of freedom are desirable from a functionality standpoint, but they make the learning task more difficult by adding a high-dimensional state space. The problem is compounded in bimanual manipulation tasks. Our proposed approach is to scale existing reinforcement learning techniques to high-dimensional robot control problems.
We propose reducing the state space by using demonstrations to discover a representative low-dimensional manifold in which to learn. This allows the agent to converge quickly to a good policy. We call this Dimensionality-Reduced Reinforcement Learning (DRRL). However, when performing dimensionality reduction, sometimes important state information is lost. We extend this work by first learning in a single dimension, and then transferring that knowledge to a higher-dimensional space. By using our Iterative DRRL (IDRRL) framework with an existing learning algorithm, the agent converges quickly to a better policy by iterating to increasingly higher dimensions. IDRRL is robust to demonstration quality and can learn efficiently using few demonstrations.
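A minimal sketch of the manifold-fitting step described above, assuming a linear PCA subspace fit to demonstration states (the function names and shapes are hypothetical; the full DRRL/IDRRL learning machinery is not reproduced):

```python
import numpy as np

def fit_demo_manifold(demo_states, k):
    """Fit a k-dimensional linear manifold to demonstration states via PCA,
    computed as an SVD of the centered demonstration matrix.

    demo_states: array of shape (n_demos, D) of full robot states.
    Returns the state mean and the top-k principal directions (k, D).
    """
    mean = demo_states.mean(axis=0)
    _, _, vt = np.linalg.svd(demo_states - mean, full_matrices=False)
    basis = vt[:k]  # rows of Vh are principal directions, largest first
    return mean, basis

def project(state, mean, basis):
    """Map a full D-dimensional state into the learned k-dimensional space,
    where the RL agent would then run its usual learning algorithm."""
    return basis @ (state - mean)
```

The iterative variant would repeat this with increasing `k`, warm-starting each stage's policy from the lower-dimensional one.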
We use Principal Component Analysis (PCA) for our linear dimensionality reduction in DRRL and IDRRL. However, linear dimensionality reduction assumes that the underlying data can be represented by a lower dimension linear subspace. Robot state spaces typically include velocities and accelerations, whose equations of motion are inherently nonlinear. Standard linear dimensionality reduction techniques cannot accurately represent complex nonlinear structures. However, nonlinear dimensionality reduction techniques are too computationally complex to use online. To overcome these limitations, we introduce a novel approach to dimensionality reduction based on a system of cascading autoencoders (CAE), producing the new algorithm IDRRL-CAE.
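As a toy illustration of nonlinear reduction, the following sketch trains a one-hidden-layer tanh autoencoder by gradient descent on reconstruction error. It is a single illustrative stage under assumed hyperparameters; IDRRL-CAE's cascade of autoencoders is not reproduced here.

```python
import numpy as np

def train_tiny_autoencoder(X, k, epochs=800, lr=0.1, seed=0):
    """Minimal nonlinear autoencoder: encode D-dim rows of X into a k-dim
    tanh code, decode linearly, and minimize mean squared reconstruction
    error with plain gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = 0.1 * rng.standard_normal((d, k)); b1 = np.zeros(k)
    W2 = 0.1 * rng.standard_normal((k, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)        # encode: nonlinear low-dim code
        Xhat = H @ W2 + b2              # decode: linear reconstruction
        E = Xhat - X                    # reconstruction error
        gW2 = H.T @ E / n; gb2 = E.mean(axis=0)
        dH = (E @ W2.T) * (1.0 - H**2)  # backprop through tanh
        gW1 = X.T @ dH / n; gb1 = dH.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return W1, b1, W2, b2
```

On data lying on a curve, such as states constrained to a parabola, the tanh code can capture structure a rank-k linear projection cannot, which is the motivation for moving beyond PCA above.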
Optimization is useful, but fast learning does not help if the objective function is deceptive or difficult to define mathematically. In many cases, roboticists cannot predict every scenario their robots may encounter, and thus cannot design an objective function for each case a priori. In these situations it can be helpful to incorporate human feedback. To give effective feedback, users need an interface that is intuitive, time-insensitive, and supports both fine-grained and coarse feedback.
To incorporate human feedback into learning, we use timeline interfaces. Timeline interfaces that let a user move backward and forward through a video have been used by video editors for years. They are simple and designed for both non-experts and video-editing experts. These interfaces allow a user to cut, concatenate, rewind, fast-forward, and perform many other operations on videos. They speed up editing by decoupling the timescale of the editing process from the timescale of the video being edited. The same concepts can be applied to human feedback mechanisms for robot control systems. Current human feedback mechanisms require the user to respond quickly to robot actions, work only in discrete spaces, or allow only coarse or only detailed feedback. The timeline interface paradigm naturally accommodates fine-grained state spaces, does not require quick human feedback, allows the user to make both coarse and fine-grained edits, and decouples the speed of the video from the speed of feedback. In this dissertation we present a proof-of-concept movie-reel interface that uses this timeline interface paradigm.