Search CORE

146 research outputs found

Constructing Abstraction Hierarchies Using a Skill-Symbol Loop

Author: Konidaris George
Publication venue
Publication date: 25/09/2015
Field of study

We describe a framework for building abstraction hierarchies whereby an agent alternates skill- and representation-acquisition phases to construct a sequence of increasingly abstract Markov decision processes. Our formulation builds on recent results showing that the appropriate abstract representation of a problem is specified by the agent's skills. We describe how such a hierarchy can be used for fast planning, and illustrate the construction of an appropriate hierarchy for the Taxi domain

arXiv.org e-Print Archive

CiteSeerX

Hybrid Bayesian Eigenobjects: Combining Linear Subspace and Deep Network Methods for 3D Robot Vision

Author: Burchfiel Benjamin
Konidaris George
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/07/2018
Field of study

We introduce Hybrid Bayesian Eigenobjects (HBEOs), a novel representation for 3D objects designed to allow a robot to jointly estimate the pose, class, and full 3D geometry of a novel object observed from a single viewpoint in a single practical framework. By combining both linear subspace methods and deep convolutional prediction, HBEOs efficiently learn nonlinear object representations without directly regressing into high-dimensional space. HBEOs also remove the onerous and generally impractical necessity of input data voxelization prior to inference. We experimentally evaluate the suitability of HBEOs to the challenging task of joint pose, class, and shape inference on novel objects and show that, compared to preceding work, HBEOs offer dramatically improved performance in all three tasks along with several orders of magnitude faster runtime performance.Comment: To appear in the International Conference on Intelligent Robots (IROS) - Madrid, 201

arXiv.org e-Print Archive

Crossref

Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations

Author: Doshi-Velez Finale
Konidaris George
Publication venue
Publication date: 15/08/2013
Field of study

Control applications often feature tasks with similar, but not identical, dynamics. We introduce the Hidden Parameter Markov Decision Process (HiP-MDP), a framework that parametrizes a family of related dynamical systems with a low-dimensional set of latent factors, and introduce a semiparametric regression approach for learning its structure from data. In the control setting, we show that a learned HiP-MDP rapidly identifies the dynamics of a new task instance, allowing an agent to flexibly adapt to task variations

arXiv.org e-Print Archive

CiteSeerX

Reinforcement Learning with Parameterized Actions

Author: Konidaris George
Masson Warwick
Ranchod Pravesh
Publication venue
Publication date: 26/11/2015
Field of study

We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions-discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We introduce the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy search in the goal-scoring and Platform domains.Comment: Accepted for AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Evolving Neural Networks for the Capture Game

Author: Konidaris George
Oren Nir
Shell Dylan
Publication venue
Publication date: 01/01/2002
Field of study

Postprin

Aberdeen University Research

CiteSeerX

Learning Parameterized Skills

Author: Barto Andrew
Da Silva Bruno
Konidaris George
Publication venue
Publication date: 01/01/2012
Field of study

We introduce a method for constructing skills capable of solving tasks drawn from a distribution of parameterized reinforcement learning problems. The method draws example tasks from a distribution of interest and uses the corresponding learned policies to estimate the topology of the lower-dimensional piecewise-smooth manifold on which the skill policies lie. This manifold models how policy parameters change as task parameters vary. The method identifies the number of charts that compose the manifold and then applies non-linear regression in each chart to construct a parameterized skill by predicting policy parameters from task parameters. We evaluate our method on an underactuated simulated robotic arm tasked with learning to accurately throw darts at a parameterized target location.Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012

arXiv.org e-Print Archive

CiteSeerX