
    Vision-Based Learning for Real Robot: Towards RoboCup

    The authors have applied a reinforcement learning method to a mobile robot with a vision system. We selected the robot's task from among the skills needed to play soccer. In the first stage, the robot learned to shoot a ball into a goal. In the second stage, we set up an opponent just before the goal, that is, a goalkeeper, and made the robot learn to shoot the ball into the goal while avoiding the goalkeeper. This paper describes several research issues for RoboCup with real robots along with our research projects. 1 Introduction Building robots that learn to perform a task has been acknowledged as one of the major challenges facing AI and Robotics. Reinforcement learning has recently been receiving increased attention as a method for robot learning with little or no a priori knowledge and a higher capability for reactive and adaptive behaviors [5]. In the reinforcement learning scheme, a robot and an environment are modelled by two synchronised finite state automata interacting in discrete time cyclical pro..
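    The learning scheme described above maps naturally onto tabular Q-learning over a small discretized state and action space. The following is a minimal illustrative sketch, not the authors' implementation: the state encoding, the three-action set, and the learning constants are assumptions chosen for clarity.

    import random
    from collections import defaultdict

    ACTIONS = ["forward", "turn_left", "turn_right"]  # coarse motor commands (assumed)
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1             # illustrative learning constants

    Q = defaultdict(float)  # Q[(state, action)] -> estimated return

    def choose_action(state):
        """Epsilon-greedy selection over the discrete action set."""
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def update(state, action, reward, next_state):
        """One-step Q-learning backup."""
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

    # One backup for a single hypothetical transition:
    s = ("ball_center", "goal_left")  # a discretized visual state (assumed encoding)
    a = choose_action(s)
    update(s, a, reward=0.0, next_state=("ball_center", "goal_center"))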

    Vision-Based Robot Learning Towards RoboCup: Osaka University "Trackies"

    The authors have applied reinforcement learning methods to real robot tasks in several respects. We selected a soccer skill as the task for a vision-based mobile robot. In this paper, we explain two of our methods: (1) learning a shooting behavior, and (2) learning to shoot while avoiding an opponent. These behaviors were obtained by a robot in simulation and tested in a real environment at RoboCup-97. We discuss current limitations and future work along with the results of RoboCup-97. 1 Introduction Building robots that learn to perform a task in the real world has been acknowledged as one of the major challenges facing AI and Robotics. Reinforcement learning has recently been receiving increased attention as a method for robot learning with little or no a priori knowledge and a higher capability for reactive and adaptive behaviors [3]. In the reinforcement learning scheme, a robot and an environment are modeled by two synchronized finite state automata interacting in discrete time cyclic..
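    Vision-based reinforcement learning of this kind typically compresses each camera frame into a coarse discrete state before learning. The sketch below illustrates that step; the measured features and bin boundaries are assumptions for illustration, not the Trackies team's actual encoding.

    def discretize(ball_x, ball_area, goal_x, frame_width=320):
        """Map raw image measurements to a small discrete state tuple."""
        def position_bin(x):
            if x < frame_width / 3:
                return "left"
            if x < 2 * frame_width / 3:
                return "center"
            return "right"

        def size_bin(area):
            # Apparent ball size is a proxy for distance to the ball.
            if area < 100:
                return "far"
            if area < 1000:
                return "medium"
            return "near"

        return (position_bin(ball_x), size_bin(ball_area), position_bin(goal_x))

    # Example: ball centered and near, goal off to the left.
    print(discretize(ball_x=150, ball_area=1500, goal_x=40))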

    Darwinian embodied evolution of the learning ability for survival

    No full text
    Digital Object Identifier: 10.1177/1059712310397633
    In this article we propose a framework for performing embodied evolution with a limited number of robots, by utilizing time-sharing in subpopulations of virtual agents hosted in each robot. Within this framework, we explore the combination of within-generation learning of basic survival behaviors by reinforcement learning, and evolutionary adaptations over the generations of the basic behavior selection policy, the reward functions, and metaparameters for reinforcement learning. We apply a biologically inspired selection scheme, in which there is no explicit communication of the individuals' fitness information. The individuals can only reproduce offspring by mating (a pair-wise exchange of genotypes), and the probability that an individual reproduces offspring in its own subpopulation depends on the individual's "health," that is, its energy level, at the time of mating. We validate the proposed method by comparing it with evolution using standard centralized selection, in simulation, and by transferring the obtained solutions to hardware using two real robots.
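    The selection scheme can be made concrete with a short sketch: on a mating encounter, an individual produces an offspring in its own subpopulation with a probability that grows with its current energy, and the child genotype mixes both parents' genes. This is an illustrative reading of the scheme; the genotype layout, crossover, and mutation operators below are assumptions, not the paper's exact operators.

    import random

    def maybe_reproduce(agent, partner_genotype, max_energy=100.0, mut_sigma=0.05):
        """Return a child genotype, or None if mating fails this encounter."""
        # Reproduction probability scales with the agent's "health" (energy).
        if random.random() > agent["energy"] / max_energy:
            return None
        # Uniform crossover between the agent's genotype and its partner's.
        child = [a if random.random() < 0.5 else b
                 for a, b in zip(agent["genotype"], partner_genotype)]
        # Gaussian mutation on each gene.
        return [g + random.gauss(0.0, mut_sigma) for g in child]

    agent = {"energy": 70.0, "genotype": [0.2, -1.3, 0.8]}
    print(maybe_reproduce(agent, partner_genotype=[0.5, 0.1, -0.4]))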

    Competitive co-evolution of multi-layer perceptron classifiers

    This paper analyses the competitive approach to the co-evolutionary training of multi-layer perceptron classifiers. Two algorithms were tested: the first opposes a population of classifiers to a population of training patterns; the second pits a population of classifiers against a population of subsets of training patterns. The classifiers are regarded as predators that need to ‘capture’ (correctly categorise) the prey (training patterns). Success for the predators is measured by their ability to capture prey. Success for the prey is measured by their ability to escape predation (be misclassified). The aim of the procedure is to create an evolutionary tug-of-war between the best classifiers and the most difficult data samples, increasing the efficiency and accuracy of the learning process. The two co-evolutionary algorithms were tested on a number of well-known benchmarks and on several artificial data sets modelling different kinds of common classification problems such as overlapping data categories, noisy training inputs, and unbalanced data classes. The performance of the co-evolutionary methods was compared with that of two traditional training techniques: the standard backpropagation rule and a conventional evolutionary algorithm. The co-evolutionary procedures achieved top accuracy in all classification problems. They particularly excelled on data sets containing noisy training inputs, where they outperformed the backpropagation rule, and on tasks involving unbalanced data classes, where they outperformed both backpropagation and the conventional evolutionary algorithm. Compared to the standard evolutionary algorithm, the co-evolutionary procedures were able to obtain similar or superior learning accuracies, whilst needing considerably fewer presentations of the training patterns. This economy in the use of training patterns translated into significant savings in computational overhead and algorithm running time.
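    The predator-prey fitness assignment described above can be sketched in a few lines: a classifier's fitness is the fraction of sampled patterns it labels correctly, and a pattern's fitness is the fraction of sampled classifiers it escapes. The classify() interface and the sampling size below are assumed stand-ins, not the paper's exact procedure.

    import random

    def evaluate(classifiers, patterns, classify, sample_size=20):
        """Assign competitive fitness to both populations in place."""
        for clf in classifiers:
            prey = random.sample(patterns, min(sample_size, len(patterns)))
            clf["fitness"] = sum(
                classify(clf, p["x"]) == p["y"] for p in prey) / len(prey)
        for p in patterns:
            predators = random.sample(classifiers, min(sample_size, len(classifiers)))
            p["fitness"] = sum(
                classify(clf, p["x"]) != p["y"] for clf in predators) / len(predators)

    # Tiny usage with a dummy threshold "classifier" standing in for an evolved MLP:
    clfs = [{"w": 0.5}, {"w": -0.2}]
    pats = [{"x": 1.0, "y": 1}, {"x": -1.0, "y": 0}]
    evaluate(clfs, pats, classify=lambda clf, x: int(x > clf["w"]), sample_size=2)
    print([c["fitness"] for c in clfs], [p["fitness"] for p in pats])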