3 research outputs found

The Asymptotic Equipartition Property in Reinforcement Learning and its Relation to Return Maximization

Authors: Kazushi Ikeda, Kazunori Iwata, Hideaki Sakai
Publication venue: 'Elsevier BV'
Publication date: 28/02/2023

Institutional Repositories DataBase (IRDB)

The Asymptotic Equipartition Property in Reinforcement Learning and its Relation to Return Maximization

Authors: Hideaki Sakai, Kazunori Iwata, Kazushi Ikeda

We discuss an important property, the asymptotic equipartition property, on empirical sequences in reinforcement learning. The property states that, if the number of time steps is sufficiently large, the typical set of empirical sequences has probability nearly one, all elements in the typical set are nearly equiprobable, and the number of elements in the typical set is an exponential function of the sum of conditional entropies. This sum is referred to as stochastic complexity. Using the property, we show that return maximization depends on two factors: the stochastic complexity and a quantity depending on the parameters of the environment. Here, return maximization means that the best sequences in terms of expected return have probability one. We also examine the sensitivity of the stochastic complexity, which is a qualitative guide in tuning the parameters of the action-selection strategy, and show a sufficient condition for return maximization in probability.

CiteSeerX

A sensory system for robots using evolutionary artificial neural networks.

Author: Reddipogu Ann
Publication date: 31/08/2006

The thesis presents the research involved in developing an Intelligent Vision System for an animat that can analyse a visual scene in uncontrolled environments. Inspiration was drawn from both Biological Visual Systems and Artificial Image Recognition Systems. Several Biological Systems, including the Insect, Toad and Human Visual Systems, were studied alongside popular Pattern Recognition Systems such as fully connected Feedforward Networks, Modular Neural Networks and the Neocognitron. The developed system, called the Distributed Neural Network (DNN), was based on the sensory-motor connections in the common toad, Bufo bufo. The sparsely connected network architecture has features of modularity enhanced by the presence of lateral inhibitory connections. It was implemented using Evolutionary Artificial Neural Networks (EANN). The DNN was trained using a novel method called FUSION, an amalgamation of several concepts of learning in Artificial Neural Networks such as Unsupervised Learning, Supervised Learning, Reinforcement Learning, Competitive Learning, Self-organisation and Fuzzy Logic. The DNN has unique feature-detecting capabilities. When the DNN was tested on images comprising combinations of features used in the training set, it successfully recognised the individual features, even though the combinations themselves were never used in the training set. This is a unique feature of the DNN trained using FUSION that cannot be matched by any other popular ANN architecture or training method. The system proved to be robust in dealing with new and noisy images. The unique features of the DNN make the network suitable for applications in robotics such as obstacle avoidance and terrain recognition, where the environment is unpredictable. The network can also be used in Medical Imaging, Biometrics (Face and Fingerprint Recognition), Quality Inspection in the Food Processing Industry, and other applications in uncontrolled environments.

Open Access Institutional Repository at Robert Gordon University