72 research outputs found

    Reinforcement Learning: A Survey

    This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
    Comment: See http://www.jair.org/ for any accompanying file
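    The exploration/exploitation trade-off and learning from delayed reinforcement that the survey highlights are commonly illustrated with tabular Q-learning. The sketch below is a minimal, generic example; the `env` interface with `reset()`, `step()`, and `actions` is an illustrative assumption, not anything defined in the survey.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning with epsilon-greedy exploration.

    `env` is assumed to expose reset() -> state, step(action) -> (state, reward, done),
    and a discrete action list `env.actions` (an assumed interface, for illustration only).
    """
    Q = defaultdict(float)  # Q[(state, action)] -> estimated long-term return

    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Exploration/exploitation trade-off: act randomly with probability epsilon.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(state, a)])

            next_state, reward, done = env.step(action)

            # Temporal-difference update: credit the (possibly delayed) reward
            # back to the state-action pair that was just taken.
            best_next = max(Q[(next_state, a)] for a in env.actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```

    The epsilon parameter governs how often the agent explores at random instead of exploiting its current value estimates; the temporal-difference update propagates delayed rewards back through the visited state-action pairs.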

    Data driven modeling using reinforcement learning in autonomous agents

    Thesis (Master)--Izmir Institute of Technology, Mechanical Engineering, Izmir, 2003. Includes bibliographical references (leaves: 61-66). Text in English; abstract in Turkish and English. vi, 75 leaves.
    This research aspired to build a system capable of solving problems by means of its past experience, in particular an autonomous agent that can learn from trial-and-error sequences. To achieve this, connectionist neural network architectures are combined with reinforcement learning methods, and the credit assignment scheme in multilayer perceptron (MLP) architectures is altered. In classical credit assignment, the actual output of the system is compared with previously known data that the system tries to approximate, and the discrepancy between them is minimized. Temporal-difference credit assignment, by contrast, depends on temporally successive outputs; with this method it is more feasible to learn the relation between successive events rather than only their final consequences. The thesis also modifies the k-means algorithm, and several architectures are implemented in C++: backpropagation, radial basis function networks, radial basis function link nets, self-organizing neural networks, and k-means. Combining these, reinforcement learning, temporal-difference learning, and Q-learning architectures were realized, and all of these algorithms were simulated in C++. The reinforcement learning methods used have two main disadvantages when building an autonomous agent: training time is too long, and too many input parameters are needed to train the system. Hardware implementation is therefore not yet feasible, and further research is considered necessary.
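    To make the contrast the abstract draws concrete, here is a minimal sketch of classical versus temporal-difference credit assignment. The tabular value function is a simplified stand-in for the MLP-based approximators the thesis actually implements in C++; the function names are illustrative, not taken from the thesis.

```python
def supervised_error(prediction, target):
    """Classical credit assignment: compare the output with previously known data."""
    return target - prediction

def td_error(reward, value_now, value_next, gamma=0.9):
    """Temporal-difference credit assignment: compare temporally successive outputs.

    The target is bootstrapped from the next prediction rather than taken from
    known data, so delayed outcomes can still drive learning.
    """
    return reward + gamma * value_next - value_now

def td0_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
    """One TD(0) step on a table of state values (a simplified stand-in for an
    MLP value approximator)."""
    values[state] += alpha * td_error(reward, values[state], values[next_state], gamma)
    return values
```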

    Generic Online Learning for Partial Visible & Dynamic Environment with Delayed Feedback

    Reinforcement learning (RL) has been applied to robotics and many other domains in which a system must learn in real time while interacting with a dynamic environment. In most studies the state-action space, a key part of RL, is predefined. The integration of RL with deep learning methods has, however, taken a tremendous leap forward in solving novel, challenging problems such as mastering the board game of Go. The environment surrounding the agent may not be fully visible, the environment can change over time, and the feedback the agent receives for its actions can arrive with a fluctuating delay. In this paper, we propose a Generic Online Learning (GOL) system for such environments. GOL is based on RL with a hierarchical structure that forms abstract features over time and adapts toward optimal solutions. The proposed method has been applied to load balancing in 5G cloud random access networks. Simulation results show that GOL successfully achieves the system objectives of reducing cache misses and communication load, while incurring only limited system overhead in terms of the number of high-level patterns needed. We believe that the proposed GOL architecture is significant for future online learning in dynamic, partially visible environments and would be very useful for many autonomous control systems.
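    The bookkeeping needed when feedback arrives with a fluctuating delay can be sketched as follows. This is a generic illustration under assumed names (request ids, a discrete action set), not the GOL architecture itself.

```python
import random
from collections import defaultdict

class DelayedFeedbackLearner:
    """Online learning when rewards arrive late and possibly out of order.

    Each decision is logged under a request id; when the feedback for that id
    eventually arrives, the stored (state, action) pair is updated. A generic
    illustration, not the GOL architecture described in the paper.
    """

    def __init__(self, actions, alpha=0.1, epsilon=0.1):
        self.actions = actions
        self.alpha = alpha
        self.epsilon = epsilon
        self.Q = defaultdict(float)   # value estimates per (state, action)
        self.pending = {}             # request_id -> (state, action)

    def act(self, request_id, state):
        # Epsilon-greedy choice; remember the decision until its feedback arrives.
        if random.random() < self.epsilon:
            action = random.choice(self.actions)
        else:
            action = max(self.actions, key=lambda a: self.Q[(state, a)])
        self.pending[request_id] = (state, action)
        return action

    def feedback(self, request_id, reward):
        # Delayed reward: credit it to the decision that caused it.
        state, action = self.pending.pop(request_id)
        self.Q[(state, action)] += self.alpha * (reward - self.Q[(state, action)])
```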

    Reinforcement Learning For The Control Of Large-Scale Power Systems


    Machine learning and its applications in reliability analysis systems

    In this thesis, we are interested in exploring some aspects of Machine Learning (ML) and its application in Reliability Analysis systems (RAs). We begin by investigating some ML paradigms and their techniques, go on to discuss possible applications of ML in improving RA performance, and lastly give guidelines for the architecture of learning RAs. Our survey of ML covers both neural network learning and symbolic learning. In symbolic learning, five types of learning and their applications are discussed: rote learning, learning from instruction, learning from analogy, learning from examples, and learning from observation and discovery. The Reliability Analysis systems (RAs) presented in this thesis are mainly designed for maintaining plant safety, supported by two functions: a risk analysis function, i.e., failure mode and effect analysis (FMEA), and a diagnosis function, i.e., real-time fault location (RTFL). Three approaches to creating RAs have been discussed. According to the results of our survey, we suggest that currently the best design for RAs is to embed model-based RAs, i.e., MORA (as software), in a neural-network-based computer system (as hardware). However, there are still improvements that can be made through the application of Machine Learning. By implanting a 'learning element', MORA becomes the learning MORA (La MORA) system: a learning Reliability Analysis system with the power of automatic knowledge acquisition, inconsistency checking, and more. To conclude the thesis, we propose an architecture for La MORA.

    Rapid adaptation of video game AI


    Towards optimizing entertainment in computer games

    Mainly motivated by the current lack of a qualitative and quantitative formulation of entertainment in computer games and of procedures to generate it, this article covers the following issues: it presents the features, extracted primarily from opponent behavior, that make a predator/prey game appealing; it provides qualitative and quantitative means for measuring player entertainment in real time; and it introduces a successful methodology for obtaining games of high satisfaction. This methodology is based on online (during play) learning opponents who demonstrate cooperative action. By testing the game against humans, we confirm our hypothesis that the proposed entertainment measure is consistent with the judgment of human players. As far as learning in real time against human players is concerned, the results suggest that longer games are required for humans to notice a change in their entertainment.
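    The article's quantitative entertainment measure is not reproduced here; as a purely hypothetical illustration, a real-time estimate could combine normalized features of opponent behavior into a single score, for example:

```python
def entertainment_score(challenge, behavior_diversity, spatial_diversity,
                        weights=(1/3, 1/3, 1/3)):
    """Hypothetical real-time entertainment estimate for a predator/prey game.

    Each feature is assumed to be normalized to [0, 1] and extracted from the
    opponents' behavior; the feature names and equal weights are illustrative
    assumptions, not the measure defined in the article.
    """
    w1, w2, w3 = weights
    return w1 * challenge + w2 * behavior_diversity + w3 * spatial_diversity
```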

    Advances in Reinforcement Learning

    Reinforcement Learning (RL) is a very dynamic area in terms of theory and application. This book brings together many different aspects of current research on the several fields associated with RL, which has been growing rapidly, producing a wide variety of learning algorithms for different applications. Across 24 chapters, it covers a very broad variety of topics in RL and their application in autonomous systems. A set of chapters provides a general overview of RL, while other chapters focus mostly on applications of RL paradigms: Game Theory, Multi-Agent Theory, Robotics, Networking Technologies, Vehicular Navigation, Medicine, and Industrial Logistics.