3,373 research outputs found

    Quantum inspired algorithms for learning and control of stochastic systems

    Motivated by the limitations of current reinforcement learning and optimal control techniques, this dissertation proposes quantum-theory-inspired algorithms for learning and control of both single-agent and multi-agent stochastic systems. A common problem encountered in traditional reinforcement learning techniques is the exploration-exploitation trade-off. To address this issue, an action selection procedure inspired by a quantum search algorithm, Grover's iteration, is developed. This procedure does not require an explicit design parameter to specify the relative frequency of explorative/exploitative actions. The second part of this dissertation extends the powerful adaptive critic design methodology to solve finite-horizon stochastic optimal control problems. Numerically solving the stochastic Hamilton-Jacobi-Bellman equation, which characterizes the optimal expected cost function, requires a large number of trajectory samples. The proposed methodology overcomes this difficulty by using the path integral control formulation to adaptively sample trajectories of importance. The third part of this dissertation presents two quantum-inspired coordination models for dynamically assigning targets to agents operating in a stochastic environment. The first approach uses a quantum decision theory model that explains irrational action choices in human decision making. The second approach uses a quantum game theory model that exploits the quantum mechanical phenomenon of 'entanglement' to increase individual pay-off in multi-player games. The efficiency and scalability of the proposed coordination models are demonstrated through simulations of a large-scale multi-agent system. --Abstract, page iii
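
    To make the first idea concrete, below is a minimal, hypothetical sketch of a Grover-iteration-inspired action-selection rule: a classical amplitude vector over actions is repeatedly amplified towards the currently best-valued action, and the squared amplitudes are used as selection probabilities, so exploration and exploitation are balanced without an explicit epsilon-style parameter. The function name, the uniform initialisation, and the way the number of amplification steps is tied to the value estimate are illustrative assumptions, not the dissertation's exact procedure.

```python
import numpy as np

def grover_inspired_action_selection(q_values, max_iterations=3, rng=None):
    """Pick an action by amplifying the amplitude of the greedy action.

    Illustrative only: the mapping from value estimates to the number of
    amplification steps is an assumption, not the dissertation's scheme.
    """
    rng = np.random.default_rng() if rng is None else rng
    q = np.asarray(q_values, dtype=float)
    n = q.size
    amplitudes = np.full(n, 1.0 / np.sqrt(n))        # uniform "superposition"
    target = int(np.argmax(q))

    # Higher-valued greedy actions receive more amplification steps.
    spread = q.max() - q.min()
    weight = 0.0 if spread == 0.0 else (q[target] - q.min()) / spread
    iterations = int(round(weight * max_iterations))

    for _ in range(iterations):
        amplitudes[target] *= -1.0                            # "oracle": mark the target
        amplitudes = 2.0 * amplitudes.mean() - amplitudes     # inversion about the mean

    probs = amplitudes ** 2
    probs /= probs.sum()
    return int(rng.choice(n, p=probs))
```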

    Reinforcement Learning: A Survey

    This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning. Comment: See http://www.jair.org/ for any accompanying file
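
    As a concrete illustration of the trial-and-error loop, delayed reinforcement, and the exploration/exploitation trade-off the survey discusses, here is a minimal tabular Q-learning episode. The Gym-style `reset()`/`step()` environment interface, the epsilon-greedy rule, and all parameter values are assumptions made for the sketch, not something prescribed by the survey.

```python
import numpy as np

def q_learning_episode(env, Q, alpha=0.1, gamma=0.99, epsilon=0.1, rng=None):
    """Run one episode of tabular Q-learning (a minimal sketch).

    Assumptions: `env` exposes a Gym-style interface where `reset()` returns
    a discrete state and `step(action)` returns (next_state, reward, done);
    `Q` is an (n_states, n_actions) array.
    """
    rng = np.random.default_rng() if rng is None else rng
    state, done = env.reset(), False
    while not done:
        # Epsilon-greedy: one simple way to trade exploration for exploitation.
        if rng.random() < epsilon:
            action = int(rng.integers(Q.shape[1]))      # explore
        else:
            action = int(np.argmax(Q[state]))           # exploit
        next_state, reward, done = env.step(action)
        # Bootstrapped TD target propagates delayed reinforcement backwards.
        target = reward + (0.0 if done else gamma * float(np.max(Q[next_state])))
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state
    return Q
```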

    Deep Q-Learning for Nash Equilibria: Nash-DQN

    Model-free learning for multi-agent stochastic games is an active area of research. Existing reinforcement learning algorithms, however, are often restricted to zero-sum games, and are applicable only in small state-action spaces or other simplified settings. Here, we develop a new data-efficient Deep-Q-learning methodology for model-free learning of Nash equilibria for general-sum stochastic games. The algorithm uses a local linear-quadratic expansion of the stochastic game, which leads to analytically solvable optimal actions. The expansion is parametrized by deep neural networks to give it sufficient flexibility to learn the environment without the need to experience all state-action pairs. We study symmetry properties of the algorithm stemming from label-invariant stochastic games and, as a proof of concept, apply our algorithm to learning optimal trading strategies in competitive electronic markets. Comment: 16 pages, 4 figures
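
    To illustrate why a local linear-quadratic expansion makes the equilibrium analytically solvable, the sketch below writes each player's action-value as a quadratic in the joint action and solves the stacked first-order conditions as one linear system. The function signature, the two-player restriction, and the specific parametrisation are hypothetical; in Nash-DQN itself the expansion coefficients would be produced by deep networks.

```python
import numpy as np

def two_player_lq_nash(mu, g1, g2, P1, P2, d1, d2):
    """Local Nash actions of a two-player quadratic (general-sum) game.

    Hypothetical sketch: each player's action-value around a reference joint
    action `mu` (dimension d1 + d2) is assumed to take the form

        Q_i(a) ~ c_i + g_i . (a - mu) + 0.5 * (a - mu)^T P_i (a - mu).

    Setting player 1's gradient w.r.t. its own block a[:d1] and player 2's
    gradient w.r.t. a[d1:] to zero gives one linear system in (a - mu).
    """
    assert P1.shape == (d1 + d2, d1 + d2) and P2.shape == (d1 + d2, d1 + d2)
    # Stationarity rows: player 1 owns the first d1 rows, player 2 the rest.
    M = np.vstack([P1[:d1, :], P2[d1:, :]])
    b = -np.concatenate([g1[:d1], g2[d1:]])
    return mu + np.linalg.solve(M, b)

# Illustrative use with scalar actions and mild competition.
mu = np.zeros(2)
g1, g2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
P1 = np.array([[-2.0, 0.5], [0.5, 0.0]])
P2 = np.array([[0.0, 0.5], [0.5, -2.0]])
print(two_player_lq_nash(mu, g1, g2, P1, P2, 1, 1))   # -> approx [0.67, 0.67]
```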

    Advances in Reinforcement Learning

    Reinforcement Learning (RL) is a very dynamic area in terms of both theory and application. This book brings together many different aspects of current research in the several fields associated with RL, an area that has been growing rapidly and producing a wide variety of learning algorithms for different applications. Across 24 chapters, it covers a broad variety of topics in RL and their application in autonomous systems. A set of chapters provides a general overview of RL, while other chapters focus mostly on applications of RL paradigms: Game Theory, Multi-Agent Theory, Robotics, Networking Technologies, Vehicular Navigation, Medicine and Industrial Logistics.

    Putting artificial intelligence into wearable human-machine interfaces – towards a generic, self-improving controller

    The standard approach to creating a machine-learning-based controller is to provide users with a number of gestures that they need to make; record multiple instances of each gesture using specific sensors; extract the relevant sensor data and pass it through a supervised learning algorithm until the algorithm can successfully identify the gestures; and map each gesture to a control signal that performs a desired outcome. This approach is both inflexible and time-consuming. The primary contribution of this research was to investigate a new approach to putting artificial intelligence into wearable human-machine interfaces by creating a Generic, Self-Improving Controller. It was shown to learn two user-defined static gestures with an accuracy of 100% in fewer than 10 samples per gesture; three in fewer than 20 samples per gesture; and four in fewer than 35 samples per gesture. Pre-defined dynamic gestures were more difficult to learn. It learnt two with an accuracy of 90% in fewer than 6,000 samples per gesture, and four with an accuracy of 70% after 50,000 samples per gesture. The research has resulted in a number of additional contributions:
    • The creation of a source-independent hardware data capture, processing, fusion and storage tool for standardising the capture and storage of historical copies of data captured from multiple different sensors.
    • An improved Attitude and Heading Reference System (AHRS) algorithm for calculating orientation quaternions that is five orders of magnitude more precise.
    • The reformulation of the regularised TD learning algorithm; the reformulation of the TD learning algorithm applied to the artificial neural network back-propagation algorithm; and the combination of the two reformulations into a new, regularised TD learning algorithm applied to the artificial neural network back-propagation algorithm (see the sketch below).
    • The creation of a Generic, Self-Improving Predictor that can use different learning algorithms, and a Flexible Artificial Neural Network.
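
    The following is a minimal, hypothetical sketch of what a regularised TD(0) update combined with neural-network back-propagation might look like. The function names, the semi-gradient form, and the L2 penalty are assumptions made for illustration and are not claimed to match the thesis' exact reformulation.

```python
import numpy as np

def regularised_td_update(weights, features, next_features, reward,
                          value_fn, grad_fn, alpha=0.01, gamma=0.95, lam=1e-3):
    """One semi-gradient TD(0) step with an L2 penalty (illustrative only).

    `value_fn(weights, x)` is a neural-network value estimate and
    `grad_fn(weights, x)` returns its gradient with respect to `weights`
    (e.g. obtained by back-propagation). Names and the exact form of the
    regularisation are assumptions, not the thesis' reformulation.
    """
    td_error = (reward + gamma * value_fn(weights, next_features)
                - value_fn(weights, features))
    # Semi-gradient: only the current state's value estimate is differentiated;
    # the `lam * weights` term shrinks the weights each step (L2 regularisation).
    return weights + alpha * (td_error * grad_fn(weights, features) - lam * weights)

# Tiny usage example with a linear "network": v(w, x) = w . x
w = np.zeros(3)
v = lambda w, x: float(w @ x)
g = lambda w, x: x
w = regularised_td_update(w, np.array([1.0, 0.0, 1.0]),
                          np.array([0.0, 1.0, 0.0]), reward=1.0,
                          value_fn=v, grad_fn=g)
```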