4,663 research outputs found

    Toward Real-Time Decentralized Reinforcement Learning using Finite Support Basis Functions

    Full text link
    This paper addresses the design and implementation of complex Reinforcement Learning (RL) behaviors where multi-dimensional action spaces are involved, as well as the need to execute the behaviors in real-time using robotic platforms with limited computational resources and training times. For this purpose, we propose the use of decentralized RL, in combination with finite support basis functions as alternatives to Gaussian RBF, in order to alleviate the effects of the curse of dimensionality on the action and state spaces respectively, and to reduce the computation time. As testbed, a RL based controller for the in-walk kick in NAO robots, a challenging and critical problem for soccer robotics, is used. The reported experiments show empirically that our solution saves up to 99.94% of execution time and 98.82% of memory consumption during execution, without diminishing performance compared to classical approaches.Comment: Accepted in the RoboCup Symposium 2017. Final version will be published at Springe

    Information driven self-organization of complex robotic behaviors

    Get PDF
    Information theory is a powerful tool to express principles to drive autonomous systems because it is domain invariant and allows for an intuitive interpretation. This paper studies the use of the predictive information (PI), also called excess entropy or effective measure complexity, of the sensorimotor process as a driving force to generate behavior. We study nonlinear and nonstationary systems and introduce the time-local predicting information (TiPI) which allows us to derive exact results together with explicit update rules for the parameters of the controller in the dynamical systems framework. In this way the information principle, formulated at the level of behavior, is translated to the dynamics of the synapses. We underpin our results with a number of case studies with high-dimensional robotic systems. We show the spontaneous cooperativity in a complex physical system with decentralized control. Moreover, a jointly controlled humanoid robot develops a high behavioral variety depending on its physics and the environment it is dynamically embedded into. The behavior can be decomposed into a succession of low-dimensional modes that increasingly explore the behavior space. This is a promising way to avoid the curse of dimensionality which hinders learning systems to scale well.Comment: 29 pages, 12 figure

    Learning and Management for Internet-of-Things: Accounting for Adaptivity and Scalability

    Get PDF
    Internet-of-Things (IoT) envisions an intelligent infrastructure of networked smart devices offering task-specific monitoring and control services. The unique features of IoT include extreme heterogeneity, massive number of devices, and unpredictable dynamics partially due to human interaction. These call for foundational innovations in network design and management. Ideally, it should allow efficient adaptation to changing environments, and low-cost implementation scalable to massive number of devices, subject to stringent latency constraints. To this end, the overarching goal of this paper is to outline a unified framework for online learning and management policies in IoT through joint advances in communication, networking, learning, and optimization. From the network architecture vantage point, the unified framework leverages a promising fog architecture that enables smart devices to have proximity access to cloud functionalities at the network edge, along the cloud-to-things continuum. From the algorithmic perspective, key innovations target online approaches adaptive to different degrees of nonstationarity in IoT dynamics, and their scalable model-free implementation under limited feedback that motivates blind or bandit approaches. The proposed framework aspires to offer a stepping stone that leads to systematic designs and analysis of task-specific learning and management schemes for IoT, along with a host of new research directions to build on.Comment: Submitted on June 15 to Proceeding of IEEE Special Issue on Adaptive and Scalable Communication Network

    Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

    Full text link
    Future wireless networks have a substantial potential in terms of supporting a broad range of complex compelling applications both in military and civilian fields, where the users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims for assisting the readers in clarifying the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks.Comment: 46 pages, 22 fig

    Selectively decentralized reinforcement learning

    Get PDF
    Indiana University-Purdue University Indianapolis (IUPUI)The main contributions in this thesis include the selectively decentralized method in solving multi-agent reinforcement learning problems and the discretized Markov-decision-process (MDP) algorithm to compute the sub-optimal learning policy in completely unknown learning and control problems. These contributions tackle several challenges in multi-agent reinforcement learning: the unknown and dynamic nature of the learning environment, the difficulty in computing the closed-form solution of the learning problem, the slow learning performance in large-scale systems, and the questions of how/when/to whom the learning agents should communicate among themselves. Through this thesis, the selectively decentralized method, which evaluates all of the possible communicative strategies, not only increases the learning speed, achieves better learning goals but also could learn the communicative policy for each learning agent. Compared to the other state-of-the-art approaches, this thesis’s contributions offer two advantages. First, the selectively decentralized method could incorporate a wide range of well-known algorithms, including the discretized MDP, in single-agent reinforcement learning; meanwhile, the state-of-the-art approaches usually could be applied for one class of algorithms. Second, the discretized MDP algorithm could compute the sub-optimal learning policy when the environment is described in general nonlinear format; meanwhile, the other state-of-the-art approaches often assume that the environment is in limited format, particularly in feedback-linearization form. This thesis also discusses several alternative approaches for multi-agent learning, including Multidisciplinary Optimization. In addition, this thesis shows how the selectively decentralized method could successfully solve several real-worlds problems, particularly in mechanical and biological systems

    ToyArchitecture: Unsupervised Learning of Interpretable Models of the World

    Full text link
    Research in Artificial Intelligence (AI) has focused mostly on two extremes: either on small improvements in narrow AI domains, or on universal theoretical frameworks which are usually uncomputable, incompatible with theories of biological intelligence, or lack practical implementations. The goal of this work is to combine the main advantages of the two: to follow a big picture view, while providing a particular theory and its implementation. In contrast with purely theoretical approaches, the resulting architecture should be usable in realistic settings, but also form the core of a framework containing all the basic mechanisms, into which it should be easier to integrate additional required functionality. In this paper, we present a novel, purposely simple, and interpretable hierarchical architecture which combines multiple different mechanisms into one system: unsupervised learning of a model of the world, learning the influence of one's own actions on the world, model-based reinforcement learning, hierarchical planning and plan execution, and symbolic/sub-symbolic integration in general. The learned model is stored in the form of hierarchical representations with the following properties: 1) they are increasingly more abstract, but can retain details when needed, and 2) they are easy to manipulate in their local and symbolic-like form, thus also allowing one to observe the learning process at each level of abstraction. On all levels of the system, the representation of the data can be interpreted in both a symbolic and a sub-symbolic manner. This enables the architecture to learn efficiently using sub-symbolic methods and to employ symbolic inference.Comment: Revision: changed the pdftitl

    Reinforcement Learning

    Get PDF
    Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data or directly putting forward and performing actions. Learning is a very important aspect. This book is on reinforcement learning which involves performing actions to achieve a goal. The first 11 chapters of this book describe and extend the scope of reinforcement learning. The remaining 11 chapters show that there is already wide usage in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the tasks of human operators remain to specify goals on increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications and it shall stimulate and encourage new research in this field

    Inductive Approaches Based on Trial/Error Paradigm for Communications Network

    Get PDF
    corecore