
    Composable Deep Reinforcement Learning for Robotic Manipulation

    Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the interaction time with the environment is limited, as is the case for most real-world robotic tasks. In this paper, we study how maximum entropy policies trained using soft Q-learning can be applied to real-world robotic manipulation. The application of this method to real-world manipulation is facilitated by two important features of soft Q-learning. First, soft Q-learning can learn multimodal exploration strategies by learning policies represented by expressive energy-based models. Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies. This compositionality provides an especially valuable tool for real-world manipulation, where constructing new policies by composing existing skills can provide a large gain in efficiency over training from scratch. Our experimental evaluation demonstrates that soft Q-learning is substantially more sample efficient than prior model-free deep reinforcement learning methods, and that compositionality can be performed for both simulated and real-world tasks.
    Comment: Videos: https://sites.google.com/view/composing-real-world-policies
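    Since the abstract hinges on composing energy-based policies learned with soft Q-learning, a minimal sketch of that idea follows. It assumes a discrete-action toy setting and composes constituent Q-functions by simple averaging before forming the Boltzmann (energy-based) policy; the function names, temperature `alpha`, and problem sizes are illustrative assumptions, not the paper's implementation, which works with continuous actions and deep energy-based models.

```python
import numpy as np

def soft_q_policy(q_values, alpha=1.0):
    """Energy-based policy: pi(a|s) proportional to exp(Q(s,a) / alpha)."""
    logits = q_values / alpha
    logits -= logits.max(axis=-1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)

def compose_q(q_list):
    """Compose constituent soft Q-functions by averaging them, giving an
    approximate Q-function for the combined task (illustrative only)."""
    return np.mean(q_list, axis=0)

# Toy example: two skills over 3 states and 4 discrete actions.
rng = np.random.default_rng(0)
q_skill_1 = rng.normal(size=(3, 4))
q_skill_2 = rng.normal(size=(3, 4))

q_composed = compose_q([q_skill_1, q_skill_2])
pi_composed = soft_q_policy(q_composed, alpha=0.5)
print(pi_composed)   # each row sums to 1: a multimodal Boltzmann policy
```

    The key point of the sketch is that no retraining is needed to obtain the composed behaviour: the new policy is read off directly from the averaged Q-functions of the existing skills.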

    Centralised rehearsal of decentralised cooperation: Multi-agent reinforcement learning for the scalable coordination of residential energy flexibility

    This paper investigates how deep multi-agent reinforcement learning can enable the scalable and privacy-preserving coordination of residential energy flexibility. The coordination of distributed resources such as electric vehicles and heating will be critical to the successful integration of large shares of renewable energy into our electricity grid and, thus, to mitigating climate change. The pre-learning of individual reinforcement learning policies can enable distributed control with no sharing of personal data required during execution. However, previous approaches to multi-agent reinforcement learning-based coordination of distributed energy resources impose an ever-greater computational training burden as the size of the system increases. We therefore adopt a deep multi-agent actor-critic method which uses a \emph{centralised but factored critic} to rehearse coordination ahead of execution. Results show that coordination is achieved at scale, with minimal information and communication infrastructure requirements, no interference with daily activities, and privacy protection. Significant savings are obtained for energy users, the distribution network and greenhouse gas emissions. Moreover, for 30 homes, training times are nearly 40 times shorter than with a previous state-of-the-art reinforcement learning approach that lacks the factored critic.
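    As an illustration of the "centralised but factored critic" idea, the sketch below factors the joint action value additively across homes, in the spirit of value-decomposition methods, so the critic's cost grows roughly linearly with the number of agents. The paper's exact factorisation, observation/action dimensions, and training loop are not given in the abstract, so the class name, layer sizes, and 30-home usage here are assumptions.

```python
import torch
import torch.nn as nn

class FactoredCritic(nn.Module):
    """Centralised critic whose joint value is a sum of per-agent terms
    (an additive factorisation; assumed, not the paper's exact design)."""
    def __init__(self, n_agents, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.per_agent = nn.ModuleList([
            nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, 1),
            )
            for _ in range(n_agents)
        ])

    def forward(self, obs, acts):
        # obs: (batch, n_agents, obs_dim), acts: (batch, n_agents, act_dim)
        contributions = [
            net(torch.cat([obs[:, i], acts[:, i]], dim=-1))
            for i, net in enumerate(self.per_agent)
        ]
        return torch.stack(contributions, dim=1).sum(dim=1)  # (batch, 1)

# Hypothetical usage: 30 homes, each with a local observation and a
# flexibility action (e.g. EV charging and heating set-points).
critic = FactoredCritic(n_agents=30, obs_dim=8, act_dim=2)
q_joint = critic(torch.randn(16, 30, 8), torch.randn(16, 30, 2))
```

    During execution, only the decentralised actors are needed; the centralised critic is used solely for the offline "rehearsal" phase, which is consistent with the abstract's claim that no personal data need be shared at run time.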

    Base Station Power Optimization for Green Networks Using Reinforcement Learning

    Next-generation mobile networks have to provide high data rates, extremely low latency, and support for high connection density. To meet these requirements, the number of base stations will have to increase, and this increase will lead to an energy consumption problem. Therefore, "green" approaches to network operation will gain importance. Reducing the energy consumption of base stations is essential for going green, and it also helps service providers reduce operational expenses. However, achieving energy savings without degrading the quality of service is a major challenge. To address this issue, we propose a machine-learning-based intelligent solution that also incorporates a network simulator. We develop a reinforcement learning model using the deep deterministic policy gradient algorithm. Our model frequently updates the policy of the network switches so that packets are forwarded to base stations operating at an optimized power level. The policies adopted by the network controller are evaluated with a network simulator to ensure a balance between energy consumption reduction and quality of service. The reinforcement learning model constantly learns and adapts to changing conditions in the dynamic network environment, yielding a more robust and realistic set of intelligent network management policies. Our results demonstrate that energy efficiency can be enhanced by 32% and 67% in dense and sparse scenarios, respectively.
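    The abstract describes a deep deterministic policy gradient (DDPG) controller that selects base-station power levels and is evaluated in a network simulator. The sketch below shows only the actor side of such a setup, with Gaussian exploration noise; the state and action dimensions, network sizes, and the [0, 1] power scaling are hypothetical placeholders rather than the authors' configuration.

```python
import torch
import torch.nn as nn

class PowerActor(nn.Module):
    """DDPG-style deterministic actor mapping a network-state observation
    to per-base-station transmit power levels in [0, 1] (assumed scaling)."""
    def __init__(self, state_dim, n_base_stations, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_base_stations), nn.Sigmoid(),
        )

    def forward(self, state):
        return self.net(state)

def select_action(actor, state, noise_std=0.1):
    """Deterministic action plus Gaussian exploration noise, clipped to [0, 1]."""
    with torch.no_grad():
        action = actor(state)
    action = action + noise_std * torch.randn_like(action)
    return action.clamp(0.0, 1.0)

# Hypothetical example: a 24-dimensional traffic/QoS state and 5 base stations.
actor = PowerActor(state_dim=24, n_base_stations=5)
power_levels = select_action(actor, torch.randn(1, 24))
# power_levels would be applied in the network simulator, and the resulting
# energy/QoS reward would drive the usual DDPG actor-critic updates.
```

    In a full DDPG loop, a critic network and target networks would also be trained from simulator transitions; they are omitted here to keep the sketch short.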