Search CORE

11,951 research outputs found

Recommended from our members

Towards Informed Exploration for Deep Reinforcement Learning

Author: Tang Haoran
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

In this thesis, we discuss various techniques for improving exploration for deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental v.s. exploitation trade-off. Then we review how deep RL has improved upon classical and summarize six categories of the latest exploration methods for deep RL, in the order increasing usage of prior information. We then explore representative works in three categories discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based via hashing, maps states to hash codes for counting and assigns higher exploration to less-encountered states. The third category utilizes hierarchy and is represented by modular architecture for RL agents to play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potentially impact

eScholarship - University of California

Modeling Profit of Sliced 5G Networks for Advanced Network Resource Management and Slice Implementation

Author: Han Bin
Schotten Hans D.
Tayade Shreya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/09/2017
Field of study

The core innovation in future 5G cellular networksnetwork slicing, aims at providing a flexible and efficient framework of network organization and resource management. The revolutionary network architecture based on slices, makes most of the current network cost models obsolete, as they estimate the expenditures in a static manner. In this paper, a novel methodology is proposed, in which a value chain in sliced networks is presented. Based on the proposed value chain, the profits generated by different slices are analyzed, and the task of network resource management is modeled as a multiobjective optimization problem. Setting strong assumptions, this optimization problem is analyzed starting from a simple ideal scenario. By removing the assumptions step-by-step, realistic but complex use cases are approached. Through this progressive analysis, technical challenges in slice implementation and network optimization are investigated under different scenarios. For each challenge, some potentially available solutions are suggested, and likely applications are also discussed

arXiv.org e-Print Archive

Deep Reinforcement Learning from Self-Play in Imperfect-Information Games

Author: Heinrich Johannes
Silver David
Publication venue
Publication date: 03/03/2016
Field of study

Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Holdem, a poker game of real-world scale, NFSP learnt a strategy that approached the performance of state-of-the-art, superhuman algorithms based on significant domain expertise.Comment: updated version, incorporating conference feedbac

arXiv.org e-Print Archive

Ms Pac-Man versus Ghost Team CEC 2011 competition

Author: Lucas Simon M
Rohlfshagen Philipp
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/07/2011
Field of study

Games provide an ideal test bed for computational intelligence and significant progress has been made in recent years, most notably in games such as Go, where the level of play is now competitive with expert human play on smaller boards. Recently, a significantly more complex class of games has received increasing attention: real-time video games. These games pose many new challenges, including strict time constraints, simultaneous moves and open-endedness. Unlike in traditional board games, computational play is generally unable to compete with human players. One driving force in improving the overall performance of artificial intelligence players are game competitions where practitioners may evaluate and compare their methods against those submitted by others and possibly human players as well. In this paper we introduce a new competition based on the popular arcade video game Ms Pac-Man: Ms Pac-Man versus Ghost Team. The competition, to be held at the Congress on Evolutionary Computation 2011 for the first time, allows participants to develop controllers for either the Ms Pac-Man agent or for the Ghost Team and unlike previous Ms Pac-Man competitions that relied on screen capture, the players now interface directly with the game engine. In this paper we introduce the competition, including a review of previous work as well as a discussion of several aspects regarding the setting up of the game competition itself. © 2011 IEEE