Search CORE

20,508 research outputs found

Recommended from our members

Towards Informed Exploration for Deep Reinforcement Learning

Author: Tang Haoran
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

In this thesis, we discuss various techniques for improving exploration for deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental v.s. exploitation trade-off. Then we review how deep RL has improved upon classical and summarize six categories of the latest exploration methods for deep RL, in the order increasing usage of prior information. We then explore representative works in three categories discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based via hashing, maps states to hash codes for counting and assigns higher exploration to less-encountered states. The third category utilizes hierarchy and is represented by modular architecture for RL agents to play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potentially impact

eScholarship - University of California

Realistic Traffic Generation for Web Robots

Author: Brown Kyle
Doran Derek
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/12/2017
Field of study

Critical to evaluating the capacity, scalability, and availability of web systems are realistic web traffic generators. Web traffic generation is a classic research problem, no generator accounts for the characteristics of web robots or crawlers that are now the dominant source of traffic to a web server. Administrators are thus unable to test, stress, and evaluate how their systems perform in the face of ever increasing levels of web robot traffic. To resolve this problem, this paper introduces a novel approach to generate synthetic web robot traffic with high fidelity. It generates traffic that accounts for both the temporal and behavioral qualities of robot traffic by statistical and Bayesian models that are fitted to the properties of robot traffic seen in web logs from North America and Europe. We evaluate our traffic generator by comparing the characteristics of generated traffic to those of the original data. We look at session arrival rates, inter-arrival times and session lengths, comparing and contrasting them between generated and real traffic. Finally, we show that our generated traffic affects cache performance similarly to actual traffic, using the common LRU and LFU eviction policies.Comment: 8 page

arXiv.org e-Print Archive

Crossref

Sampling-based Motion Planning for Active Multirotor System Identification

Author: Burri Michael
Bähnemann Rik
Galceran Enric
Nieto Juan
Siegwart Roland
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/02/2017
Field of study

This paper reports on an algorithm for planning trajectories that allow a multirotor micro aerial vehicle (MAV) to quickly identify a set of unknown parameters. In many problems like self calibration or model parameter identification some states are only observable under a specific motion. These motions are often hard to find, especially for inexperienced users. Therefore, we consider system model identification in an active setting, where the vehicle autonomously decides what actions to take in order to quickly identify the model. Our algorithm approximates the belief dynamics of the system around a candidate trajectory using an extended Kalman filter (EKF). It uses sampling-based motion planning to explore the space of possible beliefs and find a maximally informative trajectory within a user-defined budget. We validate our method in simulation and on a real system showing the feasibility and repeatability of the proposed approach. Our planner creates trajectories which reduce model parameter convergence time and uncertainty by a factor of four.Comment: Published at ICRA 2017. Video available at https://www.youtube.com/watch?v=xtqrWbgep5

arXiv.org e-Print Archive

Crossref