Towards Informed Exploration for Deep Reinforcement Learning
In this thesis, we discuss various techniques for improving exploration for deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental exploration vs. exploitation trade-off. Then we review how deep RL has improved upon classical RL and summarize six categories of the latest exploration methods for deep RL, in order of increasing usage of prior information. We then explore representative works in three categories and discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based exploration via hashing, maps states to hash codes for counting and assigns higher exploration bonuses to less-encountered states. The third category utilizes hierarchy and is represented by a modular architecture for RL agents to play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potentially impact
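The count-based idea in the second category can be illustrated with a minimal sketch. It assumes a SimHash-style random projection and a bonus of the form beta / sqrt(count); the class name `HashCounter`, the parameters `code_bits` and `beta`, and the exact bonus formula are illustrative assumptions, not details taken from the thesis.

```python
import math
import random
from collections import Counter


class HashCounter:
    """Hypothetical helper: counts visits per hash code and returns a bonus."""

    def __init__(self, state_dim, code_bits=8, beta=1.0, seed=0):
        rng = random.Random(seed)
        # Random Gaussian projection; the sign of each row's dot product
        # with the state gives one bit of the hash code (assumed scheme).
        self.projection = [
            [rng.gauss(0.0, 1.0) for _ in range(state_dim)]
            for _ in range(code_bits)
        ]
        self.beta = beta
        self.counts = Counter()

    def hash_code(self, state):
        # Map a continuous state to a short binary code.
        return tuple(
            1 if sum(w * s for w, s in zip(row, state)) >= 0 else 0
            for row in self.projection
        )

    def bonus(self, state):
        # Record the visit, then return a bonus that shrinks as the
        # code's visit count grows, so rarely seen states earn more.
        code = self.hash_code(state)
        self.counts[code] += 1
        return self.beta / math.sqrt(self.counts[code])


counter = HashCounter(state_dim=3)
first = counter.bonus([1.0, 2.0, 3.0])
second = counter.bonus([1.0, 2.0, 3.0])
```

In this sketch the bonus would be added to the environment reward during training; states that share a hash code share a count, which is what makes counting feasible in large state spaces.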
Realistic Traffic Generation for Web Robots
Critical to evaluating the capacity, scalability, and availability of web
systems are realistic web traffic generators. Although web traffic generation is a
classic research problem, no generator accounts for the characteristics of web
robots or crawlers that are now the dominant source of traffic to a web server.
Administrators are thus unable to test, stress, and evaluate how their systems
perform in the face of ever increasing levels of web robot traffic. To resolve
this problem, this paper introduces a novel approach to generate synthetic web
robot traffic with high fidelity. It generates traffic that accounts for both
the temporal and behavioral qualities of robot traffic by statistical and
Bayesian models that are fitted to the properties of robot traffic seen in web
logs from North America and Europe. We evaluate our traffic generator by
comparing the characteristics of generated traffic to those of the original
data. We look at session arrival rates, inter-arrival times and session
lengths, comparing and contrasting them between generated and real traffic.
Finally, we show that our generated traffic affects cache performance similarly
to actual traffic, using the common LRU and LFU eviction policies.
Comment: 8 page
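The cache comparison mentioned above can be sketched as a hit-ratio calculation over a request trace. This is a minimal illustration, not the paper's evaluation code: the function names, the tie-breaking rule in the LFU variant, and the sample trace are all assumptions.

```python
from collections import OrderedDict


def lru_hit_ratio(requests, capacity):
    """Hit ratio of a least-recently-used cache over a request trace."""
    if not requests:
        return 0.0
    cache = OrderedDict()
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[key] = True
    return hits / len(requests)


def lfu_hit_ratio(requests, capacity):
    """Hit ratio of a least-frequently-used cache over a request trace."""
    if not requests:
        return 0.0
    freq = {}  # cached key -> access count since insertion
    hits = 0
    for key in requests:
        if key in freq:
            hits += 1
            freq[key] += 1
        else:
            if len(freq) >= capacity:
                # Evict the lowest count; ties go to the oldest entry
                # (an assumed tie-breaking rule).
                victim = min(freq, key=freq.get)
                del freq[victim]
            freq[key] = 1
    return hits / len(requests)


trace = ["a", "a", "b", "c", "a"]  # hypothetical request trace
lru = lru_hit_ratio(trace, capacity=2)
lfu = lfu_hit_ratio(trace, capacity=2)
```

Running both functions on a real trace and on a generated trace, then comparing the resulting ratios, is one way to check whether synthetic traffic stresses a cache the way real traffic does.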
Sampling-based Motion Planning for Active Multirotor System Identification
This paper reports on an algorithm for planning trajectories that allow a
multirotor micro aerial vehicle (MAV) to quickly identify a set of unknown
parameters. In many problems, such as self-calibration or model parameter
identification, some states are only observable under a specific motion. These
motions are often hard to find, especially for inexperienced users. Therefore,
we consider system model identification in an active setting, where the vehicle
autonomously decides what actions to take in order to quickly identify the
model. Our algorithm approximates the belief dynamics of the system around a
candidate trajectory using an extended Kalman filter (EKF). It uses
sampling-based motion planning to explore the space of possible beliefs and
find a maximally informative trajectory within a user-defined budget. We
validate our method in simulation and on a real system, showing the feasibility
and repeatability of the proposed approach. Our planner creates trajectories
which reduce model parameter convergence time and uncertainty by a factor of
four.
Comment: Published at ICRA 2017. Video available at
https://www.youtube.com/watch?v=xtqrWbgep5
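The idea of sampling candidate trajectories and keeping the most informative one within a budget can be shown on a toy problem. The sketch below is not the paper's planner: it assumes a single scalar parameter with the linear measurement model z = u * theta + noise (so the EKF covariance update reduces to the ordinary Kalman update), uses plain random sampling instead of a motion-planning tree, and measures budget as the sum of absolute inputs. All function and parameter names are hypothetical.

```python
import random


def posterior_variance(prior_var, inputs, noise_var):
    """Propagate the parameter variance through one update per input."""
    var = prior_var
    for u in inputs:
        # Measurement Jacobian is H = u for the assumed model z = u * theta.
        gain = var * u / (u * var * u + noise_var)
        var = (1.0 - gain * u) * var
    return var


def plan_informative_inputs(prior_var, noise_var, horizon, budget,
                            samples, seed=0):
    """Return the sampled input sequence with the lowest final variance."""
    rng = random.Random(seed)
    best_inputs, best_var = None, float("inf")
    for _ in range(samples):
        candidate = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        if sum(abs(u) for u in candidate) > budget:
            continue  # candidate exceeds the user-defined budget
        var = posterior_variance(prior_var, candidate, noise_var)
        if var < best_var:
            best_inputs, best_var = candidate, var
    return best_inputs, best_var


inputs, final_var = plan_informative_inputs(
    prior_var=1.0, noise_var=1.0, horizon=3, budget=3.0, samples=200
)
```

The key point the sketch shares with the abstract is that the variance update depends only on the planned inputs, not on the measurements themselves, so candidate trajectories can be ranked by expected uncertainty before any of them is flown.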