12,251 research outputs found
Driving with Style: Inverse Reinforcement Learning in General-Purpose Planning for Automated Driving
Behavior and motion planning play an important role in automated driving.
Traditionally, behavior planners instruct local motion planners with predefined
behaviors. Due to the high scene complexity in urban environments,
unpredictable situations may occur in which behavior planners fail to match
predefined behavior templates. Recently, general-purpose planners have been
introduced, combining behavior and local motion planning. These general-purpose
planners allow behavior-aware motion planning given a single reward function.
However, two challenges arise: First, this function has to map a complex
feature space into rewards. Second, the reward function has to be manually
tuned by an expert. Manually tuning this reward function becomes a tedious
task. In this paper, we propose an approach that relies on human driving
demonstrations to automatically tune reward functions. This study offers
important insights into the driving style optimization of general-purpose
planners with maximum entropy inverse reinforcement learning. We evaluate our
approach based on the expected value difference between learned and
demonstrated policies. Furthermore, we compare the similarity of human driven
trajectories with optimal policies of our planner under learned and
expert-tuned reward functions. Our experiments show that we are able to learn
reward functions exceeding the level of manual expert tuning without prior
domain knowledge.Comment: Appeared at IROS 2019. Accepted version. Added/updated footnote,
minor correction in preliminarie
Learning to infer: RL-based search for DNN primitive selection on Heterogeneous Embedded Systems
Deep Learning is increasingly being adopted by industry for computer vision
applications running on embedded devices. While Convolutional Neural Networks'
accuracy has achieved a mature and remarkable state, inference latency and
throughput are a major concern especially when targeting low-cost and low-power
embedded platforms. CNNs' inference latency may become a bottleneck for Deep
Learning adoption by industry, as it is a crucial specification for many
real-time processes. Furthermore, deployment of CNNs across heterogeneous
platforms presents major compatibility issues due to vendor-specific technology
and acceleration libraries. In this work, we present QS-DNN, a fully automatic
search based on Reinforcement Learning which, combined with an inference engine
optimizer, efficiently explores through the design space and empirically finds
the optimal combinations of libraries and primitives to speed up the inference
of CNNs on heterogeneous embedded devices. We show that, an optimized
combination can achieve 45x speedup in inference latency on CPU compared to a
dependency-free baseline and 2x on average on GPGPU compared to the best vendor
library. Further, we demonstrate that, the quality of results and time
"to-solution" is much better than with Random Search and achieves up to 15x
better results for a short-time search
A Policy Search Method For Temporal Logic Specified Reinforcement Learning Tasks
Reward engineering is an important aspect of reinforcement learning. Whether
or not the user's intentions can be correctly encapsulated in the reward
function can significantly impact the learning outcome. Current methods rely on
manually crafted reward functions that often require parameter tuning to obtain
the desired behavior. This operation can be expensive when exploration requires
systems to interact with the physical world. In this paper, we explore the use
of temporal logic (TL) to specify tasks in reinforcement learning. TL formula
can be translated to a real-valued function that measures its level of
satisfaction against a trajectory. We take advantage of this function and
propose temporal logic policy search (TLPS), a model-free learning technique
that finds a policy that satisfies the TL specification. A set of simulated
experiments are conducted to evaluate the proposed approach
Recommended from our members
Towards Informed Exploration for Deep Reinforcement Learning
In this thesis, we discuss various techniques for improving exploration for deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental v.s. exploitation trade-off. Then we review how deep RL has improved upon classical and summarize six categories of the latest exploration methods for deep RL, in the order increasing usage of prior information. We then explore representative works in three categories discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based via hashing, maps states to hash codes for counting and assigns higher exploration to less-encountered states. The third category utilizes hierarchy and is represented by modular architecture for RL agents to play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potentially impact
- …