Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Within the context of autonomous driving, a model-based reinforcement learning
algorithm is proposed for the design of neural network-parameterized
controllers. Classical model-based control methods, which include sampling- and
lattice-based algorithms and model predictive control, suffer from the
trade-off between model complexity and computational burden required for the
online solution of expensive optimization or search problems at every short
sampling time. To circumvent this trade-off, a 2-step procedure is motivated:
first learning of a controller during offline training based on an arbitrarily
complicated mathematical system model, before online fast feedforward
evaluation of the trained controller. The contribution of this paper is the
proposition of a simple gradient-free and model-based algorithm for deep
reinforcement learning using task separation with hill climbing (TSHC). In
particular, (i) simultaneous training on separate deterministic tasks with the
purpose of encoding many motion primitives in a neural network, and (ii) the
employment of maximally sparse rewards in combination with virtual velocity
constraints (VVCs) in setpoint proximity are advocated.
Comment: 10 pages, 6 figures, 1 table
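As a rough illustration of gradient-free hill climbing over several separate deterministic tasks with sparse rewards, the sketch below perturbs the parameters of a tiny controller and keeps a candidate only if its summed reward over all tasks does not decrease. This is not the paper's TSHC algorithm: the 1-D point-mass model, the two-parameter controller, the task list, and all names (`TASKS`, `sparse_reward`, `hill_climb`) are assumptions made for the example, and virtual velocity constraints are not modeled.

```python
import random

# Hypothetical toy tasks: (start position, target setpoint) pairs on a line.
TASKS = [(0.0, 1.0), (0.0, -1.0), (2.0, 0.5)]


def sparse_reward(params, start, target, steps=50, tol=0.05):
    """Roll out a saturated proportional controller on an assumed 1-D model.

    Returns 1.0 only if the final position is within `tol` of the target,
    otherwise 0.0 (a maximally sparse reward).
    """
    gain, bias = params
    x = start
    for _ in range(steps):
        u = max(-1.0, min(1.0, gain * (target - x) + bias))  # clip control to [-1, 1]
        x += 0.1 * u  # simple integrator dynamics with step size 0.1
    return 1.0 if abs(x - target) <= tol else 0.0


def total_reward(params):
    """Sum the sparse rewards over all tasks, evaluated with the same parameters."""
    return sum(sparse_reward(params, s, t) for s, t in TASKS)


def hill_climb(iterations=200, sigma=0.5, seed=0):
    """Gradient-free search: perturb the best parameters, keep non-worse candidates."""
    rng = random.Random(seed)
    best = (0.0, 0.0)
    best_score = total_reward(best)
    for _ in range(iterations):
        candidate = tuple(p + rng.gauss(0.0, sigma) for p in best)
        score = total_reward(candidate)
        if score >= best_score:  # accept ties so the search can move across plateaus
            best, best_score = candidate, score
    return best, best_score


best_params, best_score = hill_climb()
```

Because candidates are accepted only when the summed reward does not drop, the returned score can never be lower than that of the initial parameters. Whether every task is solved depends on the seed and the perturbation size.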
Forcing neurocontrollers to exploit sensory symmetry through hard-wired modularity in the game of Cellz
Several attempts have been made in the past to construct encoding schemes that allow modularity to emerge in evolving systems, but success is limited. We believe that in order to create successful and scalable encodings for emerging modularity, we first need to explore the benefits of different types of modularity by hard-wiring these into evolvable systems. In this paper we explore different ways of exploiting sensory symmetry inherent in the agent in the simple game Cellz by evolving symmetrically identical modules. It is concluded that significant increases in both speed of evolution and final fitness can be achieved relative to monolithic controllers. Furthermore, we show that a simple function approximation task that exhibits sensory symmetry can be used as a quick approximate measure of the utility of an encoding scheme for the more complex game-playing task.
Network Design with Coverage Costs
We study network design with a cost structure motivated by redundancy in data
traffic. We are given a graph, g groups of terminals, and a universe of data
packets. Each group of terminals desires a subset of the packets from its
respective source. The cost of routing traffic on any edge in the network is
proportional to the total size of the distinct packets that the edge carries.
Our goal is to find a minimum cost routing. We focus on two settings. In the
first, the collection of packet sets desired by source-sink pairs is laminar.
For this setting, we present a primal-dual based 2-approximation, improving
upon a logarithmic approximation due to Barman and Chawla (2012). In the second
setting, packet sets can have non-trivial intersection. We focus on the case
where each packet is desired by either a single terminal group or by all of the
groups, and the graph is unweighted. For this setting we present an O(log
g)-approximation.
Our approximation for the second setting is based on a novel spanner-type
construction in unweighted graphs that, given a collection of g vertex subsets,
finds a subgraph of cost only a constant factor more than the minimum spanning
tree of the graph, such that every subset in the collection has a Steiner tree
in the subgraph of cost at most O(log g) that of its minimum Steiner tree in
the original graph. We call such a subgraph a group spanner.
Comment: Updated version with additional result
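The coverage cost structure described in this abstract can be made concrete with a small sketch: each edge pays for the total size of the distinct packets it carries, so a packet shared by two groups is charged once on an edge both routes use. The function name `coverage_cost`, the route representation, and the unit edge weights are assumptions for illustration. The sketch only evaluates the cost of a given routing and does not implement either approximation algorithm.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Edge = Tuple[str, str]
Route = Tuple[List[Edge], Set[str]]  # (edges used, packets sent along them)


def coverage_cost(routes: Iterable[Route], packet_size: Dict[str, float]) -> float:
    """Cost of a routing in an unweighted, undirected graph.

    Each edge is charged the total size of the distinct packets it carries.
    """
    carried: Dict[FrozenSet[str], Set[str]] = {}
    for edges, packets in routes:
        for u, v in edges:
            # frozenset makes (u, v) and (v, u) the same undirected edge
            carried.setdefault(frozenset((u, v)), set()).update(packets)
    return sum(
        sum(packet_size[p] for p in packets_on_edge)
        for packets_on_edge in carried.values()
    )


# Two terminal groups share the edge (s, a); packet p2 is wanted by both.
sizes = {"p1": 1.0, "p2": 1.0, "p3": 1.0}
routes = [
    ([("s", "a"), ("a", "t1")], {"p1", "p2"}),
    ([("s", "a"), ("a", "t2")], {"p2", "p3"}),
]
cost = coverage_cost(routes, sizes)  # shared edge carries 3 distinct packets; total is 7.0
```

Charging each route separately would give 8.0 here. The saving of 1.0 comes from counting p2 only once on the shared edge.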
From Data Topology to a Modular Classifier
This article describes an approach to designing a distributed and modular
neural classifier. This approach introduces a new hierarchical clustering that
enables one to determine reliable regions in the representation space by
exploiting supervised information. A multilayer perceptron is then associated
with each of these detected clusters and charged with recognizing elements of
the associated cluster while rejecting all others. The obtained global
classifier is comprised of a set of cooperating neural networks and completed
by a K-nearest neighbor classifier charged with treating elements rejected by
all the neural networks. Experimental results for the handwritten digit
recognition problem and comparison with neural and statistical nonmodular
classifiers are given.
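The accept-or-reject structure with a K-nearest neighbor fallback can be sketched as follows. The per-cluster multilayer perceptrons are replaced here by simple distance-threshold modules, which is a stand-in chosen for brevity and not the article's method. The names `make_region_module`, `modular_predict`, and `knn_predict`, as well as the first-accepting-module rule, are assumptions.

```python
import math
from collections import Counter


def knn_predict(train, x, k=3):
    """Plain K-nearest neighbor vote over (point, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def make_region_module(center, radius, label):
    """Stand-in for a per-cluster network: accept inside the region, else reject."""
    def module(x):
        return label if math.dist(center, x) <= radius else None  # None means "reject"
    return module


def modular_predict(modules, train, x, k=3):
    """Return the first accepting module's label; fall back to KNN if all reject."""
    for module in modules:
        label = module(x)
        if label is not None:
            return label
    return knn_predict(train, x, k)


train = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
modules = [make_region_module((0, 0), 1.5, "a")]

inside = modular_predict(modules, train, (0.5, 0.5))    # handled by the module
rejected = modular_predict(modules, train, (5.2, 5.1))  # handled by the KNN fallback
```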