Reinforcement Learning for Racecar Control
This thesis investigates the use of reinforcement learning to learn to drive a racecar in the simulated environment of the Robot Automobile Racing Simulator. Real-life race driving is known to be difficult for humans, and expert human drivers use complex sequences of actions. There are a large number of variables, some of which change stochastically and all of which may affect the outcome. This makes driving a promising domain for testing and developing Machine Learning techniques that have the potential to be robust enough to work in the real world. Therefore the principles of the algorithms from this work may be applicable to a range of problems.
The investigation starts by finding a suitable data structure to represent the information learnt. This is tested using supervised learning. Reinforcement learning is added and roughly tuned, and the supervised learning is then removed. A simple tabular representation is found satisfactory; it avoids difficulties with more complex methods and allows the investigation to concentrate on the essentials of learning. Various reward sources are tested and a combination of three is found to produce the best performance. Exploration of the problem space is investigated. Results show that exploration is essential, but that controlling how much is done is also important. The learning episodes turn out to need to be very long, so the task has to be treated as continuous, with discounting used to limit the size of the variables stored. Eligibility traces are used with success to make the learning more efficient. The tabular representation is made more compact by hashing and more accurate by using smaller buckets. This slows the learning but produces better driving. The improvement given by a rough form of generalisation indicates that replacing the tabular method with a function approximator is warranted. These results show reinforcement learning can work within the Robot Automobile Racing Simulator, and lay the foundations for building a more efficient and competitive agent.
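The ingredients named in this abstract (a hashed tabular representation with buckets, discounting, exploration, and eligibility traces) can be illustrated with a minimal Sarsa(lambda) sketch. This is not the thesis's implementation: the class name `HashedSarsaLambda`, the parameter values, the epsilon-greedy exploration rule, and the use of replacing traces are all assumptions chosen for illustration.

```python
import random


class HashedSarsaLambda:
    """Illustrative tabular Sarsa(lambda) with a hashed, bucketed state table."""

    def __init__(self, n_actions, bucket_size=0.5, table_size=4096,
                 alpha=0.1, gamma=0.95, lam=0.8, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.bucket_size = bucket_size    # smaller buckets: finer, slower to learn
        self.table_size = table_size      # hashing keeps the table compact
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.epsilon = epsilon            # controls how much exploration is done
        self.rng = random.Random(seed)
        self.q = [[0.0] * n_actions for _ in range(table_size)]
        self.trace = {}                   # (row, action) -> eligibility

    def index(self, state):
        # Discretise each continuous variable into a bucket, then hash the tuple.
        buckets = tuple(int(x // self.bucket_size) for x in state)
        return hash(buckets) % self.table_size

    def act(self, state):
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[self.index(state)]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, state, action, reward, next_state, next_action):
        s, s2 = self.index(state), self.index(next_state)
        # Discounted one-step temporal-difference error.
        delta = reward + self.gamma * self.q[s2][next_action] - self.q[s][action]
        self.trace[(s, action)] = 1.0     # replacing trace for the visited pair
        for (i, a), e in list(self.trace.items()):
            self.q[i][a] += self.alpha * delta * e
            e *= self.gamma * self.lam    # decay eligibility
            if e < 1e-4:
                del self.trace[(i, a)]    # drop negligible traces
            else:
                self.trace[(i, a)] = e
```

A driver loop would call `act` to choose an action, step the simulator, and then call `update` with the observed reward and the next state-action pair. Hash collisions make distinct buckets share a table row, which is the price of the compact table.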
The path inference filter: model-based low-latency map matching of probe vehicle data
We consider the problem of reconstructing vehicle trajectories from sparse
sequences of GPS points, for which the sampling interval is between 10 seconds
and 2 minutes. We introduce a new class of algorithms, collectively called the path
inference filter (PIF), that map GPS data in real time, for a variety of
trade-offs and scenarios, and with a high throughput. Numerous prior approaches
in map-matching can be shown to be special cases of the path inference filter
presented in this article. We present an efficient procedure for automatically
training the filter on new data, with or without ground truth observations. The
framework is evaluated on a large San Francisco taxi dataset and is shown to
improve upon the current state of the art. This filter also provides insights
about driving patterns of drivers. The path inference filter has been deployed
at an industrial scale inside the Mobile Millennium traffic information system,
and is used to map fleets of data in San Francisco, Sacramento, Stockholm and
Porto.
Comment: Preprint, 23 pages and 23 figures
Modelling supported driving as an optimal control cycle: Framework and model characteristics
Driver assistance systems support drivers in operating vehicles in a safe,
comfortable and efficient way, and thus may induce changes in traffic flow
characteristics. This paper puts forward a receding horizon control framework
to model driver assistance and cooperative systems. The accelerations of
automated vehicles are controlled to optimise a cost function, assuming other
vehicles driving at stationary conditions over a prediction horizon. The
flexibility of the framework is demonstrated with controller design of Adaptive
Cruise Control (ACC) and Cooperative ACC (C-ACC) systems. The proposed ACC and
C-ACC model characteristics are investigated analytically, with focus on
equilibrium solutions and stability properties. The proposed ACC model produces
plausible human car-following behaviour and is unconditionally locally stable.
By careful tuning of parameters, the ACC model generates stability
characteristics similar to those of human driver models. The proposed C-ACC model results in
convective downstream and absolute string instability, but not convective
upstream string instability observed in human-driven traffic and in the ACC
model. The control framework and analytical results provide insights into the
influences of ACC and C-ACC systems on traffic flow operations.
Comment: Submitted to Transportation Research Part C: Emerging Technologies
Empowerment for Continuous Agent-Environment Systems
This paper develops generalizations of empowerment to continuous states.
Empowerment is a recently introduced information-theoretic quantity motivated
by hypotheses about the efficiency of the sensorimotor loop in biological
organisms, but also from considerations stemming from curiosity-driven
learning. Empowerment measures, for agent-environment systems with stochastic
transitions, how much influence an agent has on its environment, but only that
influence that can be sensed by the agent's sensors. It is an
information-theoretic generalization of joint controllability (influence on
environment) and observability (measurement by sensors) of the environment by
the agent, both controllability and observability being usually defined in
control theory as the dimensionality of the control/observation spaces. Earlier
work has shown that empowerment has various interesting and relevant
properties, e.g., it allows us to identify salient states using only the
dynamics, and it can act as intrinsic reward without requiring an external
reward. However, in this previous work empowerment was limited to the case of
small-scale and discrete domains and furthermore state transition probabilities
were assumed to be known. The goal of this paper is to extend empowerment to
the significantly more important and relevant case of continuous vector-valued
state spaces and initially unknown state transition probabilities. The
continuous state space is addressed by Monte-Carlo approximation; the unknown
transitions are addressed by model learning and prediction for which we apply
Gaussian process regression with iterated forecasting. In a number of
well-known continuous control tasks we examine the dynamics induced by
empowerment and include an application to exploration and online model
learning.
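For the small-scale discrete case that this abstract attributes to earlier work, empowerment can be computed as the capacity of the channel from actions to sensed successor states. The sketch below uses the Blahut-Arimoto iteration, a standard way to compute channel capacity. It is not the paper's continuous Monte-Carlo and Gaussian-process method; the function name `empowerment_bits` and the matrix layout `channel[a][s] = P(s | a)` are assumptions for illustration.

```python
import math


def _divergences(channel, p):
    # q[s]: marginal over sensed states under the action distribution p.
    n_a, n_s = len(channel), len(channel[0])
    q = [sum(p[a] * channel[a][s] for a in range(n_a)) for s in range(n_s)]
    # d[a]: KL divergence (in nats) between P(. | a) and the marginal q.
    return [
        sum(channel[a][s] * math.log(channel[a][s] / q[s])
            for s in range(n_s) if channel[a][s] > 0)
        for a in range(n_a)
    ]


def empowerment_bits(channel, iters=200):
    """Channel capacity max_p I(A; S') in bits via Blahut-Arimoto."""
    n_a = len(channel)
    p = [1.0 / n_a] * n_a                 # start from uniform actions
    for _ in range(iters):
        d = _divergences(channel, p)
        w = [p[a] * math.exp(d[a]) for a in range(n_a)]
        z = sum(w)
        p = [x / z for x in w]            # reweight towards informative actions
    d = _divergences(channel, p)
    return sum(p[a] * d[a] for a in range(n_a)) / math.log(2)
```

With four actions that lead deterministically to four distinct states the result is 2 bits; if every action induces the same distribution over states, the agent has no sensed influence and the result is 0.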
Statistical Physics of Vehicular Traffic and Some Related Systems
In the so-called "microscopic" models of vehicular traffic, attention is paid
explicitly to each individual vehicle, each of which is represented by a
"particle"; the nature of the "interactions" among these particles is
determined by the way the vehicles influence each other's movement. Therefore,
vehicular traffic, modeled as a system of interacting "particles" driven far
from equilibrium, offers the possibility to study various fundamental aspects
of truly nonequilibrium systems which are of current interest in statistical
physics. Analytical as well as numerical techniques of statistical physics are
being used to study these models to understand the rich variety of physical
phenomena exhibited by vehicular traffic. Some of these phenomena, observed in
vehicular traffic under different circumstances, include transitions from one
dynamical phase to another, criticality and self-organized criticality,
metastability and hysteresis, phase-segregation, etc. In this critical review,
written from the perspective of statistical physics, we explain the guiding
principles behind all the main theoretical approaches. But we present detailed
discussions on the results obtained mainly from the so-called
"particle-hopping" models, particularly emphasizing those which have been
formulated in recent years using the language of cellular automata.
Comment: 170 pages, LaTeX, figures included
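As an illustration of a "particle-hopping" model written as a cellular automaton, here is a minimal sketch of one parallel update of the Nagel-Schreckenberg model on a ring road. The abstract does not say which models the review covers; choosing this one, the function name `nasch_step`, and the default parameters are assumptions.

```python
import random


def nasch_step(positions, velocities, road_length, v_max=5, p_slow=0.3, rng=random):
    """One parallel update of the Nagel-Schreckenberg rules on a ring of cells."""
    n = len(positions)
    order = sorted(range(n), key=lambda i: positions[i])
    new_v = list(velocities)
    for k, i in enumerate(order):
        ahead = order[(k + 1) % n]
        # Number of empty cells between this car and the next one (periodic road).
        gap = (positions[ahead] - positions[i] - 1) % road_length
        v = min(velocities[i] + 1, v_max)       # 1. accelerate
        v = min(v, gap)                         # 2. brake to avoid a collision
        if v > 0 and rng.random() < p_slow:     # 3. random slowdown
            v -= 1
        new_v[i] = v
    # 4. all cars move simultaneously using the old positions.
    new_pos = [(positions[i] + new_v[i]) % road_length for i in range(n)]
    return new_pos, new_v
```

Iterating this step at different car densities and recording the mean velocity gives a flow-density curve; the random slowdown in step 3 is the ingredient that lets jams form spontaneously.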