31,206 research outputs found

Reinforcement Learning for Racecar Control

Author: Cleland Benjamin George
Publication venue: The University of Waikato
Publication date: 01/01/2006

This thesis investigates the use of reinforcement learning to learn to drive a racecar in the simulated environment of the Robot Automobile Racing Simulator. Real-life race driving is known to be difficult for humans, and expert human drivers use complex sequences of actions. There are a large number of variables, some of which change stochastically and all of which may affect the outcome. This makes driving a promising domain for testing and developing machine learning techniques that have the potential to be robust enough to work in the real world. Therefore the principles of the algorithms from this work may be applicable to a range of problems. The investigation starts by finding a suitable data structure to represent the information learnt. This is tested using supervised learning. Reinforcement learning is added and roughly tuned, and the supervised learning is then removed. A simple tabular representation is found satisfactory, and this avoids difficulties with more complex methods and allows the investigation to concentrate on the essentials of learning. Various reward sources are tested and a combination of three is found to produce the best performance. Exploration of the problem space is investigated. Results show that exploration is essential, but controlling how much is done is also important. It turns out that the learning episodes need to be very long, and because of this the task needs to be treated as continuous by using discounting to limit the size of the variables stored. Eligibility traces are used with success to make the learning more efficient. The tabular representation is made more compact by hashing and more accurate by using smaller buckets. This slows the learning but produces better driving. The improvement given by a rough form of generalisation indicates that the replacement of the tabular method by a function approximator is warranted.
These results show that reinforcement learning can work within the Robot Automobile Racing Simulator, and lay the foundations for building a more efficient and competitive agent.
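
The ingredients named in this abstract (a hashed tabular representation over bucketed state variables, discounting, and eligibility traces) can be illustrated with a minimal sketch. This is not the thesis's implementation: the choice of Sarsa(lambda) with replacing traces, the class name HashedSarsaLambda, and every parameter value below are assumptions made only for illustration.

```python
import random


def bucket(state, width):
    """Discretise a tuple of continuous values into integer buckets."""
    return tuple(int(x // width) for x in state)


class HashedSarsaLambda:
    """Tabular Sarsa(lambda) whose table rows are addressed by hashing buckets.

    Hypothetical illustration: names and defaults are not from the thesis.
    """

    def __init__(self, n_actions, table_size=4096, width=0.5,
                 alpha=0.1, gamma=0.95, lam=0.8, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.table_size = table_size
        self.width = width          # smaller width = finer buckets
        self.alpha = alpha          # learning rate
        self.gamma = gamma          # discount factor (continuing task)
        self.lam = lam              # trace decay
        self.epsilon = epsilon      # exploration rate
        self.rng = random.Random(seed)
        self.q = [[0.0] * n_actions for _ in range(table_size)]
        self.e = {}                 # eligibility traces keyed by (slot, action)

    def slot(self, state):
        """Map a continuous state to a table row; distinct buckets may collide."""
        return hash(bucket(state, self.width)) % self.table_size

    def act(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[self.slot(state)]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, s, a, r, s2, a2):
        """One Sarsa(lambda) step with replacing traces."""
        i, j = self.slot(s), self.slot(s2)
        delta = r + self.gamma * self.q[j][a2] - self.q[i][a]
        self.e[(i, a)] = 1.0
        for key, trace in list(self.e.items()):
            k, b = key
            self.q[k][b] += self.alpha * delta * trace
            decayed = trace * self.gamma * self.lam
            if decayed < 1e-4:
                del self.e[key]     # drop negligible traces to bound memory
            else:
                self.e[key] = decayed
```

In this sketch, reducing width gives finer buckets, so more distinct states must be visited before the table fills in. That is one way to read the abstract's remark that smaller buckets slow learning while improving the resulting driving.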

Research Commons@Waikato

Combining case based reasoning with neural networks

Authors: Murray-Smith R; Thakar S.
Publication venue: AAAI Press
Publication date: 01/01/1993

This paper presents a neural network based technique for mapping problem situations to problem solutions for Case-Based Reasoning (CBR) applications. Both neural networks and CBR are instance-based learning techniques, although neural nets work with numerical data and CBR systems work with symbolic data. This paper discusses how the application scope of both paradigms could be enhanced by the use of hybrid concepts. To make the use of neural networks possible, the problem's situation and solution features are transformed into continuous features, using techniques similar to CBR's definition of similarity metrics. Radial Basis Function (RBF) neural nets are used to create a multivariable, continuous input-output mapping. As the mapping is continuous, this technique also provides generalisation between cases, replacing the domain-specific solution adaptation techniques required by conventional CBR. This continuous representation also allows, as in fuzzy logic, an associated membership measure to be output with each symbolic feature, aiding the prioritisation of various possible solutions. A further advantage is that, as the RBF neurons are only active in a limited area of the input space, the solution can be accompanied by local estimates of accuracy, based on the sufficiency of the cases present in that area as well as the results measured during testing. We describe how the application of this technique could be of benefit to the real-world problem of sales advisory systems, among others.
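
The idea of a continuous case-to-solution mapping with a local support measure can be sketched as follows. This is a simplified stand-in, not the paper's method: it uses a normalised Gaussian RBF with one unit per stored case and takes the case solutions directly as output weights instead of training them. The function name rbf_predict and the width parameter are assumptions.

```python
import math


def rbf_predict(cases, query, width=1.0):
    """Estimate a solution value for `query` from stored cases.

    cases: list of (situation_vector, solution_value) pairs, already
           encoded as continuous features (the encoding is assumed).
    Returns (estimate, support). `support` is the summed unit activation:
    a value near zero means no stored case lies near the query, so the
    estimate should not be trusted.
    """
    activations = []
    for situation, _ in cases:
        dist_sq = sum((q - s) ** 2 for q, s in zip(query, situation))
        # Gaussian unit: only active in a limited region around its case
        activations.append(math.exp(-dist_sq / (2.0 * width ** 2)))
    support = sum(activations)
    if support == 0.0:
        return None, 0.0
    estimate = sum(a * sol for a, (_, sol) in zip(activations, cases)) / support
    return estimate, support
```

For example, with cases = [((0.0,), 1.0), ((2.0,), 3.0)], a query at (1.0,) lies midway between the two cases and is interpolated between their solutions, while a query far from both cases returns a support of 0.0, signalling that no local evidence exists.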

Enlighten

The impact of source-side syntactic reordering on hierarchical phrase-based SMT

Authors: Du Jinhua; Way Andy
Publication venue: European Association for Machine Translation
Publication date: 01/05/2010

Syntactic reordering has been demonstrated to be helpful and effective for handling different word orders between source and target languages in SMT. However, in terms of hierarchical PB-SMT (HPB), does syntactic reordering still have a significant impact on performance? This paper introduces a reordering approach which explores the 的 (DE) grammatical structure in Chinese. We employ the Stanford DE classifier to recognise the DE structures in both training and test sentences of Chinese, and then perform word reordering to make the Chinese sentences better match the word order of English. The annotated and reordered training data and test data are applied to a re-implemented HPB system and the impact of the DE construction is examined. The experiments are conducted on the NIST 2008 evaluation data, and experimental results show that the BLEU and METEOR scores are significantly improved by 1.83/8.91 and 1.17/2.73 absolute/relative points respectively.
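
The reordering step described above can be illustrated with a toy sketch. It assumes the DE construction has already been located and labelled. The label names in SWAP_LABELS, the span format, and the function reorder_de are hypothetical and do not reproduce the Stanford DE classifier's actual output or the paper's reordering rules.

```python
# Hypothetical labels for DE constructions whose English rendering puts the
# head noun before the modifier; the names are illustrative only.
SWAP_LABELS = {"B_PREP_A", "RELATIVE_CLAUSE"}


def reorder_de(tokens, mod_span, de_index, head_span, label):
    """Swap modifier and head around DE when the label calls for it.

    tokens:    list of source-side tokens
    mod_span:  (start, end) of the modifier before DE, end exclusive
    de_index:  position of the DE token
    head_span: (start, end) of the head after DE, end exclusive
    """
    if label not in SWAP_LABELS:
        return list(tokens)
    m0, m1 = mod_span
    h0, h1 = head_span
    if m1 != de_index or h0 != de_index + 1:
        raise ValueError("expected contiguous 'modifier DE head' spans")
    return (tokens[:m0] + tokens[h0:h1] + [tokens[de_index]]
            + tokens[m0:m1] + tokens[h1:])
```

Applied to the relative clause 我 昨天 买 的 书 ("the book that I bought yesterday") with the modifier span (0, 3), DE at index 3, and the head span (4, 5), the sketch moves the head 书 in front of DE, giving an order closer to the English one.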

CiteSeerX

Irish Universities

DCU Online Research Access Service

Addressing Appearance Change in Outdoor Robotics with Adversarial Domain Adaptation

Authors: Bewley Alex; Posner Ingmar; Wulfmeier Markus
Publication venue
Publication date: 01/01/2017

Appearance changes due to weather and seasonal conditions represent a strong impediment to the robust implementation of machine learning systems in outdoor robotics. While supervised learning optimises a model for the training domain, it will deliver degraded performance in application domains that are subject to distributional shifts caused by these changes. Traditionally, this problem has been addressed via the collection of labelled data in multiple domains or by imposing priors on the type of shift between both domains. We frame the problem in the context of unsupervised domain adaptation and develop a framework for applying adversarial techniques to adapt popular, state-of-the-art network architectures with the additional objective to align features across domains. Moreover, as adversarial training is notoriously unstable, we first perform an extensive ablation study, adapting many techniques known to stabilise generative adversarial networks, and evaluate on a surrogate classification task with the same appearance change. The distilled insights are applied to the problem of free-space segmentation for motion planning in autonomous driving. Comment: In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017).

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive