Search CORE

112,498 research outputs found

A vision-guided parallel parking system for a mobile robot using approximate policy iteration

Author: Duckett Tom
Shaker Marwan
Yue Shigang
Publication venue
Publication date: 02/09/2010
Field of study

Reinforcement Learning (RL) methods enable autonomous robots to learn skills from scratch by interacting with the environment. However, reinforcement learning can be very time consuming. This paper focuses on accelerating the reinforcement learning process on a mobile robot in an unknown environment. The presented algorithm is based on approximate policy iteration with a continuous state space and a fixed number of actions. The action-value function is represented by a weighted combination of basis functions. Furthermore, a complexity analysis is provided to show that the implemented approach is guaranteed to converge on an optimal policy with less computational time. A parallel parking task is selected for testing purposes. In the experiments, the efficiency of the proposed approach is demonstrated and analyzed through a set of simulated and real robot experiments, with comparison drawn from two well known algorithms (Dyna-Q and Q-learning)

University of Lincoln Institutional Repository

Speculative Approximations for Terascale Analytics

Author: Qin Chengjie
Rusu Florin
Publication venue
Publication date: 31/12/2014
Field of study

Model calibration is a major challenge faced by the plethora of statistical analytics packages that are increasingly used in Big Data applications. Identifying the optimal model parameters is a time-consuming process that has to be executed from scratch for every dataset/model combination even by experienced data scientists. We argue that the incapacity to evaluate multiple parameter configurations simultaneously and the lack of support to quickly identify sub-optimal configurations are the principal causes. In this paper, we develop two database-inspired techniques for efficient model calibration. Speculative parameter testing applies advanced parallel multi-query processing methods to evaluate several configurations concurrently. The number of configurations is determined adaptively at runtime, while the configurations themselves are extracted from a distribution that is continuously learned following a Bayesian process. Online aggregation is applied to identify sub-optimal configurations early in the processing by incrementally sampling the training dataset and estimating the objective function corresponding to each configuration. We design concurrent online aggregation estimators and define halting conditions to accurately and timely stop the execution. We apply the proposed techniques to distributed gradient descent optimization -- batch and incremental -- for support vector machines and logistic regression models. We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big Data analytics system -- and evaluate their performance over terascale-size synthetic and real datasets. The results confirm that as many as 32 configurations can be evaluated concurrently almost as fast as one, while sub-optimal configurations are detected accurately in as little as a

1/20^{\text{th}}

fraction of the time

arXiv.org e-Print Archive

eScholarship - University of California

Competitive function approximation for reinforcement learning

Author: Agostini Alejandro Gabriel
Celaya Llover Enric
Publication venue
Publication date: 01/01/2014
Field of study

The application of reinforcement learning to problems with continuous domains requires representing the value function by means of function approximation. We identify two aspects of reinforcement learning that make the function approximation process hard: non-stationarity of the target function and biased sampling. Non-stationarity is the result of the bootstrapping nature of dynamic programming where the value function is estimated using its current approximation. Biased sampling occurs when some regions of the state space are visited too often, causing a reiterated updating with similar values which fade out the occasional updates of infrequently sampled regions. We propose a competitive approach for function approximation where many different local approximators are available at a given input and the one with expectedly best approximation is selected by means of a relevance function. The local nature of the approximators allows their fast adaptation to non-stationary changes and mitigates the biased sampling problem. The coexistence of multiple approximators updated and tried in parallel permits obtaining a good estimation much faster than would be possible with a single approximator. Experiments in different benchmark problems show that the competitive strategy provides a faster and more stable learning than non-competitive approaches.Preprin

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Digital.CSIC

A Robust Approach to Optimal Matched Filter Design in Ultrasonic Non-Destructive Evaluation (NDE)

Author: Hayward Gordon
Li Minghui
Publication venue: 'AIP Publishing'
Publication date: 01/01/2016
Field of study

The matched filter was demonstrated to be a powerful yet efficient technique to enhance defect detection and imaging in ultrasonic non-destructive evaluation (NDE) of coarse grain materials, provided that the filter was properly designed and optimized. In the literature, in order to accurately approximate the defect echoes, the design utilized the real excitation signals, which made it time consuming and less straightforward to implement in practice. In this paper, we present a more robust and flexible approach to optimal matched filter design using the simulated excitation signals, and the control parameters are chosen and optimized based on the real scenario of array transducer, transmitter-receiver system response, and the test sample, as a result, the filter response is optimized and depends on the material characteristics. Experiments on industrial samples are conducted and the results confirm the great benefits of the method

Digital Repository @ Iowa State University (ISU)

Crossref

Enlighten