Competitive function approximation for reinforcement learning
The application of reinforcement learning to problems with continuous domains requires representing the value function by means of function approximation. We identify two aspects of reinforcement learning that make the function approximation process hard: non-stationarity of the target function and biased sampling. Non-stationarity is the result of the bootstrapping nature of dynamic programming, where the value function is estimated using its current approximation. Biased sampling occurs when some regions of the state space are visited much more often than others, so that repeated updates with similar values wash out the occasional updates of infrequently sampled regions.
We propose a competitive approach to function approximation in which many different local approximators are available at a given input, and the one expected to give the best approximation is selected by means of a relevance function. The local nature of the approximators allows them to adapt quickly to non-stationary changes and mitigates the biased sampling problem. The coexistence of multiple approximators, updated and tried in parallel, yields a good estimate much faster than would be possible with a single approximator. Experiments on different benchmark problems show that the competitive strategy provides faster and more stable learning than non-competitive approaches.
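The selection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the piecewise-constant local models, the running squared-error estimate used as a stand-in for the relevance function, the learning rate, and all names are assumptions made for illustration.

```python
class LocalApproximator:
    """Constant-valued local model on an interval [lo, hi] (hypothetical)."""

    def __init__(self, lo, hi, lr=0.1):
        self.lo, self.hi, self.lr = lo, hi, lr
        self.value = 0.0
        # Running squared-error estimate; a low error means high relevance.
        self.err = 1.0

    def covers(self, x):
        return self.lo <= x <= self.hi

    def update(self, target):
        delta = target - self.value
        self.err += self.lr * (delta * delta - self.err)
        self.value += self.lr * delta


def train(models, samples):
    # Every approximator that covers the input is updated in parallel.
    for x, target in samples:
        for m in models:
            if m.covers(x):
                m.update(target)


def predict(models, x):
    # Competition: among the covering approximators, use the one whose
    # running error estimate is currently the lowest.
    candidates = [m for m in models if m.covers(x)]
    return min(candidates, key=lambda m: m.err).value


# A step function: one wide model competes with two narrow ones.
models = [LocalApproximator(0.0, 1.0),
          LocalApproximator(0.0, 0.5),
          LocalApproximator(0.5, 1.0)]
samples = [((k + 0.5) / 10, 0.0 if k < 5 else 1.0) for k in range(10)] * 200
train(models, samples)
low, high = predict(models, 0.2), predict(models, 0.8)
```

In this toy setting the wide model keeps a large error because it averages across the step, so the narrow models win the competition on each side.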
Tracking-Based Non-Parametric Background-Foreground Classification in a Chromaticity-Gradient Space
This work presents a novel background-foreground classification technique based on adaptive non-parametric kernel estimation in a color-gradient space of components. By combining normalized color components with their gradients, shadows are efficiently suppressed from the results, while the luminance information in the moving objects is preserved. Moreover, a fast multi-region iterative tracking strategy applied over previously detected foreground regions makes it possible to build a robust foreground model which, combined with the background model, noticeably improves the quality of the detections. The proposed strategy has been applied to different kinds of sequences, obtaining satisfactory results in complex situations such as those given by dynamic backgrounds, illumination changes, shadows and multiple moving objects.
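As a rough illustration of why normalized color helps with shadows, the sketch below scores a pixel against a background history with a Gaussian kernel density estimate in (r, g) chromaticity space. It omits the gradient components and the tracking-based foreground model described in the abstract; the bandwidth, the threshold, and all function names are assumed values chosen for the example.

```python
import math


def chromaticity(r, g, b):
    """Normalized color components; intensity changes largely cancel out."""
    s = r + g + b
    return (r / s, g / s) if s > 0 else (1.0 / 3.0, 1.0 / 3.0)


def kde(sample, history, bandwidth):
    """Gaussian-kernel density of `sample` given past background samples."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for h in history:
        d2 = (sample[0] - h[0]) ** 2 + (sample[1] - h[1]) ** 2
        total += norm * math.exp(-0.5 * d2 / bandwidth ** 2)
    return total / len(history)


def is_foreground(pixel, history, bandwidth=0.02, threshold=1.0):
    # Low density under the background model means the pixel is foreground.
    return kde(chromaticity(*pixel), history, bandwidth) < threshold


# Background history for one pixel location: roughly gray observations.
history = [chromaticity(*p) for p in [(100, 100, 100), (102, 99, 101), (98, 101, 100)]]
shadow_is_fg = is_foreground((50, 50, 50), history)   # darker, same chromaticity
red_is_fg = is_foreground((200, 40, 40), history)     # different chromaticity
```

The shadowed pixel keeps the chromaticity of the background and is therefore not flagged, while the red pixel falls far from every background sample.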
Multi-agents adaptive estimation and coverage control using Gaussian regression
We consider a scenario where the aim of a group of agents is to perform the
optimal coverage of a region according to a sensory function. In particular,
centroidal Voronoi partitions have to be computed. The difficulty of the task
is that the sensory function is unknown and has to be reconstructed online
from noisy measurements. Hence, estimation and coverage need to be performed
at the same time. We cast the problem in a Bayesian regression framework, where
the sensory function is seen as a Gaussian random field. Then, we design a set
of control inputs that aim to balance coverage and estimation, and we discuss
the convergence properties of the algorithm. Numerical experiments show
the effectiveness of the new approach.
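For readers unfamiliar with centroidal Voronoi partitions, the following sketch runs Lloyd iterations on a one-dimensional region. It assumes the sensory function is known (here uniform), which is precisely what the abstract does not assume; the grid resolution, the number of agents, and the function names are illustrative choices, and the Gaussian regression step is left out entirely.

```python
def lloyd_step(agents, points, density):
    """Assign each grid point to its nearest agent (its Voronoi cell),
    then move each agent to the density-weighted centroid of its cell."""
    mass = [0.0] * len(agents)
    moment = [0.0] * len(agents)
    for p in points:
        i = min(range(len(agents)), key=lambda k: abs(p - agents[k]))
        w = density(p)
        mass[i] += w
        moment[i] += w * p
    return [moment[i] / mass[i] if mass[i] > 0 else agents[i]
            for i in range(len(agents))]


def uniform(p):
    # Stand-in for the sensory function; assumed known in this sketch.
    return 1.0


# Discretize the region [0, 1] and start three agents bunched on the left.
points = [(k + 0.5) / 5000 for k in range(5000)]
agents = [0.1, 0.2, 0.3]
for _ in range(80):
    agents = lloyd_step(agents, points, uniform)
```

With a uniform density on [0, 1], the three agents settle near 1/6, 1/2 and 5/6, the centroids of three equal cells.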
Sequential Design for Optimal Stopping Problems
We propose a new approach to solve optimal stopping problems via simulation.
Working within the backward dynamic programming/Snell envelope framework, we
augment the methodology of Longstaff-Schwartz that focuses on approximating the
stopping strategy. Namely, we introduce adaptive generation of the stochastic
grids anchoring the simulated sample paths of the underlying state process.
This allows for active learning of the classifiers partitioning the state space
into the continuation and stopping regions. To this end, we examine sequential
design schemes that adaptively place new design points close to the stopping
boundaries. We then discuss dynamic regression algorithms that can implement
such recursive estimation and local refinement of the classifiers. The new
algorithm is illustrated with a variety of numerical experiments, showing that
an order of magnitude savings in terms of design size can be achieved. We also
compare with existing benchmarks in the context of pricing multi-dimensional
Bermudan options. Comment: 24 pages
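The baseline that this abstract builds on, regression-based backward induction in the style of Longstaff-Schwartz, can be sketched as follows for a one-dimensional Bermudan put. This is the plain baseline with a fixed set of simulated paths, not the sequential-design scheme proposed in the paper. The quadratic basis, the geometric Brownian motion dynamics, the parameter values, and the helper names are all assumptions made for the example.

```python
import math
import random


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, 3):
            f = m[r][c] / m[c][c]
            for k in range(c, 4):
                m[r][k] -= f * m[c][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x


def lsm_put(s0, strike, rate, sigma, maturity, steps, n_paths, seed=0):
    rng = random.Random(seed)
    dt = maturity / steps
    disc = math.exp(-rate * dt)
    drift = (rate - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    # Simulate geometric Brownian motion paths on the exercise dates.
    paths = []
    for _ in range(n_paths):
        s, path = s0, []
        for _ in range(steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            path.append(s)
        paths.append(path)
    # cash[i]: payoff along path i, discounted to the date being processed.
    cash = [max(strike - p[-1], 0.0) for p in paths]
    for t in range(steps - 2, -1, -1):
        cash = [c * disc for c in cash]
        itm = [i for i in range(n_paths) if paths[i][t] < strike]
        if len(itm) < 3:
            continue
        # Least squares on the basis (1, x, x^2) via the normal equations.
        a = [[0.0] * 3 for _ in range(3)]
        b = [0.0] * 3
        for i in itm:
            x = paths[i][t] / strike
            phi = (1.0, x, x * x)
            for j in range(3):
                b[j] += phi[j] * cash[i]
                for k in range(3):
                    a[j][k] += phi[j] * phi[k]
        beta = solve3(a, b)
        # Stop where immediate exercise beats the estimated continuation value.
        for i in itm:
            x = paths[i][t] / strike
            continuation = beta[0] + beta[1] * x + beta[2] * x * x
            exercise = strike - paths[i][t]
            if exercise > continuation:
                cash[i] = exercise
    return disc * sum(cash) / n_paths


price = lsm_put(36.0, 40.0, 0.06, 0.2, 1.0, 50, 5000)
```

The regression is fitted on all in-the-money paths at every date, so simulation effort is spread over the whole state space rather than concentrated near the stopping boundary, which is the inefficiency that adaptive placement of design points targets.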
Finding Structural Information of RF Power Amplifiers using an Orthogonal Non-Parametric Kernel Smoothing Estimator
A non-parametric technique for modeling the behavior of power amplifiers is
presented. The proposed technique relies on the principles of density
estimation using the kernel method and is suited for use in power amplifier
modeling. The proposed methodology transforms the input domain into an
orthogonal memory domain. In this domain, non-parametric static functions are
discovered using the kernel estimator. These orthogonal, non-parametric
functions can be fitted with any desired mathematical structure, thus
facilitating their implementation. Furthermore, due to the orthogonality, the
non-parametric functions can be analyzed and discarded individually, which
simplifies pruning basis functions and provides a tradeoff between complexity
and performance. The results show that the methodology can be employed to model
power amplifiers, yielding error performance similar to
state-of-the-art parametric models. Furthermore, a parameter-efficient model
structure with 6 coefficients was derived for a Doherty power amplifier,
significantly reducing the deployment's computational complexity.
Finally, the methodology can also be well exploited in digital linearization
techniques. Comment: Matlab sample code (15 MB):
https://dl.dropboxusercontent.com/u/106958743/SampleMatlabKernel.zi
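As a generic illustration of kernel-based non-parametric estimation of a static function, the sketch below implements a Nadaraya-Watson smoother with a Gaussian kernel. It is not the orthogonal estimator of the paper and it does not model memory effects; the cubic test function, the bandwidth, and the function name are assumptions chosen for the example.

```python
import math


def nadaraya_watson(x_query, xs, ys, bandwidth):
    """Kernel-weighted average of ys, with weights centered on x_query."""
    weights = [math.exp(-0.5 * ((x_query - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total


# Samples of a static nonlinearity (hypothetical): y = x^3 on [0, 1].
xs = [k / 100 for k in range(101)]
ys = [x ** 3 for x in xs]
estimate = nadaraya_watson(0.5, xs, ys, bandwidth=0.05)
```

The estimate at 0.5 lands close to the true value 0.125, with a small smoothing bias that grows with the bandwidth. Once such a curve has been estimated, it can be fitted with a parametric structure of choice, which is the step the abstract refers to.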