Projective simulation for artificial intelligence
We propose a model of a learning agent whose interaction with the environment
is governed by a simulation-based projection, which allows the agent to project
itself into future situations before it takes real action. Projective
simulation is based on a random walk through a network of clips, which are
elementary patches of episodic memory. The network of clips changes
dynamically, both due to new perceptual input and due to certain compositional
principles of the simulation process. During simulation, the clips are screened
for specific features which trigger factual action of the agent. The scheme is
different from other, computational, notions of simulation, and it provides a
new element in an embodied cognitive science approach to intelligent action and
learning. Our model provides a natural route for generalization to
quantum-mechanical operation and connects the fields of reinforcement learning
and quantum computation.
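A minimal sketch of a two-layer projective-simulation agent may help fix ideas: percept clips connect to action clips through weighted edges (h-values), the random walk reduces to a single hop, and learning damps all h-values toward their baseline while reinforcing rewarded transitions. The environment interface, parameter values, and the toy task below are illustrative assumptions, not part of the paper.

```python
import numpy as np

class ProjectiveSimulationAgent:
    """Minimal two-layer projective simulation (PS) agent: percept clips
    connect to action clips by weighted edges (h-values), so the random
    walk through the clip network reduces to a single hop."""

    def __init__(self, n_percepts, n_actions, gamma=0.01, reward_scale=1.0):
        # h-values: unnormalized hopping weights from percept to action clips.
        self.h = np.ones((n_percepts, n_actions))
        self.gamma = gamma              # damping: slow forgetting toward h = 1
        self.reward_scale = reward_scale
        self.last = None                # last (percept, action) pair

    def act(self, percept):
        # Simulation step: hop from the percept clip to an action clip
        # with probability proportional to the h-values.
        p = self.h[percept] / self.h[percept].sum()
        action = np.random.choice(len(p), p=p)
        self.last = (percept, action)
        return action

    def learn(self, reward):
        # Damp all h-values toward the baseline, then reinforce the edge
        # traversed on the last (rewarded) walk.
        self.h += -self.gamma * (self.h - 1.0)
        s, a = self.last
        self.h[s, a] += self.reward_scale * reward

# Hypothetical usage on a toy task: matching the percept is rewarded.
agent = ProjectiveSimulationAgent(n_percepts=2, n_actions=2)
for _ in range(500):
    percept = np.random.randint(2)
    action = agent.act(percept)
    agent.learn(1.0 if action == percept else 0.0)
```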
Sequential Quasi-Monte Carlo
We derive and study SQMC (Sequential Quasi-Monte Carlo), a class of
algorithms obtained by introducing QMC point sets in particle filtering. SQMC
is related to, and may be seen as an extension of, the array-RQMC algorithm of
L'Ecuyer et al. (2006). The complexity of SQMC is $O(N \log N)$, where $N$ is
the number of simulations at each iteration, and its error rate is smaller than
the Monte Carlo rate $O_P(N^{-1/2})$. The only requirement to implement SQMC is
the ability to write the simulation of particle $x_t^n$ given $x_{t-1}^n$ as a
deterministic function of $x_{t-1}^n$ and a fixed number of uniform variates.
We show that SQMC is amenable to the same extensions as standard SMC, such as
forward smoothing, backward smoothing, unbiased likelihood evaluation, and so
on. In particular, SQMC may replace SMC within a PMCMC (particle Markov chain
Monte Carlo) algorithm. We establish several convergence results. We provide
numerical evidence that SQMC may significantly outperform SMC in practical
scenarios.
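To make the stated requirement concrete, here is a one-dimensional sketch of a single SQMC step, assuming scipy's qmc module. In one dimension the Hilbert-curve ordering of the general algorithm reduces to a plain sort; the first coordinate of each scrambled Sobol point drives resampling and the second drives propagation, written as a deterministic function of the ancestor and one uniform variate. The transition model in the usage lines is a toy assumption.

```python
import numpy as np
from scipy.stats import qmc

def sqmc_step(x_prev, logw, propagate, seed=0):
    """One step of a one-dimensional SQMC filter.

    x_prev: (N,) particle positions at time t-1
    logw:   (N,) log-weights
    propagate: deterministic map (x, u) -> x_new, with u ~ U(0,1)
    """
    N = len(x_prev)
    # Scrambled Sobol points in [0,1)^2: coordinate 0 drives resampling,
    # coordinate 1 drives propagation.
    u = qmc.Sobol(d=2, scramble=True, seed=seed).random(N)

    # In one dimension the Hilbert sort of SQMC reduces to sorting particles.
    order = np.argsort(x_prev)
    x_sorted = x_prev[order]
    w = np.exp(logw - logw.max())
    w_sorted = w[order] / w[order].sum()

    # QMC resampling: invert the CDF of the sorted weights at the first
    # QMC coordinate, keeping each point paired with its second coordinate.
    cdf = np.cumsum(w_sorted)
    idx = np.minimum(np.searchsorted(cdf, u[:, 0]), N - 1)
    ancestors = x_sorted[idx]

    # Propagation as a deterministic function of the ancestor and one
    # uniform variate -- the key requirement stated in the abstract.
    return propagate(ancestors, u[:, 1])

# Hypothetical usage: a Gaussian random-walk transition via the inverse CDF.
from scipy.stats import norm
x = np.random.randn(128)
logw = -0.5 * x**2
x_new = sqmc_step(x, logw, lambda x, u: x + norm.ppf(u))
```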
Regularized fitted Q-iteration: application to planning
We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-iteration with penalized (or regularized) least-squares regression as the regression subroutine to address the problem of controlling model complexity. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on the quality of the solution and argue that data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.
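A minimal sketch of the scheme, with scikit-learn's kernel ridge regression standing in for the penalized least-squares subroutine in an RKHS; the toy generative model, kernel choice, and penalty value are illustrative assumptions, not the paper's specification.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def regularized_fitted_q_iteration(sample, n_actions, gamma=0.95,
                                   n_iters=50, alpha=1e-2):
    """Fitted Q-iteration with penalized least-squares regression.

    sample: array of transitions (s, a, r, s') from a generative model,
            with s, s' scalar states for simplicity.
    Returns one KernelRidge regressor per action, approximating Q(s, a).
    """
    s, a, r, s_next = (sample[:, 0:1], sample[:, 1].astype(int),
                       sample[:, 2], sample[:, 3:4])
    models = [None] * n_actions
    q_next = np.zeros((len(sample), n_actions))   # start from Q = 0
    for _ in range(n_iters):
        # Regression targets: one-step Bellman backups.
        y = r + gamma * q_next.max(axis=1)
        for act in range(n_actions):
            mask = a == act
            # The ridge penalty (alpha) controls model complexity.
            models[act] = KernelRidge(kernel="rbf", alpha=alpha).fit(
                s[mask], y[mask])
        q_next = np.column_stack([m.predict(s_next) for m in models])
    return models

# Hypothetical usage on a toy chain: action 1 moves right, reward at s > 1.
rng = np.random.default_rng(0)
s0 = rng.uniform(-2, 2, size=1000)
a0 = rng.integers(0, 2, size=1000)
s1 = s0 + np.where(a0 == 1, 0.5, -0.5) + 0.1 * rng.standard_normal(1000)
sample = np.column_stack([s0, a0, (s1 > 1).astype(float), s1])
models = regularized_fitted_q_iteration(sample, n_actions=2)
```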
The velocity distribution of nearby stars from Hipparcos data I. The significance of the moving groups
We present a three-dimensional reconstruction of the velocity distribution of
nearby stars (<~ 100 pc) using a maximum likelihood density estimation
technique applied to the two-dimensional tangential velocities of stars. The
underlying distribution is modeled as a mixture of Gaussian components. The
algorithm reconstructs the error-deconvolved distribution function, even when
the individual stars have unique error and missing-data properties. We apply
this technique to the tangential velocity measurements from a kinematically
unbiased sample of 11,865 main sequence stars observed by the Hipparcos
satellite. We explore various methods for validating the complexity of the
resulting velocity distribution function, including criteria based on Bayesian
model selection and how accurately our reconstruction predicts the radial
velocities of a sample of stars from the Geneva-Copenhagen survey (GCS). Using
this very conservative external validation test based on the GCS, we find that
there is little evidence for structure in the distribution function beyond the
moving groups established prior to the Hipparcos mission. This is in sharp
contrast with internal tests performed here and in previous analyses, which
point consistently to maximal structure in the velocity distribution. We
quantify the information content of the radial velocity measurements and find
that the mean amount of new information gained from a radial velocity
measurement of a single star is significant. This argues for complementary
radial velocity surveys to upcoming astrometric surveys.
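The error-deconvolved mixture fit described above admits a closed-form EM algorithm in which each star carries its own Gaussian error covariance (the "extreme deconvolution" approach). A minimal numpy sketch of one EM pass follows; it ignores the missing-data projections the full method also handles, and the variable names are assumptions.

```python
import numpy as np

def xd_em_step(w, S, alpha, m, V):
    """One EM step for a Gaussian mixture seen through per-point errors:
        w_i = v_i + noise_i,  noise_i ~ N(0, S_i),
        v_i ~ sum_k alpha_k N(m_k, V_k).
    w: (n, d) observations; S: (n, d, d) error covariances;
    alpha: (K,); m: (K, d); V: (K, d, d)."""
    n, d = w.shape
    K = len(alpha)
    q = np.empty((n, K))
    b = np.empty((n, K, d))
    B = np.empty((n, K, d, d))
    for k in range(K):
        T = V[k] + S                                  # (n, d, d)
        Tinv = np.linalg.inv(T)
        diff = w - m[k]
        maha = np.einsum('ni,nij,nj->n', diff, Tinv, diff)
        logdet = np.linalg.slogdet(T)[1]
        q[:, k] = (np.log(alpha[k])
                   - 0.5 * (maha + logdet + d * np.log(2 * np.pi)))
        gain = np.einsum('ij,njk->nik', V[k], Tinv)   # V_k T^{-1}
        b[:, k] = m[k] + np.einsum('nij,nj->ni', gain, diff)
        B[:, k] = V[k] - np.einsum('nij,jk->nik', gain, V[k])
    # Responsibilities (E-step).
    q = np.exp(q - q.max(axis=1, keepdims=True))
    q /= q.sum(axis=1, keepdims=True)

    # M-step: closed-form updates of the deconvolved mixture parameters.
    qk = q.sum(axis=0)
    alpha_new = qk / n
    m_new = np.einsum('nk,nkd->kd', q, b) / qk[:, None]
    r = b - m_new[None, :, :]
    V_new = (np.einsum('nk,nkij->kij', q,
                       np.einsum('nki,nkj->nkij', r, r) + B)
             / qk[:, None, None])
    return alpha_new, m_new, V_new
```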
Deep Reinforcement Learning: An Overview
In recent years, a machine learning method called deep learning has gained
enormous attention, as it has obtained astonishing results in broad
applications such as pattern recognition, speech recognition, computer vision,
and natural language processing. Recent research has also shown that deep
learning techniques can be combined with reinforcement learning methods to
learn useful representations for problems with high-dimensional raw data
input. This chapter reviews the recent advances in deep reinforcement learning,
with a focus on the most used deep architectures, such as autoencoders,
convolutional neural networks, and recurrent neural networks, which have been
successfully combined with the reinforcement learning framework.
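As a minimal illustration of combining a deep architecture with reinforcement learning, the sketch below performs a DQN-style temporal-difference update on a small fully connected Q-network. The environment stand-in and hyperparameters are assumptions; practical systems add experience replay and a separate target network.

```python
import torch
import torch.nn as nn

# Small fully connected Q-network: raw observation -> one Q-value per action.
q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99

def td_update(s, a, r, s_next, done):
    """One DQN-style temporal-difference update on a batch of transitions."""
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped target; a separate target network would go here.
        target = r + gamma * (1 - done) * q_net(s_next).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical usage with random transitions standing in for an environment.
s, s2 = torch.randn(32, 4), torch.randn(32, 4)
a = torch.randint(0, 2, (32,))
r, done = torch.randn(32), torch.zeros(32)
td_update(s, a, r, s2, done)
```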
Primer on using neural networks for forecasting market variables
The ability to forecast market variables is critical to analysts, economists, and investors. Neural networks, which are used across many disciplines to map complex relationships, are gaining popularity as a tool for forecasting market variables.
We present a primer on using neural networks to forecast market variables in general and, in particular, to forecast the volatility of S&P 500 Index futures prices. We compare volatility forecasts from neural networks with implied volatility from S&P 500 Index futures options, computed using the Barone-Adesi and Whaley (BAW) model for pricing American options on futures. Forecasts from neural networks outperform implied volatility forecasts. Volatility forecasts from neural networks are not found to be significantly different from realized volatility, whereas implied volatility forecasts are significantly different from realized volatility in two of three cases.
A revised version of this paper has since been published in the Journal of Business Research; please cite that version: Hamid, S. A., & Iqbal, Z. (2004). Using Neural Networks for Forecasting Volatility of S&P 500 Index Futures Prices. Journal of Business Research, 57(10), 1116-1125.
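A minimal sketch of the kind of network the primer describes: a small multilayer perceptron mapping lagged realized-volatility features to next-period volatility. The synthetic data, feature construction, and hyperparameters here are illustrative assumptions, not the specification used in the paper.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Toy stand-in for futures returns; real work would use S&P 500 futures data.
returns = 0.01 * rng.standard_normal(2000)

# Realized volatility over rolling 20-day windows.
window = 20
rv = np.array([returns[i:i + window].std()
               for i in range(len(returns) - window)])

# Features: the last 5 realized-volatility values; target: the next value.
lags = 5
X = np.column_stack([rv[i:len(rv) - lags + i] for i in range(lags)])
y = rv[lags:]

# Small MLP as the forecasting model; evaluate on a held-out tail.
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
split = int(0.8 * len(y))
model.fit(X[:split], y[:split])
print("out-of-sample R^2:", model.score(X[split:], y[split:]))
```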
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
We consider the problem of finding a near-optimal policy in continuous-space, discounted Markovian decision problems given the trajectory of some behaviour policy. We study the policy iteration algorithm where, in successive iterations, the action-value functions of the intermediate policies are obtained by picking a function from some fixed function set (chosen by the user) that minimizes an unbiased finite-sample approximation to a novel loss function that upper-bounds the unmodified Bellman-residual criterion. The main result is a finite-sample, high-probability bound on the performance of the resulting policy that depends on the mixing rate of the trajectory, the capacity of the function set as measured by a novel capacity concept that we call the VC-crossing dimension, the approximation power of the function set, and the discounted-average concentrability of the future-state distribution. To the best of our knowledge, this is the first theoretical reinforcement learning result for off-policy control learning over continuous state spaces using a single trajectory.
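One common way to build a Bellman-residual loss that can be estimated without double sampling is to subtract from the squared empirical residual of a candidate Q the squared residual of an auxiliary function h fit to the same bootstrapped targets, which removes the variance bias that plain Bellman-residual minimization suffers from on a single sample path. The sketch below computes a loss of this modified form with linear function approximation; the feature map, the closed-form fit of h, and the random data are illustrative assumptions.

```python
import numpy as np

def modified_bellman_loss(theta, phi_sa, phi_s2pi, r, gamma=0.95):
    """Empirical modified Bellman-residual loss for policy evaluation.

    theta:    weights of the candidate linear Q-function
    phi_sa:   (n, d) features of observed (s, a) pairs from the trajectory
    phi_s2pi: (n, d) features of (s', pi(s')) for the evaluated policy pi
    r:        (n,) observed rewards
    """
    q = phi_sa @ theta
    target = r + gamma * (phi_s2pi @ theta)
    # Auxiliary h: least-squares projection of the target onto the features;
    # subtracting its residual removes the variance bias of the plain
    # squared Bellman residual.
    w, *_ = np.linalg.lstsq(phi_sa, target, rcond=None)
    h = phi_sa @ w
    return np.mean((q - target) ** 2) - np.mean((h - target) ** 2)

# Hypothetical usage: random features standing in for a real trajectory.
rng = np.random.default_rng(0)
n, d = 200, 8
phi_sa, phi_s2pi = rng.standard_normal((n, d)), rng.standard_normal((n, d))
r = rng.standard_normal(n)
print(modified_bellman_loss(rng.standard_normal(d), phi_sa, phi_s2pi, r))
```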