Learning theory estimates with observations from general stationary stochastic processes
This paper investigates the supervised learning problem with observations
drawn from certain general stationary stochastic processes. Here by
\emph{general}, we mean that many stationary stochastic processes can be
included. We show that when the stochastic processes satisfy a generalized
Bernstein-type inequality, a unified treatment on analyzing the learning
schemes with various mixing processes can be conducted and a sharp oracle
inequality for generic regularized empirical risk minimization schemes can be
established. The obtained oracle inequality is then applied to derive
convergence rates for several learning schemes such as empirical risk
minimization (ERM), least squares support vector machines (LS-SVMs) using given
generic kernels, and SVMs using Gaussian kernels for both least squares and
quantile regression. It turns out that for i.i.d. processes, our learning rates
for ERM recover the optimal rates. On the other hand, for non-i.i.d. processes
including geometrically α-mixing Markov processes, geometrically
α-mixing processes with restricted decay, φ-mixing processes, and
(time-reversed) geometrically C-mixing processes, our learning
rates for SVMs with Gaussian kernels match, up to some arbitrarily small extra
term in the exponent, the optimal rates. For the remaining cases, our rates are
at least close to the optimal rates. As a by-product, the assumed generalized
Bernstein-type inequality also provides an interpretation of the so-called
"effective number of observations" for various mixing processes.Comment: arXiv admin note: text overlap with arXiv:1501.0305
Learning from dependent observations
In most papers establishing consistency for learning algorithms it is assumed
that the observations used for training are realizations of an i.i.d. process.
In this paper we go far beyond this classical framework by showing that support
vector machines (SVMs) essentially only require that the data-generating
process satisfies a certain law of large numbers. We then consider the
learnability of SVMs for α-mixing (not necessarily stationary) processes for
both classification and regression, where for the latter we explicitly allow
unbounded noise.
Comment: submitted to Journal of Multivariate Analysis
Reinforcement Learning: Stochastic Approximation Algorithms for Markov Decision Processes
This article presents a short and concise description of stochastic
approximation algorithms in reinforcement learning of Markov decision
processes. The algorithms can also be used as a suboptimal method for partially
observed Markov decision processes.
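As a concrete illustration of such a stochastic approximation scheme, tabular Q-learning can be sketched on a made-up two-state MDP (the dynamics, rewards, step-size schedule, and constants below are illustrative assumptions, not taken from the article):

```python
import numpy as np

# Minimal tabular Q-learning sketch: a stochastic approximation
# algorithm for MDPs. The 2-state, 2-action MDP is a toy example.
n_states, n_actions = 2, 2
P = np.array([[0, 1], [0, 1]])          # P[s, a] = deterministic next state
R = np.array([[0.0, 1.0], [0.0, 2.0]])  # R[s, a] = reward
gamma = 0.9

rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))
s = 0
for t in range(1, 20001):
    a = rng.integers(n_actions)          # exploratory behaviour policy
    s2, r = P[s, a], R[s, a]
    alpha = 1.0 / t**0.6                 # Robbins-Monro step sizes
    # stochastic-approximation update toward the Bellman target
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    s = s2
```

With a decaying step size satisfying the usual Robbins-Monro conditions and persistent exploration, the iterates approach the optimal action-value function, here favouring action 1 in both states.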
Fast Bayesian inference of the multivariate Ornstein-Uhlenbeck process
The multivariate Ornstein-Uhlenbeck process is used in many branches of
science and engineering to describe the regression of a system to its
stationary mean. Here we present a Bayesian method to estimate the
drift and diffusion matrices of the process from discrete observations of a
sample path. We use exact likelihoods, expressed in terms of four sufficient
statistic matrices, to derive explicit maximum a posteriori parameter estimates
and their standard errors. We apply the method to the Brownian harmonic
oscillator, a bivariate Ornstein-Uhlenbeck process, to jointly estimate its
mass, damping, and stiffness and to provide Bayesian estimates of the
correlation functions and power spectral densities. We present a Bayesian model
comparison procedure, embodying Ockham's razor, to guide a data-driven choice
between the Kramers and Smoluchowski limits of the oscillator. These provide
novel methods of analyzing the inertial motion of colloidal particles in
optical traps.
Comment: added published version
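The exact-likelihood idea can be sketched in the scalar special case (the paper treats the multivariate process with priors and sufficient-statistic matrices; the parameter values and seed below are illustrative assumptions). Discretely observed, a scalar Ornstein-Uhlenbeck process is an exact AR(1) recursion, so the drift has a closed-form maximum-likelihood estimate:

```python
import numpy as np

# Scalar OU process dX = -theta * X dt + sigma * dW observed at step dt.
# The exact transition is X_{k+1} = a X_k + eps_k with a = exp(-theta*dt),
# so the likelihood is available in closed form.
rng = np.random.default_rng(1)
theta, sigma, dt, n = 1.0, 0.5, 0.1, 20000

a = np.exp(-theta * dt)
noise_sd = sigma * np.sqrt((1 - a**2) / (2 * theta))  # exact transition noise
x = np.empty(n)
x[0] = 0.0
for k in range(n - 1):
    x[k + 1] = a * x[k] + noise_sd * rng.standard_normal()

# Closed-form MLE of the AR(1) coefficient (a ratio of sufficient
# statistics), mapped back to the continuous-time drift.
a_hat = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])
theta_hat = -np.log(a_hat) / dt
```

The multivariate case in the paper generalizes this ratio to four sufficient-statistic matrices and adds priors, but the estimator above shows why the posterior mode is available explicitly.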
Managing engineering systems with large state and action spaces through deep reinforcement learning
Decision-making for engineering systems can be efficiently formulated as a
Markov Decision Process (MDP) or a Partially Observable MDP (POMDP). Typical
MDP and POMDP solution procedures utilize offline knowledge about the
environment and provide detailed policies for relatively small systems with
tractable state and action spaces. However, in large multi-component systems
the sizes of these spaces easily explode, as system states and actions scale
exponentially with the number of components, whereas environment dynamics are
difficult to describe in explicit form for the entire system and may only
be accessible through numerical simulators. In this work, to address these
issues, an integrated Deep Reinforcement Learning (DRL) framework is
introduced. The Deep Centralized Multi-agent Actor Critic (DCMAC) is developed,
an off-policy actor-critic DRL approach, providing efficient life-cycle
policies for large multi-component systems operating in high-dimensional
spaces. Apart from deep function approximations that parametrize large state
spaces, DCMAC also adopts a factorized representation of the system actions,
being able to designate individualized component- and subsystem-level
decisions, while maintaining a centralized value function for the entire
system. DCMAC compares well against Deep Q-Network (DQN) solutions and exact
policies, where applicable, and outperforms optimized baselines that are based
on time-based, condition-based, and periodic policies.
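The factorized action representation can be illustrated as follows. This is not the DCMAC architecture itself (which is a trained deep actor-critic); it is a minimal sketch, with made-up sizes and random placeholder weights, of a policy that outputs one categorical distribution per component so the joint action space of size n_actions ** n_components is never enumerated:

```python
import numpy as np

rng = np.random.default_rng(0)
n_components, n_actions, state_dim = 5, 3, 8

# One linear "head" per component mapping the shared system state to
# per-component action logits; weights stand in for a trained network.
W = rng.standard_normal((n_components, n_actions, state_dim))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def factored_policy(state):
    """Sample one action per component from its own categorical head."""
    logits = W @ state                      # shape (n_components, n_actions)
    probs = softmax(logits)
    return np.array([rng.choice(n_actions, p=p) for p in probs])

state = rng.standard_normal(state_dim)
joint_action = factored_policy(state)       # one decision per component
```

Sampling per component keeps the policy output linear in the number of components, while a centralized critic can still score the resulting joint action.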
A Bernstein-type Inequality for Some Mixing Processes and Dynamical Systems with an Application to Learning
We establish a Bernstein-type inequality for a class of stochastic processes
that include the classical geometrically α-mixing processes, Rio's
generalization of these processes, as well as many time-discrete dynamical
systems. Modulo a logarithmic factor and some constants, our Bernstein-type
inequality coincides with the classical Bernstein inequality for i.i.d. data.
We further use this new Bernstein-type inequality to derive an oracle
inequality for generic regularized empirical risk minimization algorithms and
data generated by such processes. Applying this oracle inequality to support
vector machines using Gaussian kernels for both least squares and quantile
regression, we show that the resulting learning rates match, up to some
arbitrarily small extra term in the exponent, the optimal rates for
i.i.d. processes.
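For reference, the classical i.i.d. benchmark that the abstract compares against can be stated in a standard textbook form (not quoted from the paper): for i.i.d. random variables $\xi_1,\dots,\xi_n$ with $\mathbb{E}\xi_i = 0$, $|\xi_i| \le B$, $\operatorname{Var}\xi_i \le \sigma^2$, and any $\tau > 0$,

$$ \mathbb{P}\!\left( \frac{1}{n}\sum_{i=1}^{n} \xi_i \;\ge\; \sqrt{\frac{2\sigma^2 \tau}{n}} + \frac{2B\tau}{3n} \right) \;\le\; e^{-\tau}. $$

The paper's generalized inequality recovers this bound for dependent data up to a logarithmic factor and constants, with $n$ replaced by an effective number of observations.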
A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity
The key challenge in multiagent learning is learning a best response to the
behaviour of other agents, which may be non-stationary: if the other agents
adapt their strategy as well, the learning target moves. Disparate streams of
research have approached non-stationarity from several angles, which make a
variety of implicit assumptions that make it hard to keep an overview of the
state of the art and to validate the innovation and significance of new works.
This survey presents a coherent overview of work that addresses
opponent-induced non-stationarity with tools from game theory, reinforcement
learning and multi-armed bandits. Further, we reflect on the principal
approaches by which algorithms model and cope with this non-stationarity, arriving
at a new framework and five categories (in increasing order of sophistication):
ignore, forget, respond to target models, learn models, and theory of mind. A
wide range of state-of-the-art algorithms is classified into a taxonomy, using
these categories and key characteristics of the environment (e.g.,
observability) and adaptation behaviour of the opponents (e.g., smooth,
abrupt). To clarify further, we present illustrative variations of one
domain, contrasting the strengths and limitations of each category. Finally, we
discuss in which environments the different approaches yield the most merit, and
point to promising avenues of future research.
Comment: 64 pages, 7 figures. Under review since November 201
On Parameter Estimation of Hidden Telegraph Process
The problem of parameter estimation from observations of a two-state
telegraph process in the presence of white Gaussian noise is considered. The
properties of the method-of-moments estimator are described in the
large-sample asymptotics. This estimator is then used as a preliminary one to
construct the one-step MLE-process, which provides asymptotically normal
and asymptotically efficient estimation of the unknown parameters.
Comment: 26 pages
Bayesian Nonparametric Spectral Estimation
Spectral estimation (SE) aims to identify how the energy of a signal (e.g., a
time series) is distributed across different frequencies. This can become
particularly challenging when only partial and noisy observations of the signal
are available, where current methods fail to handle uncertainty appropriately.
In this context, we propose a joint probabilistic model for signals,
observations and spectra, where SE is addressed as an exact inference problem.
Assuming a Gaussian process prior over the signal, we apply Bayes' rule to find
the analytic posterior distribution of the spectrum given a set of
observations. Besides its expressiveness and natural account of spectral
uncertainty, the proposed model also provides a functional-form representation
of the power spectral density, which can be optimised efficiently. Comparison
with previous approaches, in particular against Lomb-Scargle, is addressed
theoretically and also experimentally in three different scenarios. Code and
demo available at https://github.com/GAMES-UChile/BayesianSpectralEstimation.
Comment: 11 pages. In Advances in Neural Information Processing Systems, 201
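The exact-inference step underlying this approach is standard Gaussian-process conditioning: given noisy, partial observations of a signal with a GP prior, the posterior over the signal is Gaussian and analytic, and so is the posterior over any linear functional of it, such as its Fourier transform. A minimal sketch of the conditioning step (kernel, noise level, and data below are illustrative assumptions, not the paper's model):

```python
import numpy as np

# Squared-exponential prior covariance over time points.
def rbf(t1, t2, ell=0.5):
    return np.exp(-0.5 * (t1[:, None] - t2[None, :])**2 / ell**2)

rng = np.random.default_rng(0)
t_obs = np.sort(rng.uniform(0, 5, 30))           # partial, irregular sampling
y = np.sin(2 * np.pi * t_obs) + 0.05 * rng.standard_normal(30)

t_grid = np.linspace(0, 5, 200)
K = rbf(t_obs, t_obs) + 0.05**2 * np.eye(30)     # prior + noise covariance
Ks = rbf(t_grid, t_obs)
alpha = np.linalg.solve(K, y)
post_mean = Ks @ alpha                            # posterior mean of the signal
post_cov = rbf(t_grid, t_grid) - Ks @ np.linalg.solve(K, Ks.T)
```

Because the spectrum is a linear functional of the signal, pushing this Gaussian posterior through the (windowed) Fourier transform yields the analytic posterior over the spectrum that the paper exploits.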
Nonparametric Online Learning Using Lipschitz Regularized Deep Neural Networks
Deep neural networks are considered to be state of the art models in many
offline machine learning tasks. However, their performance and generalization
abilities in online learning tasks are much less understood. Therefore, we
focus on online learning and tackle the challenging setting where the
underlying process is stationary and ergodic, thus removing the i.i.d.
assumption and allowing observations to depend on each other arbitrarily. We
prove the generalization abilities of Lipschitz regularized deep neural
networks and show that by using those networks, a convergence to the best
possible prediction strategy is guaranteed.
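The role of Lipschitz regularization can be illustrated with a simple construction (a sketch only; the paper's actual regularization scheme and architectures differ, and the layer sizes here are made up): dividing each weight matrix of a ReLU network by its spectral norm bounds the network's Lipschitz constant by 1, since ReLU is itself 1-Lipschitz.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two-layer ReLU network with hypothetical sizes.
W1 = rng.standard_normal((16, 4))
W2 = rng.standard_normal((1, 16))

# Rescale each layer by its spectral norm (largest singular value);
# the product of layer Lipschitz constants then bounds the network's.
W1 /= np.linalg.norm(W1, 2)
W2 /= np.linalg.norm(W2, 2)

def f(x):
    return W2 @ np.maximum(W1 @ x, 0.0)

x, y = rng.standard_normal(4), rng.standard_normal(4)
ratio = np.abs(f(x) - f(y)) / np.linalg.norm(x - y)   # <= 1 by construction
```

Controlling this constant is what keeps the predictor stable under the arbitrary dependence between observations that the stationary-ergodic setting allows.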