Learning Linear Dynamical Systems via Spectral Filtering
We present an efficient and practical algorithm for the online prediction of
discrete-time linear dynamical systems with a symmetric transition matrix. We
circumvent the non-convex optimization problem using improper learning:
carefully overparameterize the class of LDSs by a polylogarithmic factor, in
exchange for convexity of the loss functions. From this arises a
polynomial-time algorithm with a near-optimal regret guarantee, with an
analogous sample complexity bound for agnostic learning. Our algorithm is based
on a novel filtering technique, which may be of independent interest: we
convolve the time series with the eigenvectors of a certain Hankel matrix.
Comment: Published as a conference paper at NIPS 2017.
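The filtering step described above can be sketched as follows. This is an illustrative reconstruction, not the paper's pseudocode: the specific Hankel matrix entries Z[i,j] = 2/((i+j)^3 - (i+j)) are an assumption based on the construction described in this line of work, and all function names are hypothetical.

```python
import numpy as np

def hankel_filters(T, k):
    """Top-k eigenvectors of a T x T Hankel matrix, used as convolution filters.

    Entries Z[i, j] = 2 / ((i + j)^3 - (i + j)) for 1-based i, j
    (assumed form of the matrix; hedged, for illustration only).
    """
    idx = np.arange(1, T + 1)
    s = idx[:, None] + idx[None, :]        # i + j
    Z = 2.0 / (s**3 - s)
    eigvals, eigvecs = np.linalg.eigh(Z)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]  # keep the k largest
    return eigvals[order], eigvecs[:, order]

def featurize(x_history, filters):
    """Convolve the most recent T steps of the series with each filter."""
    T = filters.shape[0]
    window = x_history[-T:]
    return filters.T @ window              # one scalar feature per filter

# Toy usage: spectral features for a scalar input series.
vals, phi = hankel_filters(T=32, k=4)
x = np.sin(0.3 * np.arange(100))
features = featurize(x, phi)
```

The convexity payoff is that the prediction becomes a linear function of these fixed features, so the regression over filter coefficients is convex even though the underlying LDS identification problem is not.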
Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization
Stochastic Gradient Descent (SGD) has played a central role in machine
learning. However, it requires a carefully hand-picked stepsize for fast
convergence, which is notoriously tedious and time-consuming to tune. Over the
last several years, a plethora of adaptive gradient-based algorithms have
emerged to ameliorate this problem. They have proved efficient in reducing the
labor of tuning in practice, but many of them lack theoretic guarantees even in
the convex setting. In this paper, we propose new surrogate losses to cast the
problem of learning the optimal stepsizes for the stochastic optimization of a
non-convex smooth objective function onto an online convex optimization
problem. This allows the use of no-regret online algorithms to compute optimal
stepsizes on the fly. In turn, this results in an SGD algorithm with self-tuned
stepsizes that guarantees convergence rates that are automatically adaptive to
the level of noise.
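A minimal sketch of the reduction, under assumptions: the objective is M-smooth, and the convex surrogate loss in eta is taken to be l_t(eta) = (-eta + M*eta^2/2) * ||g_t||^2, which smoothness motivates; the paper's exact surrogate may differ, and all names here are illustrative. Projected online gradient descent on this surrogate updates the stepsize after every SGD step.

```python
import numpy as np

def sgd_self_tuned(grad, x0, M, T, eta_max=1.0, meta_lr=0.01, seed=0):
    """SGD whose stepsize eta is tuned by an online learner (sketch).

    Assumed surrogate loss, convex in eta:
        l_t(eta) = (-eta + 0.5 * M * eta**2) * ||g_t||**2
    Projected online gradient descent on l_t updates eta on the fly.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    eta = eta_max / 2.0
    for _ in range(T):
        g = grad(x, rng)
        x = x - eta * g                          # SGD step with current eta
        dl = (-1.0 + M * eta) * float(g @ g)     # d l_t / d eta
        eta = min(max(eta - meta_lr * dl, 0.0), eta_max)  # projected OGD step
    return x, eta

# Toy usage: noisy quadratic f(x) = 0.5 * ||x||^2, so M = 1.
grad = lambda x, rng: x + 0.05 * rng.standard_normal(x.shape)
x_final, eta_final = sgd_self_tuned(grad, x0=np.ones(5), M=1.0, T=500)
```

Note the division of labor: the inner loop is plain SGD, while a no-regret learner in the single scalar eta absorbs all the tuning.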
Non-stationary Online Learning with Memory and Non-stochastic Control
We study the problem of Online Convex Optimization (OCO) with memory, which
allows loss functions to depend on past decisions and thus captures temporal
effects of learning problems. In this paper, we introduce dynamic policy regret
as the performance measure for designing algorithms robust to non-stationary
environments; this measure compares the algorithm's decisions against a sequence of changing
comparators. We propose a novel algorithm for OCO with memory that provably
enjoys an optimal dynamic policy regret. The key technical challenge is how to
control the switching cost, i.e., the cumulative movement of the player's decisions,
which is neatly addressed by a novel decomposition of dynamic policy regret and
an appropriate meta-expert structure. Furthermore, we apply the results to the
problem of online non-stochastic control, i.e., controlling a linear dynamical
system with adversarial disturbance and convex loss functions. We derive a
novel gradient-based controller with dynamic policy regret guarantees, which is
the first controller competitive with a sequence of changing policies.
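The interplay between memory and switching cost can be illustrated with a much simpler device than the paper's meta-expert construction: run OGD on a unary surrogate and track how much the iterates move. The round loss below (tracking a target plus a movement penalty) is an invented example, not the paper's setting.

```python
import numpy as np

def ogd_with_memory(targets, x0, eta):
    """OGD sketch for OCO with one step of memory (illustrative only).

    Assumed round-t loss:  f_t(x_{t-1}, x_t) = ||x_t - c_t||^2 + ||x_t - x_{t-1}||^2.
    OGD is run on the unary surrogate f_t(x, x)'s tracking part, whose
    gradient is 2(x - c_t); a small eta keeps the switching cost
    sum_t ||x_t - x_{t-1}|| small, the key quantity in such analyses.
    """
    x = np.asarray(x0, dtype=float)
    total_loss = 0.0
    switching_cost = 0.0
    for c in targets:
        x_prev = x.copy()
        x = x - eta * 2.0 * (x - c)              # OGD step on the surrogate
        total_loss += float((x - c) @ (x - c) + (x - x_prev) @ (x - x_prev))
        switching_cost += float(np.linalg.norm(x - x_prev))
    return x, total_loss, switching_cost

# Toy usage: slowly drifting targets, mimicking a non-stationary environment.
targets = [np.array([np.cos(0.01 * t), np.sin(0.01 * t)]) for t in range(200)]
x_fin, loss, sw = ogd_with_memory(targets, x0=np.zeros(2), eta=0.1)
```

Because the targets drift slowly, the iterates track them with a small, stable lag, and the accumulated movement stays bounded.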
Oracle Efficient Online Multicalibration and Omniprediction
A recent line of work has shown a surprising connection between
multicalibration, a multi-group fairness notion, and omniprediction, a learning
paradigm that provides simultaneous loss minimization guarantees for a large
family of loss functions. Prior work studies omniprediction in the batch
setting. We initiate the study of omniprediction in the online adversarial
setting. Although there exist algorithms for obtaining notions of
multicalibration in the online adversarial setting, unlike batch algorithms,
they work only for small finite classes of benchmark functions, because
they require enumerating every function at every round. In contrast,
omniprediction is most interesting for learning-theoretic hypothesis classes,
which are generally infinitely large.
We develop a new online multicalibration algorithm that is well defined for
infinite benchmark classes and is oracle efficient (i.e., for any benchmark
class, the algorithm has the form of an efficient reduction to a no-regret
learning algorithm for that class). The result is the first efficient online
omnipredictor -- an oracle efficient prediction algorithm that can be used to
simultaneously obtain no-regret guarantees with respect to all Lipschitz convex loss
functions. For the class of linear functions, we show how to make our
algorithm efficient in the worst case. Also, we show upper and lower bounds on
the extent to which our rates can be improved: our oracle efficient algorithm
actually promises a stronger guarantee called swap-omniprediction, and we prove
a lower bound showing that obtaining the corresponding rates for
swap-omniprediction is impossible in the online setting. On the other hand, we
give a (non-oracle-efficient) algorithm that can obtain the optimal
omniprediction bounds without going through multicalibration,
giving an information-theoretic separation between these two solution concepts.
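To make the multicalibration notion itself concrete (this is a toy illustration of the definition, not the paper's oracle-efficient algorithm), one can maintain running label averages per (group, prediction-bucket) pair: multicalibration asks that predictions be calibrated simultaneously on every group. All names below are hypothetical.

```python
from collections import defaultdict

class BucketMulticalibrator:
    """Per-(group, bucket) running averages: a minimal illustration of
    the multicalibration notion, not an oracle-efficient algorithm."""

    def __init__(self, n_buckets=10):
        self.n = n_buckets
        self.count = defaultdict(int)    # (group, bucket) -> #observations
        self.total = defaultdict(float)  # (group, bucket) -> sum of labels

    def _bucket(self, p):
        return min(int(p * self.n), self.n - 1)

    def predict(self, groups, p_base):
        """Replace a base prediction by the average of the per-group
        empirical frequencies observed in its bucket (simple heuristic)."""
        b = self._bucket(p_base)
        ests = [self.total[(g, b)] / self.count[(g, b)]
                for g in groups if self.count[(g, b)] > 0]
        return sum(ests) / len(ests) if ests else p_base

    def update(self, groups, p_base, y):
        b = self._bucket(p_base)
        for g in groups:
            self.count[(g, b)] += 1
            self.total[(g, b)] += y
```

A group whose labels systematically deviate from the base prediction pulls its bucket's estimate toward the group's true frequency, which is exactly the per-group calibration condition the paper's algorithms enforce for infinite benchmark classes.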
A Modern Introduction to Online Learning
In this monograph, I introduce the basic concepts of Online Learning through
a modern view of Online Convex Optimization. Here, online learning refers to
the framework of regret minimization under worst-case assumptions. I present
first-order and second-order algorithms for online learning with convex losses,
in Euclidean and non-Euclidean settings. All the algorithms are clearly
presented as instantiations of Online Mirror Descent or
Follow-The-Regularized-Leader and their variants. Particular attention is given
to the issue of tuning the parameters of the algorithms and learning in
unbounded domains, through adaptive and parameter-free online learning
algorithms. Non-convex losses are dealt through convex surrogate losses and
through randomization. The bandit setting is also briefly discussed, touching
on the problem of adversarial and stochastic multi-armed bandits. These notes
do not require prior knowledge of convex analysis and all the required
mathematical tools are rigorously explained. Moreover, all the proofs have been
carefully chosen to be as simple and as short as possible.
Comment: Fixed more typos, added more history bits, added local norms bounds
for OMD and FTRL.
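The monograph's framing above can be sketched in a few lines: projected online gradient descent is Online Mirror Descent with the Euclidean regularizer, run here on linear losses over the unit ball; the example and its stepsize choice are illustrative.

```python
import numpy as np

def project_ball(x, radius):
    n = np.linalg.norm(x)
    return x if n <= radius else x * (radius / n)

def ogd(loss_grads, x0, radius, eta):
    """Projected online gradient descent: OMD with the squared Euclidean
    norm as regularizer. Returns the iterate played at each round."""
    x = np.asarray(x0, dtype=float)
    iterates = []
    for g in loss_grads:
        iterates.append(x.copy())
        x = project_ball(x - eta * g, radius)
    return iterates

# Toy usage: linear losses <g_t, x> on the unit ball; compare cumulative
# loss against the best fixed comparator in hindsight.
rng = np.random.default_rng(0)
T, d = 400, 3
gs = [rng.standard_normal(d) for _ in range(T)]
xs = ogd(gs, x0=np.zeros(d), radius=1.0, eta=1.0 / np.sqrt(T))
alg_loss = sum(float(g @ x) for g, x in zip(gs, xs))
best_loss = -float(np.linalg.norm(sum(gs)))  # argmin over the ball: -sum/||sum||
regret = alg_loss - best_loss
```

With stepsize proportional to 1/sqrt(T), the standard analysis bounds the regret by O(sqrt(T)) times the diameter and gradient-norm constants, which the toy run stays well within.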