2 research outputs found
Provably Robust Blackbox Optimization for Reinforcement Learning
Interest in derivative-free optimization (DFO) and "evolutionary strategies"
(ES) has recently surged in the Reinforcement Learning (RL) community, with
growing evidence that they can match state-of-the-art methods for policy
optimization problems in robotics. However, it is well known that DFO methods
suffer from prohibitively high sampling complexity. They can also be very
sensitive to noisy rewards and stochastic dynamics. In this paper, we propose a
new class of algorithms, called Robust Blackbox Optimization (RBO). Remarkably,
even if up to of all the measurements are arbitrarily corrupted, RBO can
provably recover gradients to high accuracy. RBO relies on learning gradient
flows using robust regression methods to enable off-policy updates. On several
MuJoCo robot control tasks, when all other RL approaches collapse in the
presence of adversarial noise, RBO is able to train policies effectively. We
also show that RBO can be applied to legged locomotion tasks including path
tracking for quadruped robots.
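
The abstract's central claim, recovering a gradient by robust regression when a fraction of the measurements is arbitrarily corrupted, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy objective `f`, Gaussian perturbations, and a least-absolute-deviations fit solved by iteratively reweighted least squares (the helper name `lad_gradient`, the 20% corruption rate, and all constants are illustrative choices, not values from the paper).

```python
import numpy as np

def lad_gradient(Z, y, n_iter=100, eps=1e-8):
    """Fit y ~ Z @ g by least absolute deviations, using IRLS.

    Z: (n, d) perturbation directions; y: (n,) measured changes in f.
    Hypothetical helper: the paper's own robust-regression solver may differ.
    """
    # Start from the ordinary least-squares solution.
    g, *_ = np.linalg.lstsq(Z, y, rcond=None)
    for _ in range(n_iter):
        r = y - Z @ g
        # Square roots of the IRLS weights 1/|r_i|, clipped to avoid division by zero.
        w = 1.0 / np.sqrt(np.maximum(np.abs(r), eps))
        g, *_ = np.linalg.lstsq(Z * w[:, None], y * w, rcond=None)
    return g

def f(x):
    # Toy blackbox objective (assumption); its true gradient is 2 * x.
    return float(np.sum(x ** 2))

rng = np.random.default_rng(0)
x = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
true_grad = 2.0 * x

sigma, n = 0.01, 200
Z = sigma * rng.standard_normal((n, x.size))
y = np.array([f(x + z) - f(x) for z in Z])

# Overwrite 20% of the measurements with arbitrary values.
bad = rng.choice(n, size=n // 5, replace=False)
y[bad] = rng.normal(scale=10.0, size=bad.size)

g_ls, *_ = np.linalg.lstsq(Z, y, rcond=None)   # non-robust baseline
g_lad = lad_gradient(Z, y)                     # robust estimate

err_ls = np.linalg.norm(g_ls - true_grad)
err_robust = np.linalg.norm(g_lad - true_grad)
print(f"least-squares error: {err_ls:.3f}, robust error: {err_robust:.3f}")
```

In this toy setting the least-squares estimate is dominated by the corrupted rows, while the absolute-deviation fit stays close to the true gradient because the clean measurements outvote the corrupted ones.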
ISLET: Fast and Optimal Low-rank Tensor Regression via Importance Sketching
In this paper, we develop a novel procedure for low-rank tensor regression,
namely Importance Sketching Low-rank Estimation for Tensors (ISLET). The
central idea behind ISLET is importance sketching, i.e., carefully designed sketches
based on both the responses and low-dimensional structure of the parameter of
interest. We show that the proposed method is sharply minimax optimal in terms
of the mean-squared error under low-rank Tucker assumptions and under
randomized Gaussian ensemble design. In addition, if a tensor is low-rank with
group sparsity, our procedure also achieves minimax optimality. Further, we
show through numerical study that ISLET achieves mean-squared error
comparable to or better than existing state-of-the-art methods while
having substantial storage and run-time advantages including capabilities for
parallel and distributed computing. In particular, our procedure performs
reliable estimation with tensors of dimension and is or
orders of magnitude faster than baseline methods.
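
The idea of building sketches from both the responses and the low-rank structure can be sketched in a few lines. The code below is a simplified, core-only illustration under stated assumptions, not the ISLET procedure itself: it assumes an order-3 tensor with Tucker rank (2, 2, 2), an i.i.d. Gaussian design, subspaces estimated by a truncated SVD of each unfolding of the response-weighted covariate average, and a single least-squares fit on the projected covariates. The function names (`unfold`, `sketch_regression`) and all sizes are hypothetical.

```python
import numpy as np

def unfold(T, mode):
    """Matricize T so that rows index the chosen mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def sketch_regression(X, y, ranks):
    """Core-only low-rank tensor regression (illustrative simplification).

    X: (n, p1, p2, p3) covariate tensors; y: (n,) responses.
    Returns the low-rank estimate and the initial response-weighted average.
    """
    n = X.shape[0]
    # Step 1: response-weighted average of covariates, an unbiased but noisy
    # estimate of the parameter tensor under a standard Gaussian design.
    A_init = np.tensordot(y, X, axes=([0], [0])) / n
    # Sketching directions: leading left singular vectors of each unfolding.
    U = [np.linalg.svd(unfold(A_init, k), full_matrices=False)[0][:, :r]
         for k, r in enumerate(ranks)]
    # Step 2: project each covariate onto the estimated subspaces and
    # regress the responses on the resulting low-dimensional sketches.
    S = np.einsum('nijk,ia,jb,kc->nabc', X, U[0], U[1], U[2], optimize=True)
    core, *_ = np.linalg.lstsq(S.reshape(n, -1), y, rcond=None)
    core = core.reshape(ranks)
    # Step 3: map the fitted core back to the original dimensions.
    A_hat = np.einsum('abc,ia,jb,kc->ijk', core, U[0], U[1], U[2])
    return A_hat, A_init

rng = np.random.default_rng(0)
p, r, n = 8, 2, 4000

# Synthetic ground truth with orthonormal factors and a fixed core (assumption).
factors = [np.linalg.qr(rng.standard_normal((p, r)))[0] for _ in range(3)]
core_true = np.zeros((r, r, r))
core_true[0, 0, 0], core_true[1, 1, 1] = 3.0, 2.0
A_true = np.einsum('abc,ia,jb,kc->ijk', core_true, *factors)

X = rng.standard_normal((n, p, p, p))
y = np.tensordot(X, A_true, axes=3) + 0.1 * rng.standard_normal(n)

A_hat, A_init = sketch_regression(X, y, (r, r, r))

def rel_err(E):
    return np.linalg.norm(E - A_true) / np.linalg.norm(A_true)

err_sketch, err_init = rel_err(A_hat), rel_err(A_init)
print(f"initial estimate error: {err_init:.3f}, sketched estimate error: {err_sketch:.3f}")
```

The regression in step 2 has only 8 unknowns instead of 512, which is where the storage and run-time savings described in the abstract would come from; any bias left by imperfect subspace estimates is not corrected in this simplified version.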