A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)
This work provides a simplified proof of the statistical minimax optimality of (iterate-averaged) stochastic gradient descent (SGD) for the special case of least squares. The result is obtained by analyzing SGD as a stochastic process and sharply characterizing the stationary covariance matrix of this process. The finite-rate optimality characterization captures the constant factors and addresses model misspecification.
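The abstract above concerns iterate-averaged (Polyak-Ruppert averaged) SGD for least squares. The following is a minimal sketch of that estimator on synthetic data; it is not the paper's code, and the function name, step size, and toy data are illustrative assumptions.

# Minimal sketch: iterate-averaged SGD for least squares (illustrative only).
import numpy as np

def averaged_sgd_least_squares(X, y, step_size=0.05, n_passes=1):
    """Run single-sample SGD on the least-squares objective and return the
    running average of the iterates (Polyak-Ruppert averaging)."""
    n, d = X.shape
    w = np.zeros(d)
    w_avg = np.zeros(d)
    t = 0
    for _ in range(n_passes):
        for i in np.random.permutation(n):
            # Stochastic gradient of 0.5 * (x_i^T w - y_i)^2
            grad = (X[i] @ w - y[i]) * X[i]
            w = w - step_size * grad
            t += 1
            # Update the average of all iterates seen so far
            w_avg += (w - w_avg) / t
    return w_avg

# Toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
w_star = rng.normal(size=5)
y = X @ w_star + 0.1 * rng.normal(size=1000)
w_hat = averaged_sgd_least_squares(X, y)
print(np.linalg.norm(w_hat - w_star))

Averaging the iterates, rather than returning the last one, is what yields the constant-step-size behavior whose stationary covariance the paper analyzes.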
Robust Aggregation for Federated Learning
We present a robust aggregation approach that makes federated learning robust to settings where a fraction of the devices may be sending corrupted updates to the server. The proposed approach relies on a robust secure aggregation oracle based on the geometric median, which returns a robust aggregate using a constant number of calls to a regular, non-robust secure average oracle. The robust aggregation oracle is privacy-preserving, similar to the secure average oracle it builds upon. We provide experimental results for the proposed approach with linear models and deep networks on two tasks in computer vision and natural language processing. The robust aggregation approach is agnostic to the level of corruption; it outperforms classical aggregation in terms of robustness when the level of corruption is high, while remaining competitive in the low-corruption regime.
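To illustrate the kind of geometric-median aggregation described above, here is a minimal sketch using a Weiszfeld-style iteration, where each step is a weighted average of the client updates (the operation a secure average oracle would provide; computed in the clear here). The function name, iteration count, and toy data are illustrative assumptions, not the paper's implementation.

# Minimal sketch: geometric-median aggregation via Weiszfeld iterations
# (illustrative only; each iteration is one weighted-average computation).
import numpy as np

def geometric_median(updates, n_iters=5, eps=1e-6):
    """Approximate the geometric median of client updates (rows of `updates`)."""
    z = updates.mean(axis=0)  # initialize with the plain average
    for _ in range(n_iters):
        # Weights are inversely proportional to distance from the current estimate
        dists = np.maximum(np.linalg.norm(updates - z, axis=1), eps)
        weights = 1.0 / dists
        weights /= weights.sum()
        # One weighted average per iteration -- the role of the average oracle
        z = weights @ updates
    return z

# Toy usage: 8 honest updates near w_star, 2 grossly corrupted updates
rng = np.random.default_rng(0)
w_star = np.ones(10)
honest = w_star + 0.01 * rng.normal(size=(8, 10))
corrupted = 100.0 * rng.normal(size=(2, 10))
updates = np.vstack([honest, corrupted])
print(np.linalg.norm(updates.mean(axis=0) - w_star))       # plain mean is pulled away
print(np.linalg.norm(geometric_median(updates) - w_star))  # geometric median stays close

Because each Weiszfeld step only needs a weighted average of the updates, the number of calls to the underlying average oracle is fixed by the iteration count, independent of how many devices are corrupted.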