Stability and Deviation Optimal Risk Bounds with Convergence Rate $O(1/n)$
The sharpest known high probability generalization bounds for uniformly
stable algorithms (Feldman, Vondr\'{a}k, 2018, 2019), (Bousquet, Klochkov,
Zhivotovskiy, 2020) contain a generally inevitable sampling error term of order
$\Theta(1/\sqrt{n})$. When applied to excess risk bounds, this leads to
suboptimal results in several standard stochastic convex optimization problems.
We show that if the so-called Bernstein condition is satisfied, the term
$\Theta(1/\sqrt{n})$ can be avoided, and high probability excess risk bounds of
order up to $O(1/n)$ are possible via uniform stability. Using this result, we
show a high probability excess risk bound with the rate $O(\log n/n)$ for
strongly convex and Lipschitz losses valid for \emph{any} empirical risk
minimization method. This resolves a question of Shalev-Shwartz, Shamir,
Srebro, and Sridharan (2009). We discuss how $O(\log n/n)$ high probability
excess risk bounds are possible for projected gradient descent in the case of
strongly convex and Lipschitz losses without the usual smoothness assumption.
Comment: 12 pages; presented at NeurIPS
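For reference, the Bernstein condition invoked above is usually stated in the following standard form (generic notation, not necessarily the paper's): there exists a constant $B > 0$ such that
\[
\mathbb{E}\big[(\ell(w, z) - \ell(w^{*}, z))^{2}\big] \;\le\; B\,\big(R(w) - R(w^{*})\big) \quad \text{for all } w \in \mathcal{W},
\]
where $R(w) = \mathbb{E}_{z}[\ell(w, z)]$ is the population risk and $w^{*}$ minimizes $R$ over $\mathcal{W}$. For instance, $\lambda$-strongly convex and $L$-Lipschitz losses satisfy it with $B = 2L^{2}/\lambda$, which is how the strongly convex setting discussed above fits this framework.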
Non-Euclidean Differentially Private Stochastic Convex Optimization
Differentially private (DP) stochastic convex optimization (SCO) is a
fundamental problem, where the goal is to approximately minimize the population
risk with respect to a convex loss function, given a dataset of i.i.d. samples
from a distribution, while satisfying differential privacy with respect to the
dataset. Most of the existing works in the literature of private convex
optimization focus on the Euclidean (i.e., $\ell_2$) setting, where the loss is
assumed to be Lipschitz (and possibly smooth) w.r.t. the $\ell_2$ norm over a
constraint set with bounded $\ell_2$ diameter. Algorithms based on noisy
stochastic gradient descent (SGD) are known to attain the optimal excess risk
in this setting.
In this work, we conduct a systematic study of DP-SCO for $\ell_p$-setups.
For $p=1$, under a standard smoothness assumption, we give a new algorithm with
nearly optimal excess risk. This result also extends to general polyhedral
norms and feasible sets. For $p \in (1, 2)$, we give two new algorithms, whose
central building block is a novel privacy mechanism, which generalizes the
Gaussian mechanism. Moreover, we establish a lower bound on the excess risk for
this range of $p$, showing a necessary dependence on $\sqrt{d}$, where $d$ is
the dimension of the space. Our lower bound implies a sudden transition of the
excess risk at $p=1$, where the dependence on $d$ changes from logarithmic to
polynomial, resolving an open question in prior work [TTZ15]. For $p \in (2, \infty]$,
noisy SGD attains optimal excess risk in the low-dimensional regime; in
particular, this proves the optimality of noisy SGD for $p=\infty$. Our work
draws upon concepts from the geometry of normed spaces, such as the notions of
regularity, uniform convexity, and uniform smoothness.
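As a sketch of the Euclidean baseline referenced above, one step of Gaussian-mechanism noisy SGD takes the form (standard generic notation, not taken from the paper):
\[
w_{t+1} \;=\; \Pi_{\mathcal{W}}\big(w_{t} - \eta\,(\nabla \ell(w_{t}, z_{i_{t}}) + \xi_{t})\big), \qquad \xi_{t} \sim \mathcal{N}(0, \sigma^{2} I_{d}),
\]
where $\Pi_{\mathcal{W}}$ is the Euclidean projection onto the constraint set and the noise scale $\sigma$ is calibrated to the $\ell_2$-sensitivity of the gradients (via the Lipschitz constant) so that the composition over all iterations satisfies $(\varepsilon, \delta)$-differential privacy. The two algorithms for $p \in (1, 2)$ mentioned above build instead on the generalized privacy mechanism described in the abstract.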
A Full Characterization of Excess Risk via Empirical Risk Landscape
In this paper, we provide a unified analysis of the excess risk of the model
trained by a proper algorithm for both smooth convex and non-convex loss
functions. In contrast to existing bounds in the literature that depend on the
number of iteration steps, our bounds on the excess risk do not diverge with the
number of iterations. This underscores that, at least for smooth loss functions,
the excess risk can be bounded after training. To obtain these excess risk bounds,
we develop a technique based on algorithmic stability and non-asymptotic
characterization of the empirical risk landscape. The model obtained by a
proper algorithm is proved to generalize with this technique. Specifically, for
non-convex losses, the conclusion is obtained via this technique together with a
stability analysis of a constructed auxiliary algorithm. Combining this with
properties of the empirical risk landscape, we derive convergent upper bounds on
the excess risk in both the convex and non-convex regimes with the help of
classical optimization results.
Comment: 38 pages
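For orientation, stability-based analyses of this kind typically start from the standard excess risk decomposition (generic notation, not specific to this paper): for the returned model $\hat{w}$ and a population risk minimizer $w^{*}$,
\[
R(\hat{w}) - R(w^{*}) \;=\; \big[R(\hat{w}) - \hat{R}_{n}(\hat{w})\big] + \big[\hat{R}_{n}(\hat{w}) - \hat{R}_{n}(w^{*})\big] + \big[\hat{R}_{n}(w^{*}) - R(w^{*})\big],
\]
where $\hat{R}_{n}$ is the empirical risk and $R$ the population risk. Algorithmic stability controls the first (generalization) term, the empirical risk landscape together with classical optimization results controls the second (optimization) term, and the last term concentrates at a rate that does not depend on the number of iterations, consistent with bounds that do not diverge as training proceeds.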