Hypocoercivity properties of adaptive Langevin dynamics
Adaptive Langevin dynamics is a method for sampling the Boltzmann-Gibbs distribution at prescribed temperature in cases where the potential gradient is subject to stochastic perturbation of unknown magnitude. The method replaces the friction in underdamped Langevin dynamics with a dynamical variable, updated according to a negative feedback loop control law as in the Nosé-Hoover thermostat. Using a hypocoercivity analysis we show that the law of adaptive Langevin dynamics converges exponentially fast to the stationary distribution, with a rate that can be quantified in terms of the key parameters of the dynamics. This allows us in particular to obtain a central limit theorem for the time averages computed along a stochastic path. Our theoretical findings are illustrated by numerical simulations involving classification of the MNIST data set of handwritten digits using Bayesian logistic regression.
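The feedback mechanism described in this abstract can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes a one-dimensional harmonic potential U(q) = q^2/2, unit mass, a plain Euler-Maruyama discretization, and hypothetical parameter names (`mu` for the thermostat coupling, `grad_noise` for the perturbation magnitude, which the sampler never uses except to corrupt the gradient).

```python
import math
import random


def adaptive_langevin(n_steps=50_000, h=0.01, kT=1.0, mu=1.0,
                      sigma=1.0, grad_noise=0.5, seed=0):
    """Euler-Maruyama sketch of adaptive Langevin dynamics in 1D.

    Assumptions (illustrative only): U(q) = q**2 / 2, unit mass,
    Gaussian gradient noise. Returns (mean of p**2, final friction xi).
    """
    rng = random.Random(seed)
    q, p, xi = 0.0, 0.0, 1.0
    sum_p2 = 0.0
    for _ in range(n_steps):
        # Gradient of U, corrupted by a random perturbation.
        noisy_grad = q + grad_noise * rng.gauss(0.0, 1.0)
        # Momentum update: force, variable friction xi, and added noise.
        p += h * (-noisy_grad - xi * p) + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
        q += h * p
        # Negative feedback: xi rises when p**2 exceeds kT and falls otherwise.
        xi += h * (p * p - kT) / mu
        sum_p2 += p * p
    return sum_p2 / n_steps, xi


if __name__ == "__main__":
    mean_p2, xi = adaptive_langevin()
    print(mean_p2, xi)
```

In this sketch the friction `xi` is never tuned by hand. Summing the `xi` update over the run shows that the time average of `p**2` differs from `kT` only by `mu * (xi_final - xi_initial) / (n_steps * h)`, so it should approach the target temperature as long as `xi` stays bounded.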
Affine Invariant Covariance Estimation for Heavy-Tailed Distributions
In this work we provide an estimator for the covariance matrix of a
heavy-tailed multivariate distribution. We prove that the proposed estimator
admits an affine-invariant bound of the form
in high probability, where is the
unknown covariance matrix, and is the positive semidefinite
order on symmetric matrices. The result only requires the existence of
fourth-order moments, and allows for where is a measure of kurtosis of the
distribution, is the dimensionality of the space, is the sample size,
and is the desired confidence level. More generally, we can allow
for regularization with level , then gets replaced with the
degrees of freedom number. Denoting the condition
number of , the computational cost of the novel estimator is , which is comparable to the cost of the
sample covariance estimator in the statistically interesting regime .
We consider applications of our estimator to eigenvalue estimation with
relative error, and to ridge regression with heavy-tailed random design.
S-GBDT: Frugal Differentially Private Gradient Boosting Decision Trees
Privacy-preserving learning of gradient boosting decision trees (GBDT) has
the potential for strong utility-privacy tradeoffs for tabular data, such as
census data or medical meta data: classical GBDT learners can extract
non-linear patterns from small sized datasets. The state-of-the-art notion for
provable privacy-properties is differential privacy, which requires that the
impact of single data points is limited and deniable. We introduce a novel
differentially private GBDT learner and utilize four main techniques to improve
the utility-privacy tradeoff. (1) We use an improved noise scaling approach
with tighter accounting of privacy leakage of a decision tree leaf compared to
prior work, resulting in noise that in expectation scales with , for
data points. (2) We integrate individual Rényi filters into our method to
learn from data points that have been underutilized during an iterative
training process, which -- potentially of independent interest -- results in a
natural yet effective insight to learning streams of non-i.i.d. data. (3) We
incorporate the concept of random decision tree splits to concentrate privacy
budget on learning leaves. (4) We deploy subsampling for privacy amplification.
Our evaluation shows for the Abalone dataset ( training data points) a
-score of for , which the closest prior work only
achieved for . On the Adult dataset ( training data
points) we achieve test error of for which the
closest prior work only achieved for . For the Abalone dataset
for we achieve -score of which is very close to
the -score of for the nonprivate version of GBDT. For the Adult
dataset for we achieve test error which is very
close to the test error of the nonprivate version of GBDT.
Comment: The first two authors contributed equally to this work.
Evolution and Competition of Block Copolymer Nanoparticles
Nanoparticle structures formed in a mixture of diblock copolymer and solvent are investigated using a three-phase density functional model and its sharp interface approximation. A wide variety of equilibria described by localized domain patterns are quantified both numerically and analytically. Competition among multiple particles is shown to occur through mass diffusion driven by differences in chemical potential, which may or may not lead to Ostwald ripening behavior. Late stage rigid body dynamics is shown to result from interaction through dipolar fields, leading to orientational alignment and long-range attraction. NSF [DMS-1514689].