A Bernstein-von Mises theorem in the nonparametric right-censoring model
In the recent Bayesian nonparametric literature, many examples have been
reported in which Bayesian estimators and posterior distributions do not
achieve the optimal convergence rate, indicating that the Bernstein-von
Mises theorem does not hold. In this article, we give a positive result in
this direction by showing that the Bernstein-von Mises theorem holds in
survival models for a large class of prior processes neutral to the right. We
also show that, for an arbitrarily given convergence rate n^{-\alpha} with
0<\alpha \leq 1/2, a prior process neutral to the right can be chosen so that
its posterior distribution achieves the convergence rate n^{-\alpha}.

Comment: Published by the Institute of Mathematical Statistics (http://www.imstat.org) in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/00905360400000052
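For orientation, a Bernstein-von Mises statement in this survival setting has roughly the following schematic shape. The rendering below is ours: the paper's exact centering estimator, functional, and limit covariance may differ.

```latex
% Schematic BvM statement (illustrative, not the paper's exact theorem).
% F is the unknown distribution function and \hat F_n an efficient
% frequentist estimator (e.g., Kaplan--Meier in the right-censoring model).
\[
  \sqrt{n}\,\bigl(F - \hat F_n\bigr) \,\Big|\, X_1, \dots, X_n
  \;\rightsquigarrow\; \mathbb{G}
  \quad \text{in probability,}
\]
% where \mathbb{G} is the centered Gaussian process arising as the
% frequentist limit of \sqrt{n}(\hat F_n - F_0). When such a statement
% holds, Bayesian credible bands are asymptotically valid frequentist
% confidence bands.
```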
Calibrating nonconvex penalized regression in ultra-high dimension
We investigate high-dimensional nonconvex penalized regression, where the
number of covariates may grow at an exponential rate. Although recent
asymptotic theory established that there exists a local minimum possessing the
oracle property under general conditions, it is still largely an open problem
how to identify the oracle estimator among potentially multiple local minima.
There are two main obstacles: (1) due to the presence of multiple minima, the
solution path is nonunique and is not guaranteed to contain the oracle
estimator; (2) even if a solution path is known to contain the oracle
estimator, the optimal tuning parameter depends on many unknown factors and is
hard to estimate. To address these two challenging issues, we first prove that
an easy-to-calculate calibrated CCCP algorithm produces a consistent solution
path which contains the oracle estimator with probability approaching one.
Furthermore, we propose a high-dimensional BIC criterion and show that it can
be applied to the solution path to select the optimal tuning parameter which
asymptotically identifies the oracle estimator. The theory for a general class
of nonconvex penalties in the ultra-high dimensional setup is established when
the random errors follow a sub-Gaussian distribution. Monte Carlo studies
confirm that the calibrated CCCP algorithm combined with the proposed
high-dimensional BIC has desirable performance in identifying the underlying
sparsity pattern for high-dimensional data analysis.

Comment: Published at http://dx.doi.org/10.1214/13-AOS1159 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
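The CCCP step described here reduces to iteratively reweighted lasso fits, which is easy to sketch. Below is a minimal illustrative Python implementation of that idea for the SCAD penalty; the function names, the plain coordinate-descent solver, and the default a = 3.7 are our assumptions, not the paper's code.

```python
# Illustrative CCCP/LLA loop for SCAD-penalized least squares: each
# iteration linearizes the concave part of the penalty at the current
# estimate, yielding a weighted lasso problem solved by coordinate descent.
import numpy as np

def scad_deriv(t, lam, a=3.7):
    """Derivative p'_lam(t) of the SCAD penalty, evaluated at |t|."""
    t = np.abs(t)
    return np.where(t <= lam, lam,
                    np.maximum(a * lam - t, 0.0) / (a - 1.0))

def weighted_lasso(X, y, w, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + sum_j w_j |b_j|."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                       # residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]        # remove feature j's contribution
            z = X[:, j] @ r / n        # partial correlation with residual
            b[j] = np.sign(z) * max(abs(z) - w[j], 0.0) / col_sq[j]
            r -= X[:, j] * b[j]
    return b

def cccp_scad(X, y, lam, n_outer=5):
    """CCCP-style loop: start at the lasso solution, then reweight."""
    p = X.shape[1]
    b = weighted_lasso(X, y, np.full(p, lam))   # lasso initializer
    for _ in range(n_outer):
        w = scad_deriv(b, lam)                  # linearized SCAD weights
        b = weighted_lasso(X, y, w)
    return b
```

The high-dimensional BIC is then evaluated along the resulting tuning-parameter path; criteria of this kind typically score each fit as log(RSS/n) plus a model-size term growing with log(p), though the exact penalty sequence should be taken from the paper.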
Within-group fairness: A guidance for more sound between-group fairness
As they have a vital effect on social decision-making, AI algorithms should not only be accurate but also should not be unfair toward certain sensitive groups (e.g., non-white people, women). Various AI algorithms have been specially designed to ensure that trained AI models are fair between sensitive groups. In this paper, we raise a new issue: between-group fair AI models may still treat individuals in the same sensitive group unfairly. We introduce a new concept of fairness, called within-group fairness, which requires that AI models be fair to individuals in the same sensitive group as well as to those in different sensitive groups. We materialize the concept of within-group fairness by proposing corresponding mathematical definitions and by developing learning algorithms that control within-group fairness and between-group fairness simultaneously. Numerical studies show that the proposed learning algorithms improve within-group fairness without sacrificing either accuracy or between-group fairness.
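To make the two notions concrete, the sketch below contrasts an illustrative between-group metric (a demographic-parity gap) with one possible way to quantify within-group unfairness: how often a fairness-constrained model reverses the relative order of two individuals from the same group compared with an unconstrained baseline. Both metrics are our illustrative proxies; the paper's formal definitions may differ.

```python
# Illustrative fairness metrics only; not the paper's formal definitions.
import numpy as np

def between_group_gap(scores, group, threshold=0.5):
    """Demographic-parity gap: difference in positive rates across groups."""
    g0, g1 = scores[group == 0], scores[group == 1]
    return abs((g0 > threshold).mean() - (g1 > threshold).mean())

def within_group_inversions(fair_scores, base_scores, group):
    """Fraction of same-group pairs whose ranking the fair model reverses
    relative to an unconstrained baseline model."""
    total = inverted = 0
    for g in np.unique(group):
        s_f, s_b = fair_scores[group == g], base_scores[group == g]
        for i in range(len(s_f)):
            for j in range(i + 1, len(s_f)):
                total += 1
                if (s_f[i] - s_f[j]) * (s_b[i] - s_b[j]) < 0:
                    inverted += 1
    return inverted / max(total, 1)
```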
Smooth function approximation by deep neural networks with general activation functions
There has been a growing interest in the expressivity of deep neural networks. However, most of the existing work on this topic focuses only on specific activation functions such as the ReLU or sigmoid. In this paper, we investigate the approximation ability of deep neural networks with a broad class of activation functions that includes most of the frequently used ones. We derive the required depth, width and sparsity of a deep neural network to approximate any Hölder smooth function up to a given approximation error for this large class of activation functions. Based on our approximation error analysis, we derive the minimax optimality of deep neural network estimators with general activation functions in both regression and classification problems.

Comment: 24 pages
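Results of this kind typically quantify the network size needed for a target sup-norm error. The shape below follows well-known ReLU-network approximation bounds and is indicative only; the paper's constants and logarithmic factors for general activations may differ.

```latex
% Indicative shape of such results (not the paper's exact statement):
% for f in a Holder ball \mathcal{H}^{\beta}([0,1]^d, K) and any
% \epsilon > 0, there is a network f_{\theta} with
%   depth    L = O(\log(1/\epsilon)),
%   width    N = O(\epsilon^{-d/\beta}),
%   sparsity S = O(\epsilon^{-d/\beta} \log(1/\epsilon)),
% such that
\[
  \sup_{x \in [0,1]^d} \bigl| f(x) - f_{\theta}(x) \bigr| \le \epsilon .
\]
% Choosing \epsilon_n \asymp n^{-\beta/(2\beta + d)} and combining with a
% standard oracle inequality then yields the minimax regression rate
% n^{-2\beta/(2\beta + d)}.
```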
Improving Performance of Semi-Supervised Learning by Adversarial Attacks
Semi-supervised learning (SSL) is a setup built upon the realistic assumption that access to a large amount of labeled data is difficult. In this study, we present a generalized framework, named SCAR (Selecting Clean samples with Adversarial Robustness), for improving the performance of recent SSL algorithms. By adversarially attacking pre-trained models with semi-supervision, our framework shows substantial advances in classifying images. We describe how adversarial attacks successfully select high-confidence unlabeled data to be labeled with the current predictions. On CIFAR10, three recent SSL algorithms combined with SCAR achieve significantly improved image classification.

Comment: 4 pages
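A minimal sketch of the selection step as the abstract describes it: pseudo-label unlabeled images with the current model, attack them, and keep only the samples whose predicted class survives the attack. The FGSM attack, the epsilon value, and the function names below are our illustrative choices, not necessarily SCAR's.

```python
# Illustrative PyTorch sketch: select unlabeled samples whose prediction
# is robust to a one-step FGSM attack, and pseudo-label them.
import torch
import torch.nn.functional as F

def select_robust_samples(model, x_unlabeled, eps=8 / 255):
    model.eval()
    x = x_unlabeled.clone().requires_grad_(True)
    logits = model(x)
    pseudo = logits.argmax(dim=1)               # current predictions
    loss = F.cross_entropy(logits, pseudo)
    loss.backward()
    # FGSM perturbation in the direction that increases the loss
    x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()
    with torch.no_grad():
        adv_pred = model(x_adv).argmax(dim=1)
    keep = adv_pred == pseudo                   # robust -> treated as clean
    return x_unlabeled[keep], pseudo[keep]
```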