33 research outputs found
Oracle Inequalities for Convex Loss Functions with Non-Linear Targets
This paper considers penalized empirical loss minimization of convex loss
functions with unknown non-linear target functions. Using the elastic net
penalty, we establish a finite-sample oracle inequality which bounds the loss
of our estimator from above with high probability. If the unknown target is
linear, this inequality also provides an upper bound on the estimation error
of the estimated parameter vector. These results are new and generalize the
econometrics and statistics literature. Next, we use the non-asymptotic results
to show that the excess loss of our estimator is asymptotically of the same
order as that of the oracle. If the target is linear, we give sufficient
conditions for consistency of the estimated parameter vector. We then briefly
discuss how a thresholded version of our estimator can be used to perform
consistent variable selection. We give two examples of loss functions covered
by our framework, show how penalized nonparametric series estimation is
contained as a special case, and provide a finite-sample upper bound on the
mean square error of the elastic net series estimator.
Comment: 44 pages
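The elastic-net step described in this abstract can be sketched numerically. Below is a minimal proximal-gradient solver for the squared loss with an elastic net penalty, followed by the thresholding step for variable selection. The penalty levels, iteration count, and selection cutoff are hypothetical illustrative choices, not the paper's tuning; the paper itself covers general convex losses, of which the squared loss here is one example.

```python
import numpy as np

def elastic_net(X, y, lam1=0.1, lam2=0.1, n_iter=500):
    """Proximal gradient descent for
    (1/n)||y - Xb||^2 + lam1*||b||_1 + lam2*||b||^2.
    Minimal sketch for the squared loss; not the paper's general algorithm."""
    n, p = X.shape
    # Step size from the Lipschitz constant of the smooth part.
    lr = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2 / n + 2.0 * lam2)
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = -2.0 / n * X.T @ (y - X @ b) + 2.0 * lam2 * b
        z = b - lr * grad
        # Soft-thresholding = proximal operator of the l1 penalty.
        b = np.sign(z) * np.maximum(np.abs(z) - lr * lam1, 0.0)
    return b

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]           # sparse linear target
y = X @ beta + 0.1 * rng.standard_normal(n)

b_hat = elastic_net(X, y)
# Thresholded estimator for variable selection (cutoff chosen ad hoc).
selected = np.flatnonzero(np.abs(b_hat) > 0.5)
```

Because the ridge part makes the objective strongly convex, plain proximal gradient converges geometrically here; the thresholding step then discards coefficients the penalty has already shrunk toward zero.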
Power in High-Dimensional Testing Problems
Fan et al. (2015) recently introduced a remarkable method for increasing
asymptotic power of tests in high-dimensional testing problems. If applicable
to a given test, their power enhancement principle leads to an improved test
that has the same asymptotic size, uniformly non-inferior asymptotic power, and
is consistent against a strictly broader range of alternatives than the
initially given test. We study under which conditions this method can be
applied and show the following: In asymptotic regimes where the dimensionality
of the parameter space is fixed as sample size increases, there often exist
tests that cannot be further improved with the power enhancement principle.
However, when the dimensionality of the parameter space increases sufficiently
slowly with sample size and a marginal local asymptotic normality (LAN)
condition is satisfied, every test with asymptotic size smaller than one can be
improved with the power enhancement principle. While the marginal LAN condition
alone does not allow one to extend the latter statement to all rates at which
the dimensionality increases with sample size, we give sufficient conditions
under which this is the case.
Comment: 27 pages
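As a concrete illustration of the principle under study, here is a sketch of a power-enhancement construction in the spirit of Fan et al. (2015): a screening component that is exactly zero with high probability under the null is added to a base statistic, so the asymptotic size is unchanged while power against sparse, strong alternatives improves. The specific threshold and statistics below are illustrative assumptions, not the authors' exact construction.

```python
import numpy as np

def screening_component(x_bar, n, d):
    """Power-enhancement component J0 >= 0: zero with high probability
    under the null, strictly positive under sparse strong alternatives."""
    delta = 2.0 * np.sqrt(np.log(d) / n)   # high threshold ~ sqrt(log d / n)
    return np.sqrt(d) * np.sum(n * x_bar**2 * (np.abs(x_bar) > delta))

def base_statistic(x_bar, n, d):
    """Standardised sum-of-squares statistic for H0: mean = 0."""
    return (n * np.sum(x_bar**2) - d) / np.sqrt(2.0 * d)

rng = np.random.default_rng(1)
n, d = 400, 1000
x_null = rng.standard_normal((n, d)).mean(axis=0)        # H0 true
mu = np.zeros(d)
mu[0] = 0.5                                              # sparse alternative
x_alt = (rng.standard_normal((n, d)) + mu).mean(axis=0)

j0_null = screening_component(x_null, n, d)              # typically exactly 0
j0_alt = screening_component(x_alt, n, d)                # strictly positive
enhanced_null = base_statistic(x_null, n, d) + j0_null
enhanced_alt = base_statistic(x_alt, n, d) + j0_alt
```

Under the null no coordinate of the sample mean exceeds the high threshold, so the enhanced statistic coincides with the base one; under the sparse alternative the screening term fires and dominates.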
Moment-dependent phase transitions in high-dimensional Gaussian approximations
High-dimensional central limit theorems have been intensively studied with
most focus being on the case where the data is sub-Gaussian or sub-exponential.
However, heavier tails are omnipresent in practice. In this article, we study
the critical growth rates of dimension below which Gaussian approximations
are asymptotically valid but beyond which they are not. We are particularly
interested in how these thresholds depend on the number of moments that the
observations possess. For every , we construct i.i.d. random
vectors in , the entries of which
are independent and have a common distribution (independent of and )
with finite th absolute moment, and such that the following holds: if there
exists an such that , then the Gaussian approximation error (GAE) satisfies where . On the other hand, a result in
Chernozhukov et al. (2023a) implies that the left-hand side above is zero if
just for some . In
this sense, there is a moment-dependent phase transition at the threshold
above which the limiting GAE jumps from zero to one.Comment: After uploading the first version to arXiv, we became aware of Zhang
and Wu, (2017), Annals of Statistics. In their Remark 2, with the same method
of proof, they established a result which is essentially identical to
Equation 5 of our Theorem 2.1. We thank Moritz Jirak for pointing this out to
us. This will be incorporated in the main text in the next version of the
paper.
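The phase-transition phenomenon described above can be probed by simulation. The sketch below Monte Carlo-estimates the Kolmogorov distance between the distribution of the maximum of normalized coordinate sums with heavy-tailed entries and that of the corresponding Gaussian maximum. The symmetrized-Pareto distribution, sample sizes, and replication count are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(2)

def kolmogorov_distance(a, b):
    """Max absolute difference between the empirical CDFs of samples a, b."""
    grid = np.sort(np.concatenate([a, b]))
    fa = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    fb = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return np.max(np.abs(fa - fb))

def gae_estimate(n, d, m, reps=1000):
    """Monte Carlo estimate of the Gaussian approximation error for the max
    of normalised sums of i.i.d. symmetrised-Pareto entries with tail index m
    (absolute moments are finite exactly for orders below m)."""
    scale = np.sqrt(m / (m - 2.0))        # standardise entries to unit variance
    max_s = np.empty(reps)
    max_z = np.empty(reps)
    for r in range(reps):
        x = (rng.pareto(m, (n, d)) + 1.0) * rng.choice([-1.0, 1.0], (n, d))
        s = x.sum(axis=0) / (np.sqrt(n) * scale)
        max_s[r] = s.max()                # heavy-tailed max of normalised sums
        max_z[r] = rng.standard_normal(d).max()   # Gaussian counterpart
    return kolmogorov_distance(max_s, max_z)

# Illustrative regime where d is large relative to n^(m/2 - 1).
gae = gae_estimate(n=50, d=50, m=2.5)
```

Varying `d` relative to `n**(m/2 - 1)` in such a simulation is one way to visualize the moment-dependent threshold the abstract describes, though finite-sample estimates only suggest, and cannot prove, the limiting jump.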
Regularizing Discrimination in Optimal Policy Learning with Distributional Targets
A decision maker typically (i) incorporates training data to learn about the
relative effectiveness of the treatments, and (ii) chooses an implementation
mechanism that implies an "optimal" predicted outcome distribution according to
some target functional. A discrimination-aware decision maker, however, may
not be satisfied with achieving this optimality at the cost of heavily
discriminating against subgroups of the population, in the sense that the
outcome distribution in a subgroup deviates strongly from the overall optimal
outcome distribution. We study a framework that allows the decision maker to
penalize for such deviations, while allowing for a wide range of target
functionals and discrimination measures to be employed. We establish regret and
consistency guarantees for empirical success policies with data-driven tuning
parameters, and provide numerical results. Furthermore, we briefly illustrate
the methods in two empirical settings.
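A minimal sketch of such a discrimination-penalized empirical rule, under assumed choices that the abstract leaves open: the mean outcome as the target functional, the Kolmogorov distance between a subgroup's outcome distribution and the overall one as the discrimination measure, and a fixed rather than data-driven tuning parameter.

```python
import numpy as np

def ks_distance(a, b):
    """Kolmogorov distance between the empirical CDFs of samples a and b."""
    grid = np.sort(np.concatenate([a, b]))
    fa = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    fb = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return np.max(np.abs(fa - fb))

def penalized_value(y, group, lam):
    """Mean outcome (target functional) minus lam times the worst subgroup
    deviation from the overall outcome distribution."""
    penalty = max(ks_distance(y[group == g], y) for g in np.unique(group))
    return y.mean() - lam * penalty

rng = np.random.default_rng(3)
n = 500
group = rng.integers(0, 2, n)
# Treatment 0: higher overall mean, but outcomes differ sharply by subgroup.
y0 = rng.normal(1.0 + 1.5 * group, 1.0)
# Treatment 1: slightly lower mean, homogeneous across subgroups.
y1 = rng.normal(1.5, 1.0, n)

lam = 2.0
best_penalized = int(np.argmax([penalized_value(y0, group, lam),
                                penalized_value(y1, group, lam)]))
best_unpenalized = int(np.argmax([y0.mean(), y1.mean()]))
```

With the penalty switched off the rule picks the higher-mean but subgroup-disparate treatment; with it on, the homogeneous treatment wins, which is the trade-off the framework is designed to expose.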