
    Oracle Inequalities for Convex Loss Functions with Non-Linear Targets

    This paper considers penalized empirical loss minimization of convex loss functions with unknown non-linear target functions. Using the elastic net penalty, we establish a finite sample oracle inequality which bounds the loss of our estimator from above with high probability. If the unknown target is linear, this inequality also provides an upper bound on the estimation error of the estimated parameter vector. These results are new and generalize the econometrics and statistics literature. Next, we use the non-asymptotic results to show that the excess loss of our estimator is asymptotically of the same order as that of the oracle. If the target is linear, we give sufficient conditions for consistency of the estimated parameter vector. We then briefly discuss how a thresholded version of our estimator can be used to perform consistent variable selection. We give two examples of loss functions covered by our framework, show how penalized nonparametric series estimation is contained as a special case, and provide a finite sample upper bound on the mean square error of the elastic net series estimator.
    Comment: 44 pages
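As a rough, hypothetical illustration of the kind of estimator studied above (not the paper's construction, and specialized to squared loss), elastic net penalized loss minimization can be sketched with proximal gradient descent, where the L1 part of the penalty is handled by soft-thresholding:

```python
import numpy as np

def elastic_net(X, y, lam1=0.1, lam2=0.1, n_iter=500):
    """Proximal gradient (ISTA) sketch for the elastic net objective
    (1/2n)||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||_2^2."""
    n, p = X.shape
    b = np.zeros(p)
    # step size from the Lipschitz constant of the smooth part
    L = np.linalg.norm(X, 2) ** 2 / n + lam2
    for _ in range(n_iter):
        grad = -X.T @ (y - X @ b) / n + lam2 * b
        z = b - grad / L
        # soft-thresholding = proximal operator of the L1 penalty
        b = np.sign(z) * np.maximum(np.abs(z) - lam1 / L, 0.0)
    return b

# illustrative data with a sparse linear target
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
beta = np.zeros(10)
beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + 0.1 * rng.standard_normal(200)
b_hat = elastic_net(X, y, lam1=0.05, lam2=0.01)
```

The soft-thresholding step already sets small coordinates exactly to zero, which hints at why a thresholded version of such an estimator can perform variable selection.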

    Power in High-Dimensional Testing Problems

    Fan et al. (2015) recently introduced a remarkable method for increasing the asymptotic power of tests in high-dimensional testing problems. If applicable to a given test, their power enhancement principle leads to an improved test that has the same asymptotic size, uniformly non-inferior asymptotic power, and is consistent against a strictly broader range of alternatives than the initially given test. We study under which conditions this method can be applied and show the following: In asymptotic regimes where the dimensionality of the parameter space is fixed as sample size increases, there often exist tests that cannot be further improved with the power enhancement principle. However, when the dimensionality of the parameter space increases sufficiently slowly with sample size and a marginal local asymptotic normality (LAN) condition is satisfied, every test with asymptotic size smaller than one can be improved with the power enhancement principle. While the marginal LAN condition alone does not allow one to extend the latter statement to all rates at which the dimensionality increases with sample size, we give sufficient conditions under which this is the case.
    Comment: 27 pages
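A minimal sketch of the power enhancement idea, in the spirit of Fan et al. (2015) but with illustrative choices of base statistic and screening threshold (not taken from either paper): a screening component is added to a base statistic, chosen so that it vanishes under the null with high probability but diverges under sparse alternatives with large coordinates.

```python
import numpy as np

def power_enhanced_stat(xbar, n, d):
    """Sketch of a power-enhanced statistic for testing a d-dimensional
    mean vector being zero: base chi-square-type statistic T0 plus a
    screening component J0 (illustrative forms, assumptions labeled)."""
    # base statistic: standardized sum of squared coordinate means
    t0 = (n * np.sum(xbar ** 2) - d) / np.sqrt(2 * d)
    # screening threshold shrinking like sqrt(log d / n); under the null
    # no coordinate exceeds it with high probability, so J0 = 0
    delta = 2.0 * np.sqrt(np.log(d) / n)
    j0 = np.sqrt(d) * np.sum(n * xbar[np.abs(xbar) > delta] ** 2)
    return t0 + j0

rng = np.random.default_rng(1)
n, d = 400, 100
# null: all coordinate means are zero
x_null = rng.standard_normal((n, d))
# sparse alternative: one coordinate has a large mean
x_alt = x_null.copy()
x_alt[:, 0] += 0.5
t_null = power_enhanced_stat(x_null.mean(axis=0), n, d)
t_alt = power_enhanced_stat(x_alt.mean(axis=0), n, d)
```

Because J0 is zero under the null with high probability, adding it does not change the asymptotic size, while it makes the statistic diverge under sparse alternatives that the quadratic base statistic detects only weakly.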

    Moment-dependent phase transitions in high-dimensional Gaussian approximations

    High-dimensional central limit theorems have been intensively studied, with most focus on the case where the data are sub-Gaussian or sub-exponential. However, heavier tails are omnipresent in practice. In this article, we study the critical growth rates of the dimension $d$ below which Gaussian approximations are asymptotically valid but beyond which they are not. We are particularly interested in how these thresholds depend on the number of moments $m$ that the observations possess. For every $m\in(2,\infty)$, we construct i.i.d. random vectors $\mathbf{X}_1,\ldots,\mathbf{X}_n$ in $\mathbb{R}^d$, the entries of which are independent and have a common distribution (independent of $n$ and $d$) with finite $m$th absolute moment, such that the following holds: if there exists an $\varepsilon\in(0,\infty)$ such that $d/n^{m/2-1+\varepsilon}\not\to 0$, then the Gaussian approximation error (GAE) satisfies $$\limsup_{n\to\infty}\sup_{t\in\mathbb{R}}\left[\mathbb{P}\left(\max_{1\leq j\leq d}\frac{1}{\sqrt{n}}\sum_{i=1}^n\mathbf{X}_{ij}\leq t\right)-\mathbb{P}\left(\max_{1\leq j\leq d}\mathbf{Z}_j\leq t\right)\right]=1,$$ where $\mathbf{Z}\sim\mathsf{N}_d(\mathbf{0}_d,\mathbf{I}_d)$. On the other hand, a result in Chernozhukov et al. (2023a) implies that the left-hand side above is zero if just $d/n^{m/2-1-\varepsilon}\to 0$ for some $\varepsilon\in(0,\infty)$. In this sense, there is a moment-dependent phase transition at the threshold $d=n^{m/2-1}$, above which the limiting GAE jumps from zero to one.
    Comment: After uploading the first version to arXiv, we became aware of Zhang and Wu (2017), Annals of Statistics. In their Remark 2, with the same method of proof, they established a result which is essentially identical to Equation 5 of our Theorem 2.1. We thank Moritz Jirak for pointing this out to us. This will be incorporated in the main text in the next version of the paper.
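The GAE compared above can be estimated by simulation. The following sketch uses standardized Student-t entries with $m+1$ degrees of freedom as an illustrative heavy-tailed distribution with finite $m$th absolute moment; this is only a generic stand-in, not the construction used in the paper, and the finite-sample estimates are noisy.

```python
import numpy as np

def gae_estimate(n, d, m, n_rep=100, seed=0):
    """Monte Carlo sketch of the Gaussian approximation error: compare
    the empirical distribution of max_j n^{-1/2} sum_i X_ij with that
    of max_j Z_j over a grid of t values. Entries are standardized
    Student-t with m+1 degrees of freedom (finite m-th absolute moment);
    an illustrative choice, not the paper's construction."""
    rng = np.random.default_rng(seed)
    df = m + 1
    scale = np.sqrt(df / (df - 2))  # std of t_df, so entries have unit variance
    maxima_x = np.array([
        ((rng.standard_t(df, size=(n, d)) / scale).sum(axis=0)
         / np.sqrt(n)).max()
        for _ in range(n_rep)
    ])
    maxima_z = rng.standard_normal((n_rep, d)).max(axis=1)
    grid = np.linspace(0.0, 5.0, 101)
    F_x = (maxima_x[:, None] <= grid).mean(axis=0)
    F_z = (maxima_z[:, None] <= grid).mean(axis=0)
    return float(np.max(np.abs(F_x - F_z)))  # sup-distance over the grid

# with m = 4 the threshold rate is d = n^{m/2 - 1} = n
err_below = gae_estimate(n=200, d=10, m=4)    # d far below the threshold
err_above = gae_estimate(n=200, d=1000, m=4)  # d above the threshold
```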

    Regularizing Discrimination in Optimal Policy Learning with Distributional Targets

    A decision maker typically (i) incorporates training data to learn about the relative effectiveness of the treatments, and (ii) chooses an implementation mechanism that implies an "optimal" predicted outcome distribution according to some target functional. Nevertheless, a discrimination-aware decision maker may not be satisfied achieving said optimality at the cost of heavily discriminating against subgroups of the population, in the sense that the outcome distribution in a subgroup deviates strongly from the overall optimal outcome distribution. We study a framework that allows the decision maker to penalize such deviations, while allowing for a wide range of target functionals and discrimination measures to be employed. We establish regret and consistency guarantees for empirical success policies with data-driven tuning parameters, provide numerical results, and briefly illustrate the methods in two empirical settings.
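A hypothetical sketch of such a discrimination-regularized empirical success rule, with an illustrative choice of target functional (the mean outcome) and discrimination measure (the largest gap between a subgroup's mean outcome and the overall mean under a treatment); these specific choices and all names are assumptions, not the paper's definitions:

```python
import numpy as np

def regularized_policy(y, t, g, lam):
    """Pick the treatment maximizing the empirical mean outcome (target
    functional) minus lam times the worst subgroup deviation from the
    overall mean under that treatment (discrimination measure). Both
    choices are illustrative stand-ins for the paper's general framework."""
    best, best_val = None, -np.inf
    for a in np.unique(t):
        mask = t == a
        overall = y[mask].mean()
        # worst-case deviation of a subgroup mean from the overall mean
        dev = max(abs(y[mask & (g == s)].mean() - overall)
                  for s in np.unique(g))
        val = overall - lam * dev
        if val > best_val:
            best, best_val = a, val
    return best

rng = np.random.default_rng(2)
n = 4000
g = rng.integers(0, 2, size=n)  # subgroup label
t = rng.integers(0, 2, size=n)  # randomly assigned treatment in the data
# treatment 1 raises the mean outcome overall but benefits group 0 far more
y = 0.5 * t * (g == 0) + 0.1 * t + 0.1 * rng.standard_normal(n)
pick_plain = regularized_policy(y, t, g, lam=0.0)  # ignores discrimination
pick_fair = regularized_policy(y, t, g, lam=5.0)   # heavily penalizes it
```

With no penalty the rule picks the treatment with the highest overall mean; with a large penalty it switches to the treatment whose outcome distribution is more homogeneous across subgroups, which is the trade-off the tuning parameter governs.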