Optimal Calibration for Multiple Testing against Local Inhomogeneity in Higher Dimension
Based on two independent samples X_1,...,X_m and X_{m+1},...,X_n drawn from
multivariate distributions with unknown Lebesgue densities p and q
respectively, we propose an exact multiple test in order to identify
simultaneously regions of significant deviations between p and q. The
construction is built from randomized nearest-neighbor statistics. It does not
require any preliminary information about the multivariate densities such as
compact support, strict positivity or smoothness and shape properties. The
properly adjusted multiple testing procedure is shown to be sharp-optimal for
typical arrangements of the observation values which appear with probability
close to one. The proof relies on a new coupling Bernstein-type exponential
inequality, reflecting the non-subgaussian tail behavior of a combinatorial
process. For the power investigation of the proposed method, a reparametrized
minimax set-up is introduced, reducing the composite hypothesis "p=q" to a
simple one with the multivariate mixed density (m/n)p+(1-m/n)q as infinite
dimensional nuisance parameter. Within this framework, the test is shown to be
spatially and sharply asymptotically adaptive with respect to uniform loss on
isotropic Hölder classes. The exact minimax risk asymptotics are obtained in
terms of solutions of the optimal recovery problem.
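As a rough illustration of the nearest-neighbor idea behind such tests (a simplified sketch, not the paper's exact randomized, calibrated construction), one can pool the two samples and, for each observation, count how many of its k nearest neighbors carry the same sample label: under p=q this count stays near its null expectation, while local deviations between p and q inflate it in the affected region. The Python sketch below uses illustrative names and parameters.

import numpy as np

def knn_same_label_counts(X, Y, k=5):
    """For each point of the pooled sample, count how many of its k nearest
    neighbors come from the same sample (a crude local two-sample statistic)."""
    Z = np.vstack([X, Y])                                # pooled sample
    labels = np.r_[np.zeros(len(X)), np.ones(len(Y))]
    D = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    np.fill_diagonal(D, np.inf)                          # a point is not its own neighbor
    nn = np.argsort(D, axis=1)[:, :k]                    # indices of the k nearest neighbors
    return (labels[nn] == labels[:, None]).sum(axis=1)

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))                  # sample from p
Y = rng.normal(0.5, 1.0, size=(100, 2))                  # sample from q (shifted mean)
print(knn_same_label_counts(X, Y).mean())                # inflated where p and q disagree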
Smoothness assumptions in human and machine vision, and their implications for optimal surface interpolation
In this paper we shall examine what smoothness assumptions are made about object surfaces, object motion, and image intensities. We begin by looking into the physiological limits of vision and how these might influence our perception of smoothness. We then look at a sampling of the computer vision and psychology literature, inferring smoothness constraints from the mathematical assumptions tacitly presumed by researchers. This look at computer vision and the psychology of vision is not meant to be an inclusive study, but rather representative of the assumptions made, and in part representative of the mathematical models used therein. We shall conclude that the prevalent assumptions are that surfaces, motion, and intensity images are functions in C2, C1, and C2, respectively. In the latter portion of this paper we examine one use of explicit smoothness assumptions in the definition of an existing method for obtaining "optimal" surface interpolation. We briefly introduce the nomenclature of information-based complexity, originated by Traub, Wozniakowski, and their colleagues, which is the mathematical machinery used in obtaining these "optimal" surfaces. This theory requires that we know the class of functions from which our desired surface comes, and part of the definition of a class is the degree of smoothness. We then survey many possible classes for the visual interpolation problem of two-dimensional surfaces, and state formulas from which one can obtain the optimal surface interpolating the given depth data.
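As a concrete, hedged example of how a smoothness assumption determines a surface interpolant (a thin-plate spline enforces a C2-type bending energy; this is only a sketch, not the information-based-complexity construction discussed above), the following Python snippet interpolates scattered depth data; all names and data are illustrative.

import numpy as np

def tps_kernel(r):
    # thin-plate-spline radial basis phi(r) = r^2 log r, with phi(0) = 0
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(r > 0, r ** 2 * np.log(r), 0.0)

def fit_tps(points, depths):
    """Solve for f(x) = sum_i w_i phi(|x - x_i|) + a0 + a1 x + a2 y interpolating the data."""
    n = len(points)
    r = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    P = np.hstack([np.ones((n, 1)), points])             # affine part
    A = np.block([[tps_kernel(r), P], [P.T, np.zeros((3, 3))]])
    coef = np.linalg.solve(A, np.r_[depths, np.zeros(3)])
    return coef[:n], coef[n:]

def eval_tps(points, w, a, query):
    r = np.linalg.norm(query[:, None, :] - points[None, :, :], axis=-1)
    return tps_kernel(r) @ w + a[0] + query @ a[1:]

rng = np.random.default_rng(1)
pts = rng.uniform(0.0, 1.0, size=(30, 2))                # scattered (x, y) locations
z = np.sin(2 * np.pi * pts[:, 0]) * pts[:, 1]            # synthetic depth measurements
w, a = fit_tps(pts, z)
print(np.abs(eval_tps(pts, w, a, pts) - z).max())        # ~0: the surface passes through the data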
Fast global convergence of gradient methods for high-dimensional statistical recovery
Many statistical M-estimators are based on convex optimization problems
formed by the combination of a data-dependent loss function with a norm-based
regularizer. We analyze the convergence rates of projected gradient and
composite gradient methods for solving such problems, working within a
high-dimensional framework that allows the data dimension d to grow with
(and possibly exceed) the sample size n. This high-dimensional
structure precludes the usual global assumptions---namely, strong convexity and
smoothness conditions---that underlie much of classical optimization analysis.
We define appropriately restricted versions of these conditions, and show that
they are satisfied with high probability for various statistical models. Under
these conditions, our theory guarantees that projected gradient descent has a
globally geometric rate of convergence up to the statistical precision
of the model, meaning the typical distance between the true unknown parameter
$\theta^*$ and an optimal solution $\widehat{\theta}$. This result is substantially
sharper than previous convergence results, which yielded sublinear convergence,
or linear convergence only up to the noise level. Our analysis applies to a
wide range of M-estimators and statistical models, including sparse linear
regression using Lasso ($\ell_1$-regularized regression); group Lasso for block
sparsity; log-linear models with $\ell_1$-regularization; low-rank matrix recovery using
nuclear norm regularization; and matrix decomposition. Overall, our analysis
reveals interesting connections between statistical precision and computational
efficiency in high-dimensional estimation.
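A hedged sketch of the projected gradient iteration analyzed above, specialized to sparse linear regression with an $\ell_1$-ball constraint (the constrained form of the Lasso); the radius, step size, and problem sizes are illustrative choices rather than the paper's tuning, and the projection uses the standard sorting-based algorithm.

import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection onto {x : ||x||_1 <= radius} (sorting-based algorithm)."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - radius) / np.arange(1, len(u) + 1) > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def projected_gradient(X, y, radius, step, iters=200):
    """Minimize (1/2n)||y - X beta||^2 over the l1 ball by projected gradient descent."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (X @ beta - y) / n
        beta = project_l1_ball(beta - step * grad, radius)
    return beta

rng = np.random.default_rng(0)
n, d, s = 200, 500, 10                                   # d > n: high-dimensional regime
X = rng.normal(size=(n, d))
beta_star = np.zeros(d); beta_star[:s] = 1.0
y = X @ beta_star + 0.1 * rng.normal(size=n)
beta_hat = projected_gradient(X, y, radius=np.abs(beta_star).sum(), step=0.1)
print(np.linalg.norm(beta_hat - beta_star))              # error of the order of the statistical precision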
Finite-sample Analysis of M-estimators using Self-concordance
We demonstrate how self-concordance of the loss can be exploited to obtain
asymptotically optimal rates for M-estimators in finite-sample regimes. We
consider two classes of losses: (i) canonically self-concordant losses in the
sense of Nesterov and Nemirovski (1994), i.e., with the third derivative
bounded by the $3/2$ power of the second; (ii) pseudo self-concordant losses,
for which the power is removed, as introduced by Bach (2010). These classes
contain some losses arising in generalized linear models, including logistic
regression; in addition, the second class includes some common pseudo-Huber
losses. Our results consist in establishing the critical sample size sufficient
to reach the asymptotically optimal excess risk for both classes of losses.
Denoting $d$ the parameter dimension and $d_{\text{eff}}$ the effective
dimension, which takes into account possible model misspecification, we find the
critical sample size to be $O(d_{\text{eff}} \cdot d)$ for canonically
self-concordant losses, and $O(\rho \cdot d_{\text{eff}} \cdot d)$ for pseudo
self-concordant losses, where $\rho$ is the problem-dependent local curvature
parameter. In contrast to existing results, we only impose local
assumptions on the data distribution, assuming that the calibrated design,
i.e., the design scaled with the square root of the second derivative of the
loss, is subgaussian at the best predictor $\theta_*$. Moreover, we obtain
improved bounds on the critical sample size, scaling near-linearly in
$d_{\text{eff}}$, under the extra assumption that the calibrated design
is subgaussian in the Dikin ellipsoid of $\theta_*$. Motivated by these
findings, we construct canonically self-concordant analogues of the Huber and
logistic losses with improved statistical properties. Finally, we extend some
of these results to $\ell_1$-regularized M-estimators in high dimensions.
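As a hedged numerical illustration of the pseudo self-concordance property invoked above (class (ii), where the power is removed): for the scalar logistic loss phi(t) = log(1 + exp(-t)) one has phi'' = sigma(1 - sigma) and phi''' = sigma(1 - sigma)(1 - 2 sigma), with sigma the sigmoid, hence |phi'''| <= phi''. The snippet below merely checks this on a grid and is not part of the paper's analysis.

import numpy as np

t = np.linspace(-20.0, 20.0, 2001)
s = 1.0 / (1.0 + np.exp(-t))                 # sigmoid sigma(t)
second = s * (1.0 - s)                       # phi''(t) for phi(t) = log(1 + exp(-t))
third = s * (1.0 - s) * (1.0 - 2.0 * s)      # phi'''(t)
print(bool(np.all(np.abs(third) <= second + 1e-12)))   # True: |phi'''| <= phi''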
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted $\ell_2$-penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view.
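As a hedged illustration of the proximal-method family covered above, instantiated for the $\ell_1$ penalty: the proximal operator of a multiple of the $\ell_1$ norm is coordinatewise soft-thresholding, and alternating a gradient step on the smooth loss with this operator yields ISTA. The step size and penalty level below are illustrative.

import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (coordinatewise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(X, y, lam, step, iters=300):
    """Proximal gradient (ISTA) for (1/2n)||y - X beta||^2 + lam * ||beta||_1."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

rng = np.random.default_rng(0)
n, d = 100, 50
X = rng.normal(size=(n, d))
beta_star = np.zeros(d); beta_star[:5] = 2.0
y = X @ beta_star + 0.1 * rng.normal(size=n)
beta_hat = ista(X, y, lam=0.05, step=0.1)
print(np.count_nonzero(beta_hat))            # a sparse estimate concentrated on the true support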