22 research outputs found

    Detection of a sparse submatrix of a high-dimensional noisy matrix

    We observe an $N\times M$ matrix $Y_{ij}=s_{ij}+\xi_{ij}$ with $\xi_{ij}\sim\mathcal{N}(0,1)$ i.i.d. in $i,j$, and $s_{ij}\in\mathbb{R}$. We test the null hypothesis $s_{ij}=0$ for all $i,j$ against the alternative that there exists a submatrix of size $n\times m$ with significant elements, in the sense that $s_{ij}\ge a>0$. We propose a test procedure and compute the asymptotic detection boundary $a$ such that the maximal testing risk tends to 0 as $M\to\infty$, $N\to\infty$, $p=n/N\to 0$, $q=m/M\to 0$. We prove that this boundary is asymptotically sharp minimax under some additional constraints. Relations with other testing problems are discussed. We propose a testing procedure which adapts to unknown $(n,m)$ within some given set and compute the adaptive sharp rates. The implementation of our test procedure on synthetic data shows excellent behavior for sparse, not necessarily square, matrices. We extend our sharp minimax results in several directions: first, to Gaussian matrices with unknown variance; next, to matrices of random variables having a distribution from a (non-Gaussian) exponential family; and, finally, to a two-sided alternative for matrices with Gaussian elements. Comment: Published in Bernoulli (http://dx.doi.org/10.3150/12-BEJ470, http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
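
    As a rough illustration of the detection problem above (not the authors' test procedure), the NumPy sketch below compares a crude scan-type statistic, the scaled sum of the $n\cdot m$ largest entries of $Y$, under the null and under an alternative with one elevated $n\times m$ block. All names and the particular statistic are illustrative.

```python
import numpy as np

def scan_statistic(Y, n, m):
    """Scaled sum of the n*m largest entries of Y: a crude proxy for
    scanning all n x m submatrices for a block of elevated means."""
    k = n * m
    return np.sort(Y, axis=None)[-k:].sum() / np.sqrt(k)

rng = np.random.default_rng(0)
N, M, n, m, a = 200, 200, 5, 5, 4.0

Y0 = rng.standard_normal((N, M))   # null: pure noise
Y1 = Y0.copy()
Y1[:n, :m] += a                    # alternative: one n x m block with mean a

t0, t1 = scan_statistic(Y0, n, m), scan_statistic(Y1, n, m)
```

The block is placed in the corner only for convenience; the statistic itself does not use its location.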

    Statistical inference in compound functional models

    We consider a general nonparametric regression model called the compound model. It includes, as special cases, sparse additive regression and nonparametric (or linear) regression with many covariates but possibly a small number of relevant covariates. The compound model is characterized by three main parameters: the structure parameter describing the "macroscopic" form of the compound function, the "microscopic" sparsity parameter indicating the maximal number of relevant covariates in each component, and the usual smoothness parameter corresponding to the complexity of the members of the compound. We find the non-asymptotic minimax rate of convergence of estimators in such a model as a function of these three parameters. We also show that this rate can be attained in an adaptive way.

    Adaptation in minimax nonparametric hypothesis testing for ellipsoids and Besov bodies

    We observe an infinite-dimensional Gaussian random vector $x=\xi+v$, where $\xi$ is a sequence of standard Gaussian variables and $v\in l_2$ is an unknown mean. Let $V_{\varepsilon}(\tau,\rho_{\varepsilon})\subset l_2$ be sets which correspond to $l_q$-ellipsoids of power semi-axes $a_i=i^{-s}R/\varepsilon$ with an $l_p$-ellipsoid of semi-axes $b_i=i^{-r}\rho_{\varepsilon}/\varepsilon$ removed, or to similar Besov bodies $B_{q,t,s}(R/\varepsilon)$ with Besov bodies $B_{p,h,r}(\rho_{\varepsilon}/\varepsilon)$ removed. Here $\tau=(\kappa,R)$ or $\tau=(\kappa,h,t,R)$, $\kappa=(p,q,r,s)$, are the parameters which define the sets $V_{\varepsilon}$ for given radii $\rho_{\varepsilon}\to 0$; $\varepsilon\to 0$ is the asymptotic parameter. For the case of known $\tau$, the hypothesis testing problem $H_0: v=0$ versus the alternative $H_{\varepsilon,\tau}: v\in V_{\varepsilon}(\tau,\rho_{\varepsilon})$ was considered by Ingster and Suslina [11] in the minimax setting. It was shown that there is a partition of the set of $\kappa$ into regions with different types of asymptotics: classical, trivial, degenerate and Gaussian (of two main and some "boundary" types), and that in the case of Gaussian asymptotics the structure of asymptotically minimax tests depends essentially on the parameter $\kappa$. In this paper we consider the alternative $H_{\varepsilon,\Gamma}: v\in V_{\varepsilon}(\Gamma)$ for sets $V_{\varepsilon}(\Gamma)=\bigcup_{\tau\in\Gamma}V_{\varepsilon}(\tau,\rho_{\varepsilon}(\tau))$.
    This corresponds to the adaptive setting: $\tau$ is unknown, $\tau\in\Gamma$, for a compact $\Gamma=K\times\Delta$, $\Delta=[c, C]\subset R_+^1$, $K\subset \Xi_{G_1}\cup \Xi_{G_2}$, where $\Xi_{G_1}$ and $\Xi_{G_2}$ are the regions of the main types of Gaussian asymptotics. Problems of this type were first studied by Spokoiny [16, 17]. For the ellipsoidal case we study sharp asymptotics of the minimax second-kind errors $\beta_{\varepsilon}(\alpha,\Gamma)=\beta(\alpha, V_{\varepsilon}(\Gamma))$ and construct asymptotically minimax tests $\psi_{\alpha,\varepsilon,\Gamma}$. These asymptotics are analogous to the degenerate type. For the case of Besov bodies we obtain exact rates and construct minimax consistent tests. Analogous exact rates are obtained in a signal detection problem for a continuous variant of the white Gaussian noise model: the alternatives correspond to Besov or Sobolev balls with Sobolev or Besov balls removed. The study is based on the results of [11] and on an extension of the methods of that paper to the degenerate case.
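
    As a toy illustration of testing $H_0: v=0$ in the Gaussian sequence model, the sketch below uses a chi-square-type statistic of the general kind arising in the Gaussian-asymptotics regimes; it is not the paper's adaptive procedure, and the decay exponent and all names are illustrative.

```python
import numpy as np

def chi2_type_statistic(x):
    """Centered, normalized sum of squares: approximately N(0, 1) under
    H0: v = 0, and shifted upward by about ||v||^2 / sqrt(2 d) otherwise."""
    d = x.size
    return (np.sum(x**2) - d) / np.sqrt(2 * d)

rng = np.random.default_rng(1)
d = 10_000
xi = rng.standard_normal(d)          # standard Gaussian noise sequence

t_null = chi2_type_statistic(xi)     # v = 0

# Ellipsoid-type mean with polynomially decaying coordinates.
v = np.arange(1, d + 1) ** -0.25
t_alt = chi2_type_statistic(xi + v)
```

The upward shift of `t_alt` relative to `t_null` reflects the squared $l_2$ norm of the mean, which is what such statistics detect.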

    Adaptive detection of high-dimensional signal

    Let an n-dimensional Gaussian random vector x = ξ + v be observed, where ξ is a standard n-dimensional Gaussian vector and v ∈ R^n is the unknown mean. In the papers [3,5], minimax hypothesis testing problems were studied: to test the null hypothesis H0: v = 0 against two types of alternatives H1 = H1(θn): v ∈ Vn(θn). The first corresponds to the multi-channel signal detection problem for a given value b of the signal and number k of channels containing a signal, θn = (b,k). The second corresponds to an lq^n-ball of radius R1,n with the lp^n-ball of radius R2,n removed, θn = (R1,n, R2,n, p, q) ∈ R+^4. It was shown in [3,5] that the structure of asymptotically minimax tests and the asymptotics of the minimax second-kind errors often depend essentially on the parameters θn. This raises the problem of constructing adaptive tests with good minimax properties over large enough regions Θn of parameters θn, which is studied here. We describe the sets Θn for which adaptation is possible without loss of efficiency. For other sets we present a wide enough class of asymptotically exact bounds on the adaptive efficiency and construct asymptotically minimax test procedures.

    Sparse classification boundaries

    Given a training sample of size $m$ from a $d$-dimensional population, we wish to allocate a new observation $Z\in\mathbb{R}^d$ to this population or to the noise. We suppose that the difference between the distribution of the population and that of the noise is only in a shift, which is a sparse vector. For Gaussian noise, fixed sample size $m$, and dimension $d$ tending to infinity, we obtain the sharp classification boundary and propose classifiers attaining this boundary. We also give extensions of this result to the case where the sample size $m$ depends on $d$ and satisfies the condition $(\log m)/\log d\to\gamma$, $0\le\gamma<1$, and to the case of non-Gaussian noise satisfying the Cramér condition.
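
    A hedged sketch of a classifier of the flavor described above: threshold the coordinate means of the training sample to locate the sparse shift, then correlate the new observation with its signs. The thresholds and all names are illustrative, not the paper's sharp-boundary construction.

```python
import numpy as np

rng = np.random.default_rng(2)
d, m = 50_000, 20
theta = np.zeros(d)                               # sparse shift vector
support = rng.choice(d, size=30, replace=False)
theta[support] = 2.5 * np.sqrt(2 * np.log(d))     # well above the noise level

X = theta + rng.standard_normal((m, d))           # training sample (population)

def classify(Z, X):
    """Return 1 (population) or 0 (noise): correlate Z with the signs of
    the thresholded coordinate means of the training sample."""
    m, d = X.shape
    mean = X.mean(axis=0)
    sel = np.abs(mean) > np.sqrt(2 * np.log(d) / m)   # noise level of the mean
    if not sel.any():
        return 0
    stat = Z[sel] @ np.sign(mean[sel])
    return int(stat > np.sqrt(2 * np.log(d) * sel.sum()))

z_pop = theta + rng.standard_normal(d)    # drawn from the population
z_noise = rng.standard_normal(d)          # pure noise
```

With the shift this far above the noise level both decisions are easy; the interesting regime in the paper is shifts near the classification boundary.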

    Minimax signal detection in ill-posed inverse problems

    Ill-posed inverse problems arise in various scientific fields. We consider the signal detection problem for mildly, severely and extremely ill-posed inverse problems with $l^q$-ellipsoids (bodies), $q\in(0,2]$, for Sobolev, analytic and generalized analytic classes of functions under the Gaussian white noise model. We study both rate and sharp asymptotics for the error probabilities in the minimax setup. By construction, the derived tests are often nonadaptive. Minimax rate-optimal adaptive tests of rather simple structure are also constructed. Comment: Published in the Annals of Statistics (http://dx.doi.org/10.1214/12-AOS1011, http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Detection boundary in sparse regression

    We study the problem of detecting a p-dimensional sparse vector of parameters in the linear regression model with Gaussian noise. We establish the detection boundary, i.e., the necessary and sufficient conditions for the possibility of successful detection as both the sample size n and the dimension p tend to infinity. Testing procedures that achieve this boundary are also exhibited. Our results encompass the high-dimensional setting (p >> n). The main message is that, under some conditions, the detection boundary phenomenon proved for the Gaussian sequence model extends to high-dimensional linear regression. Finally, we establish the detection boundaries when the variance of the noise is unknown. Interestingly, the detection boundaries sometimes depend on knowledge of the variance in the high-dimensional setting.
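
    As an illustration of detection in the regime p >> n, the sketch below uses a simple max-correlation statistic with a Bonferroni-style threshold, assuming known noise variance 1. This is not the paper's testing procedure; the design normalization, signal strength, and threshold constant are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, s = 500, 2000, 10                        # high-dimensional: p >> n

X = rng.standard_normal((n, p)) / np.sqrt(n)   # columns roughly unit-norm
beta = np.zeros(p)
beta[:s] = 10.0                                # sparse, strong signal

def detect(y, X):
    """Reject H0: beta = 0 when max_j |x_j^T y| exceeds a Bonferroni-style
    threshold for p approximately standard-normal correlations."""
    p = X.shape[1]
    thresh = 1.3 * np.sqrt(2 * np.log(p))
    return bool(np.max(np.abs(X.T @ y)) > thresh)

y_null = rng.standard_normal(n)                # beta = 0
y_alt = X @ beta + rng.standard_normal(n)
```

Near the detection boundary the signal strength would be far smaller and such a crude threshold would no longer separate the hypotheses.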