
    Subsampling Algorithms for Semidefinite Programming

    We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration, and the subsampling ratio explicitly controls granularity, i.e. the tradeoff between cost per iteration and total number of iterations. Furthermore, the total computational cost is directly proportional to the complexity (i.e. rank) of the solution. We study numerical performance on some large-scale problems arising in statistical learning. Comment: Final version, to appear in Stochastic Systems.
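
    As a hedged illustration of the subsampling idea (not the paper's algorithm), here is a minimal Python/NumPy sketch for the rank-one special case max <C, xx^T> subject to ||x|| = 1, where each gradient step looks at only a random fraction of the columns of C; the function name and the ratio parameter are hypothetical.

        import numpy as np

        def subsampled_sgd_leading_eigvec(C, ratio=0.1, steps=500, lr=0.5, seed=0):
            # Stochastic gradient ascent for max <C, xx^T> s.t. ||x|| = 1, the
            # rank-one special case of the SDP max <C, X>, tr X = 1, X >= 0.
            # Each step touches only a random subset of C's columns, so ratio
            # trades cost per iteration against the number of iterations.
            rng = np.random.default_rng(seed)
            n = C.shape[0]
            x = rng.standard_normal(n)
            x /= np.linalg.norm(x)
            m = max(1, int(ratio * n))
            for _ in range(steps):
                idx = rng.choice(n, size=m, replace=False)
                g = (C[:, idx] @ x[idx]) * (n / m)  # unbiased estimate of C @ x
                x += lr * g
                x /= np.linalg.norm(x)              # project back onto the sphere
            return x

    Lowering ratio makes each step cheaper but noisier, which is the granularity tradeoff the abstract describes; the output can be checked against np.linalg.eigh(C).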

    Verified partial eigenvalue computations using contour integrals for Hermitian generalized eigenproblems

    We propose a verified computation method for partial eigenvalues of a Hermitian generalized eigenproblem. The block Sakurai-Sugiura Hankel method, a contour integral-type eigensolver, reduces a given eigenproblem to a generalized eigenproblem of block Hankel matrices whose entries consist of complex moments. In this study, we evaluate all errors in computing the complex moments. We derive a truncation error bound for the quadrature, then take the numerical errors of the quadrature into account and rigorously enclose the entries of the block Hankel matrices. Each quadrature point gives rise to a linear system, and its structure enables us to develop an efficient technique for verifying the approximate solution. Numerical experiments show that the proposed method outperforms a standard method and suggest that the proposed method is potentially efficient in parallel. Comment: 15 pages, 4 figures, 1 table.
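
    A hedged numerical sketch of the underlying contour-integral machinery (the plain method, not the verified version with rigorous enclosures): approximate the scaled complex moments by the N-point trapezoidal rule on a circle and extract the eigenvalues inside the contour from the Hankel pencil. Names and parameters are illustrative, and it assumes exactly n_mom eigenvalues lie inside the circle (otherwise H0 is singular and a rank-revealing SVD filter is needed).

        import numpy as np

        def ss_hankel_eigs(A, B, gamma, rho, n_quad=32, n_mom=4, seed=0):
            # Unverified Sakurai-Sugiura Hankel sketch: approximate
            #   mu_k = (1/(2 pi i)) oint ((z - gamma)/rho)^k u^T (zB - A)^{-1} B v dz
            # by the trapezoidal rule on the circle |z - gamma| = rho, then
            # solve the Hankel pencil H1 w = lam H0 w for eigenvalues inside.
            rng = np.random.default_rng(seed)
            n = A.shape[0]
            u = rng.standard_normal(n)
            v = rng.standard_normal(n)
            theta = 2 * np.pi * (np.arange(n_quad) + 0.5) / n_quad
            mu = np.zeros(2 * n_mom, dtype=complex)
            for t in theta:
                z = gamma + rho * np.exp(1j * t)
                y = np.linalg.solve(z * B - A, B @ v)  # one linear solve per node
                f = (rho * np.exp(1j * t) / n_quad) * (u @ y)
                mu += f * np.exp(1j * np.arange(2 * n_mom) * t)  # ((z-gamma)/rho)^k
            H0 = np.array([[mu[i + j] for j in range(n_mom)] for i in range(n_mom)])
            H1 = np.array([[mu[i + j + 1] for j in range(n_mom)] for i in range(n_mom)])
            lam = np.linalg.eigvals(np.linalg.solve(H0, H1))
            return gamma + rho * lam                   # map back to the z-plane

    The verified method of the abstract additionally bounds the quadrature truncation error and rigorously encloses the entries of H0 and H1, which this sketch does not attempt.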

    On the Sample Complexity of Subspace Learning

    A large number of algorithms in machine learning, from principal component analysis (PCA) and its non-linear (kernel) extensions to more recent spectral embedding and support estimation methods, rely on estimating a linear subspace from samples. In this paper we introduce a general formulation of this problem and derive novel learning error estimates. Our results rely on natural assumptions on the spectral properties of the covariance operator associated with the data distribution, and hold for a wide class of metrics between subspaces. As special cases, we discuss sharp error estimates for the reconstruction properties of PCA and spectral support estimation. Key to our analysis is an operator-theoretic approach that has broad applicability to spectral learning methods. Comment: Extended version of conference paper.
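
    To make the setting concrete, here is a small hedged sketch (illustrative names, not the paper's estimators or bounds) that estimates a linear subspace by PCA and measures the learning error in one of the subspace metrics such an analysis covers, the operator-norm distance between orthogonal projections.

        import numpy as np

        def pca_subspace(X, k):
            # Orthonormal basis (d x k) for the k-dimensional principal
            # subspace estimated from the rows of X.
            Xc = X - X.mean(axis=0)
            _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
            return Vt[:k].T

        def projection_distance(U, V):
            # Operator-norm distance between the orthogonal projections onto
            # span(U) and span(V): one of many metrics between subspaces.
            return np.linalg.norm(U @ U.T - V @ V.T, 2)

    Sampling from a distribution whose covariance spectrum decays quickly and plotting projection_distance against the sample size gives an empirical counterpart to the kind of error estimates the abstract derives.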

    Tail bounds for all eigenvalues of a sum of random matrices

    This work introduces the minimax Laplace transform method, a modification of the cumulant-based matrix Laplace transform method developed in "User-friendly tail bounds for sums of random matrices" (arXiv:1004.4389v6) that yields both upper and lower bounds on each eigenvalue of a sum of random self-adjoint matrices. This machinery is used to derive eigenvalue analogues of the classical Chernoff, Bennett, and Bernstein bounds. Two examples demonstrate the efficacy of the minimax Laplace transform. The first concerns the effects of column sparsification on the spectrum of a matrix with orthonormal rows. Here, the behavior of the singular values can be described in terms of coherence-like quantities. The second example addresses the question of relative accuracy in the estimation of eigenvalues of the covariance matrix of a random process. Standard results on the convergence of sample covariance matrices provide bounds on the number of samples needed to obtain relative accuracy in the spectral norm, but these results only guarantee relative accuracy in the estimate of the maximum eigenvalue. The minimax Laplace transform argument establishes that if the lowest eigenvalues decay sufficiently fast, then on the order of K^2 r log(p) / eps^2 samples, where K is the condition number of an optimal rank-r approximation to C, are sufficient to ensure that the dominant r eigenvalues of the covariance matrix of an N(0, C) random vector are estimated to within a factor of 1 ± eps with high probability. Comment: 20 pages, 1 figure, see also arXiv:1004.4389v6.
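
    The covariance example can be probed empirically with a small hedged sketch (a hypothetical helper, not the paper's proof machinery): draw n samples from N(0, C) and record the relative error of each of the r dominant eigenvalues of the sample covariance, the quantity the K^2 r log(p) / eps^2 bound controls.

        import numpy as np

        def relative_eig_errors(C, n, r, seed=0):
            # Relative errors of the r dominant eigenvalues of the sample
            # covariance of n draws from N(0, C). Assumes C is positive
            # definite so the Cholesky factorization exists.
            rng = np.random.default_rng(seed)
            p = C.shape[0]
            L = np.linalg.cholesky(C)
            X = rng.standard_normal((n, p)) @ L.T           # rows ~ N(0, C)
            S = X.T @ X / n                                 # mean is known to be zero
            ev_true = np.sort(np.linalg.eigvalsh(C))[::-1][:r]
            ev_hat = np.sort(np.linalg.eigvalsh(S))[::-1][:r]
            return np.abs(ev_hat - ev_true) / ev_true

    Sweeping n for a C whose trailing eigenvalues decay quickly should show all r relative errors dropping below eps at a sample size consistent with the stated rate.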

    Linearly Convergent First-Order Algorithms for Semi-definite Programming

    In this paper, we consider two formulations for Linear Matrix Inequalities (LMIs) under a Slater-type constraint qualification assumption, namely, smooth and non-smooth SDP formulations. We propose two first-order linearly convergent algorithms for solving these formulations. Moreover, we introduce a bundle-level method which converges linearly for both smooth and non-smooth problems and does not require any smoothness information. The convergence properties of these algorithms are also discussed. Finally, we consider a special case of LMIs, a linear system of inequalities, and show that a linearly convergent algorithm can be obtained under a weaker assumption.
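
    For orientation, here is a generic (and only sublinearly convergent) non-smooth first-order baseline for the LMI feasibility problem, not one of the linearly convergent methods of the paper; all names are illustrative. It runs subgradient descent on f(x) = lambda_max(A0 + sum_i x_i A_i), whose subgradient is read off the top eigenvector of A(x).

        import numpy as np

        def lmi_subgradient(A0, As, steps=500, lr=0.5):
            # Subgradient descent on f(x) = lambda_max(A0 + sum_i x_i A_i);
            # f(x) <= 0 certifies feasibility of the LMI A(x) <= 0.
            # A subgradient is g_i = v^T A_i v, with v a unit top
            # eigenvector of A(x).
            x = np.zeros(len(As))
            for t in range(steps):
                M = A0 + sum(xi * Ai for xi, Ai in zip(x, As))
                w, V = np.linalg.eigh(M)
                if w[-1] <= 0:
                    return x                    # feasible point found
                v = V[:, -1]
                g = np.array([v @ Ai @ v for Ai in As])
                x -= (lr / np.sqrt(t + 1)) * g  # diminishing step size
            return x

    A bundle-level scheme improves on this by aggregating past cutting planes into a model and projecting onto its level sets, which is the route the abstract takes to linear convergence.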

    Distributed Detection over Fading MACs with Multiple Antennas at the Fusion Center

    A distributed detection problem over fading Gaussian multiple-access channels is considered. Sensors observe a phenomenon and transmit their observations to a fusion center using the amplify-and-forward scheme. The fusion center has multiple antennas; different channel models between the sensors and the fusion center are considered, and different cases of channel state information are assumed at the sensors. The performance is evaluated in terms of the error exponent for each of these cases, and the effect of multiple antennas at the fusion center is studied. It is shown that for zero-mean channels between the sensors and the fusion center, when there is no channel information at the sensors, arbitrarily large gains in the error exponent can be obtained with a sufficient increase in the number of antennas at the fusion center. In stark contrast, when there is channel information at the sensors, the gain in error exponent due to having multiple antennas at the fusion center is shown to be no more than a factor of 8/pi for Rayleigh fading channels between the sensors and the fusion center, independent of the number of antennas at the fusion center or correlation among noise samples across sensors. Scaling laws for such gains are also provided when both sensors and antennas are increased simultaneously. Simple practical schemes and a numerical method using semidefinite relaxation techniques are presented that utilize the limited possible gains available. Simulations are used to establish the accuracy of the results. Comment: 21 pages, 9 figures, submitted to the IEEE Transactions on Signal Processing.
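
    A minimal Monte Carlo sketch of the no-CSI-at-sensors setting, under simplifying assumptions that are ours rather than the paper's (a ±1 hypothesis, unit-variance sensing and receiver noise, Rayleigh fading, perfect CSI at the fusion center, matched-filter fusion); function and parameter names are hypothetical.

        import numpy as np

        def af_error_rate(n_sensors=20, n_antennas=4, snr_db=0, trials=20000, seed=0):
            # Amplify-and-forward over a Rayleigh-fading Gaussian MAC with
            # n_antennas at the fusion center; errors of a matched-filter
            # detector are counted over random trials.
            rng = np.random.default_rng(seed)
            g = 10 ** (snr_db / 10)
            errors = 0
            for _ in range(trials):
                theta = rng.choice([-1.0, 1.0])             # hypothesis
                s = theta + rng.standard_normal(n_sensors)  # sensor observations
                H = (rng.standard_normal((n_antennas, n_sensors))
                     + 1j * rng.standard_normal((n_antennas, n_sensors))) / np.sqrt(2)
                w = (rng.standard_normal(n_antennas)
                     + 1j * rng.standard_normal(n_antennas)) / np.sqrt(2)
                y = np.sqrt(g) * (H @ s) + w                # superposed MAC output
                h = H @ np.ones(n_sensors)                  # effective channel
                stat = np.real(np.conj(h) @ y)              # matched-filter statistic
                errors += stat * theta < 0                  # wrong sign = error
            return errors / trials

    Increasing n_antennas should drive the error rate down in this zero-mean-channel regime, consistent with the abstract's claim that large gains are available when sensors lack channel information.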