
    Stochastic Preconditioning for Diagonally Dominant Matrices


    A fast algorithm for matrix balancing

    Provided a square nonnegative matrix A contains sufficiently many nonzero elements, it can be balanced; that is, we can find a diagonal scaling of A that is doubly stochastic. A number of algorithms have been proposed to achieve this balancing, the best known being Sinkhorn-Knopp. In this paper we derive new algorithms based on inner-outer iteration schemes. We show that Sinkhorn-Knopp belongs to this family, but that other members can converge much more quickly. In particular, we show that while stationary iterative methods offer little or no improvement in many cases, a scheme using a preconditioned conjugate gradient method as the inner iteration can give quadratic convergence at low cost.
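
    For context, here is a minimal sketch of the classic Sinkhorn-Knopp iteration the abstract takes as its baseline (not the paper's inner-outer or PCG variants); the function name, tolerance, and iteration cap are illustrative:

```python
import numpy as np

def sinkhorn_knopp(A, max_iter=1000, tol=1e-9):
    """Balance a nonnegative square matrix A: find diagonal scalings
    r, c so that diag(r) @ A @ diag(c) is doubly stochastic.
    Classic alternating row/column rescaling."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    r = np.ones(n)  # row scaling factors
    c = np.ones(n)  # column scaling factors
    for _ in range(max_iter):
        c = 1.0 / (A.T @ r)   # rescale columns
        r = 1.0 / (A @ c)     # rescale rows (row sums become exactly 1)
        P = A * np.outer(r, c)
        # rows sum to 1 by construction; converged once columns do too
        if np.max(np.abs(P.sum(axis=0) - 1.0)) < tol:
            break
    return P

# Example: balance a random positive matrix
rng = np.random.default_rng(0)
P = sinkhorn_knopp(rng.random((4, 4)) + 0.1)
print(P.sum(axis=0), P.sum(axis=1))  # both close to all-ones
```

    The paper's quadratically convergent scheme replaces this simple alternation with an inner PCG solve; the sketch above shows only the linearly convergent baseline.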

    Estimating the inverse trace using random forests on graphs

    Some data analysis problems require the computation of (regularised) inverse traces, i.e. quantities of the form $\operatorname{Tr}\,(q\mathbf{I} + \mathbf{L})^{-1}$. For large matrices, direct methods are infeasible and one must resort to approximations, for example using a conjugate gradient solver combined with Girard's trace estimator (also known as Hutchinson's trace estimator). Here we describe an unbiased estimator of the regularised inverse trace based on Wilson's algorithm, an algorithm initially designed to draw uniform spanning trees in graphs. Our method is fast, easy to implement, and scales to very large matrices. Its main drawback is that it is limited to diagonally dominant matrices $\mathbf{L}$. Comment: Submitted to the GRETSI conference.
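
    As a point of reference, here is a minimal sketch of the conjugate-gradient-plus-Girard/Hutchinson baseline mentioned in the abstract (not the paper's Wilson-algorithm estimator); the matrix construction and sample counts are illustrative:

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def hutchinson_inverse_trace(L, q, n_samples=100, seed=None):
    """Estimate Tr((q I + L)^{-1}) via Girard/Hutchinson:
    E[x^T M^{-1} x] = Tr(M^{-1}) when E[x x^T] = I.
    Each probe solves (q I + L) y = x with conjugate gradients."""
    rng = np.random.default_rng(seed)
    n = L.shape[0]
    M = LinearOperator((n, n), matvec=lambda v: q * v + L @ v)
    total = 0.0
    for _ in range(n_samples):
        x = rng.choice([-1.0, 1.0], size=n)  # Rademacher probe vector
        y, _ = cg(M, x)                      # y ~= (q I + L)^{-1} x
        total += x @ y
    return total / n_samples

# Example on a small random graph Laplacian
rng = np.random.default_rng(1)
n = 50
A = np.triu((rng.random((n, n)) < 0.1).astype(float), 1)
A = A + A.T                      # symmetric adjacency matrix
L = np.diag(A.sum(axis=1)) - A   # graph Laplacian (diagonally dominant)
print(hutchinson_inverse_trace(L, q=1.0, n_samples=200, seed=2))
print(np.trace(np.linalg.inv(np.eye(n) + L)))  # exact value, for comparison
```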

    Diagonal Preconditioning: Theory and Algorithms

    Diagonal preconditioning has been a staple technique in optimization and machine learning. It often reduces the condition number of the design or Hessian matrix it is applied to, thereby speeding up convergence. However, rigorous analyses of how well various diagonal preconditioning procedures improve the condition number of the preconditioned matrix, and of how that translates into improvements in optimization, are rare. In this paper, we first provide a random-matrix-theory analysis of a popular diagonal preconditioning technique based on column standard deviation and its effect on the condition number. We then identify a class of design matrices whose condition numbers can be reduced significantly by this procedure. Next we study the problem of optimal diagonal preconditioning to improve the condition number of any full-rank matrix, and provide a bisection algorithm and a potential reduction algorithm with $O(\log(\frac{1}{\epsilon}))$ iteration complexity, where each iteration consists of an SDP feasibility problem and a Newton update using the Nesterov-Todd direction, respectively. Finally, we extend the optimal diagonal preconditioning algorithm to an adaptive setting and compare its empirical performance at reducing the condition number and speeding up convergence for regression and classification problems with that of another adaptive preconditioning technique, batch normalization, which is essential in training machine learning models. Comment: Under review; previous version wrong draft.
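
    To make the column-standard-deviation procedure concrete, here is a small illustration (the function name and test matrix are mine, not from the paper); it rescales each column of a design matrix by the reciprocal of its standard deviation and compares condition numbers before and after:

```python
import numpy as np

def column_std_precondition(X, eps=1e-12):
    """Right diagonal preconditioning by column standard deviation:
    return X @ D with D = diag(1 / std of each column)."""
    d = X.std(axis=0)
    d = np.where(d > eps, d, 1.0)  # guard against constant columns
    return X / d

# Example: condition number before and after rescaling
rng = np.random.default_rng(3)
scales = np.array([1e3, 1.0] + [10.0] * 8)   # badly scaled columns
X = rng.standard_normal((200, 10)) * scales
print(np.linalg.cond(X))                            # large
print(np.linalg.cond(column_std_precondition(X)))   # much smaller
```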