418 research outputs found
A fast algorithm for matrix balancing
As long as a square nonnegative matrix A contains sufficiently many nonzero elements, it can be balanced; that is, we can find a diagonal scaling of A that is doubly stochastic. A number of algorithms have been proposed to achieve the balancing, the best known being Sinkhorn-Knopp. In this paper we derive new algorithms based on inner-outer iteration schemes. We show that Sinkhorn-Knopp belongs to this family, but that other members can converge much more quickly. In particular, we show that while stationary iterative methods offer little or no improvement in many cases, a scheme using a preconditioned conjugate gradient method as the inner iteration can give quadratic convergence at low cost.
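As a point of reference for the inner-outer schemes discussed above, here is a minimal sketch of the classical Sinkhorn-Knopp iteration (not the paper's accelerated variants); the tolerance and iteration cap are illustrative choices, not values from the paper.

```python
import numpy as np

def sinkhorn_knopp(A, tol=1e-10, max_iter=10_000):
    """Balance a square nonnegative matrix A so that diag(r) @ A @ diag(c)
    is doubly stochastic (all row and column sums equal 1)."""
    n = A.shape[0]
    r = np.ones(n)                      # left scaling, diag(D1)
    for _ in range(max_iter):
        c = 1.0 / (A.T @ r)             # update column scaling, diag(D2)
        r = 1.0 / (A @ c)               # update row scaling, diag(D1)
        B = (r[:, None] * A) * c[None, :]
        # stop once both row and column sums are close to 1
        if (np.abs(B.sum(axis=0) - 1).max() < tol and
                np.abs(B.sum(axis=1) - 1).max() < tol):
            break
    return r, c

# Example: balance a random positive matrix (always balanceable)
A = np.random.rand(5, 5) + 0.1
r, c = sinkhorn_knopp(A)
B = (r[:, None] * A) * c[None, :]
print(B.sum(axis=0), B.sum(axis=1))     # both approximately all-ones
```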
Estimating the inverse trace using random forests on graphs
Some data analysis problems require the computation of (regularized) inverse traces, i.e. quantities of the form $\mathrm{Tr}\,(q\mathbf{I} + \mathbf{L})^{-1}$. For large matrices, direct methods are infeasible and one must resort to approximations, for example using a conjugate gradient solver combined with Girard's trace estimator (also known as Hutchinson's trace estimator). Here we describe an unbiased estimator of the regularized inverse trace, based on Wilson's algorithm, which was originally designed to draw uniform spanning trees in graphs. Our method is fast, easy to implement, and scales to very large matrices. Its main drawback is that it is limited to diagonally dominant matrices $\mathbf{L}$.
Comment: Submitted to GRETSI conference
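For context, here is a minimal sketch of the baseline the abstract mentions, Girard's (Hutchinson's) trace estimator combined with a conjugate gradient solver, applied to $\mathrm{Tr}\,(q\mathbf{I} + \mathbf{L})^{-1}$. It is not the Wilson-algorithm estimator the paper proposes, and the sample count, graph size, and density are illustrative choices.

```python
import numpy as np
from scipy.sparse import diags, random as sprandom
from scipy.sparse.linalg import cg, LinearOperator

def hutchinson_inverse_trace(L, q, n_samples=100, rng=None):
    """Estimate Tr((q*I + L)^{-1}) with Rademacher probe vectors:
    E[z^T M^{-1} z] = Tr(M^{-1}) whenever E[z z^T] = I."""
    rng = np.random.default_rng(rng)
    n = L.shape[0]
    M = LinearOperator((n, n), matvec=lambda v: q * v + L @ v)
    total = 0.0
    for _ in range(n_samples):
        z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe
        x, info = cg(M, z)                    # solve (qI + L) x = z
        total += z @ x
    return total / n_samples

# Example: L = Laplacian of a random sparse weighted graph,
# which is symmetric and diagonally dominant (CG applies).
n = 500
W = sprandom(n, n, density=0.01, random_state=0)
W = W + W.T
L = diags(np.asarray(W.sum(axis=1)).ravel()) - W
print(hutchinson_inverse_trace(L, q=1.0, n_samples=50, rng=0))
```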
Diagonal Preconditioning: Theory and Algorithms
Diagonal preconditioning has been a staple technique in optimization and machine learning. It often reduces the condition number of the design or Hessian matrix it is applied to, thereby speeding up convergence. However, rigorous analyses of how well various diagonal preconditioning procedures improve the condition number of the preconditioned matrix, and of how that translates into improvements in optimization, are rare. In this paper, we first provide an analysis of a popular diagonal preconditioning technique based on column standard deviation and its effect on the condition number, using random matrix theory. We then identify a class of design matrices whose condition numbers can be reduced significantly by this procedure. Next, we study the problem of optimal diagonal preconditioning to improve the condition number of any full-rank matrix, and provide a bisection algorithm and a potential reduction algorithm with guaranteed iteration complexity, where each iteration consists of an SDP feasibility problem and a Newton update using the Nesterov-Todd direction, respectively. Finally, we extend the optimal diagonal preconditioning algorithm to an adaptive setting and compare its empirical performance at reducing the condition number and speeding up convergence for regression and classification problems with that of another adaptive preconditioning technique, namely batch normalization, which is essential in training machine learning models.
Comment: Under review; previous version was the wrong draft
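As a minimal illustration of the column-standard-deviation preconditioning the abstract analyzes (not the optimal SDP-based algorithm), the sketch below scales each column of a design matrix by its inverse standard deviation and compares condition numbers; the matrix dimensions and column-scale spread are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Design matrix with columns on wildly different scales,
# the regime where diagonal preconditioning helps most.
n, p = 1000, 20
X = rng.standard_normal((n, p)) * np.logspace(0, 4, p)

# Diagonal preconditioner D = diag(1 / column std), applied as X @ D.
d = 1.0 / X.std(axis=0)
X_pre = X * d                     # same as X @ np.diag(d)

print("kappa(X)   =", np.linalg.cond(X))
print("kappa(X D) =", np.linalg.cond(X_pre))
```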