
    Rational Krylov for Stieltjes matrix functions: convergence and pole selection

    Full text link
    Evaluating the action of a matrix function on a vector, that is x = f(M)v, is a ubiquitous task in applications. When M is large, one usually relies on Krylov projection methods. In this paper, we provide effective choices for the poles of the rational Krylov method for approximating x when f(z) is either Cauchy-Stieltjes or Laplace-Stieltjes (or, equivalently, completely monotonic) and M is a positive definite matrix. Relying on the same tools used to analyze the generic situation, we then focus on the case M = I ⊗ A - Bᵀ ⊗ I, with v obtained by vectorizing a low-rank matrix; this finds application, for instance, in solving fractional diffusion equations on two-dimensional tensor grids. We show how to leverage tensorized Krylov subspaces to exploit the Kronecker structure, and we introduce an error analysis for the numerical approximation of x. Pole selection strategies with explicit convergence bounds are given also in this case.
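
    As a point of reference for the projection idea mentioned above, here is a minimal sketch of the plain (polynomial) Krylov approach, i.e. Lanczos projection of f(M)v onto a Krylov subspace, for the Cauchy-Stieltjes example f(z) = z^{-1/2}. It is not the rational Krylov method with optimised poles analysed in the paper, and the function names and test matrix are illustrative only.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

def lanczos_fMv(M, v, k, f):
    """Approximate f(M) @ v by k steps of the (polynomial) Lanczos process:
    f(M) v ~ ||v|| * V_k f(T_k) e_1, with T_k the projected tridiagonal matrix.
    M is assumed symmetric positive definite; f acts on eigenvalues."""
    n = len(v)
    beta0 = np.linalg.norm(v)
    V = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k - 1)
    V[:, 0] = v / beta0
    w = M @ V[:, 0]
    alpha[0] = V[:, 0] @ w
    w -= alpha[0] * V[:, 0]
    for j in range(1, k):
        beta[j - 1] = np.linalg.norm(w)
        V[:, j] = w / beta[j - 1]
        w = M @ V[:, j] - beta[j - 1] * V[:, j - 1]
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]
    theta, S = eigh_tridiagonal(alpha, beta)      # eigen-decompose the small T_k
    e1 = np.zeros(k); e1[0] = 1.0
    y = S @ (f(theta) * (S.T @ e1))               # y = f(T_k) e_1
    return beta0 * (V @ y)

# example: f(z) = z^{-1/2} (a Cauchy-Stieltjes function) on a random SPD matrix
rng = np.random.default_rng(0)
C = rng.standard_normal((200, 200))
M = C @ C.T + 200.0 * np.eye(200)
v = rng.standard_normal(200)
lam, Q = np.linalg.eigh(M)
exact = Q @ (lam ** -0.5 * (Q.T @ v))
approx = lanczos_fMv(M, v, 30, lambda z: z ** -0.5)
print(np.linalg.norm(exact - approx) / np.linalg.norm(exact))
```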

    A Galerkin Method for Large-scale Autonomous Differential Riccati Equations based on the Loewner Partial Order

    Get PDF

    Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

    Full text link
    Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations, which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. (Comment: 232 pages)
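
    To make the TT format concrete, the following sketch implements a basic TT-SVD: unfold the tensor mode by mode, truncate each SVD at a fixed rank, and store the resulting third-order cores. It covers only the plain decomposition (none of the network contractions or optimisation schemes surveyed in the monograph), and the function names and test tensor are illustrative.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a dense tensor into tensor-train (TT) cores via sequential SVDs.
    Returns cores G_k of shape (r_{k-1}, n_k, r_k) with r_0 = r_d = 1."""
    dims = tensor.shape
    d = len(dims)
    cores = []
    r_prev = 1
    C = tensor.reshape(r_prev * dims[0], -1)
    for k in range(d - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        C = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract TT cores back into a dense tensor (for checking the error)."""
    full = cores[0]
    for G in cores[1:]:
        full = np.tensordot(full, G, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))

# low-rank test tensor: a separable function discretised on a 4D grid
x = np.linspace(0.0, 1.0, 20)
T = np.exp(-(x[:, None, None, None] + x[None, :, None, None]
             + x[None, None, :, None] + x[None, None, None, :]))
cores = tt_svd(T, max_rank=3)
print([G.shape for G in cores])
print(np.linalg.norm(tt_to_full(cores) - T) / np.linalg.norm(T))
```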

    An inverse-free ADI algorithm for computing Lagrangian invariant subspaces

    Get PDF
    The numerical computation of Lagrangian invariant subspaces of large-scale Hamiltonian matrices is discussed in the context of the solution of Lyapunov equations. A new version of the low-rank alternating direction implicit (ADI) method is introduced which, in order to avoid numerical difficulties with solutions of very large norm, uses an inverse-free representation of the subspace and avoids inverses of ill-conditioned matrices. It is shown that this prevents large growth of the elements of the solution that may destroy a low-rank approximation of the solution. A partial error analysis is presented, and the behavior of the method is demonstrated via several numerical examples.
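
    For context, here is a minimal sketch of the classical low-rank ADI iteration for a Lyapunov equation AX + XAᵀ + BBᵀ = 0, in its residual-based form with real negative shifts. The paper's contribution is an inverse-free variant that avoids the explicit shifted solves used below; the shifts chosen here are ad hoc and purely illustrative.

```python
import numpy as np

def low_rank_adi(A, B, shifts):
    """Basic low-rank ADI for A X + X A^T + B B^T = 0 with real negative shifts.
    Returns Z such that X ~ Z Z^T. This is the classical formulation with explicit
    shifted solves, not the inverse-free variant discussed in the paper."""
    n = A.shape[0]
    I = np.eye(n)
    W = B.copy()                                  # residual factor: residual = W W^T
    Z_blocks = []
    for p in shifts:
        V = np.linalg.solve(A + p * I, W)         # shifted solve (A + p I) V = W
        W = W - 2.0 * p * V                       # update the residual factor
        Z_blocks.append(np.sqrt(-2.0 * p) * V)    # append new block of the factor
    return np.hstack(Z_blocks)

# small stable test problem: symmetric A with eigenvalues spread over roughly [-100, -1]
rng = np.random.default_rng(1)
n = 300
A = -np.diag(np.linspace(1.0, 100.0, n))
E = 0.01 * rng.standard_normal((n, n))
A = A + (E + E.T) / 2.0
B = rng.standard_normal((n, 2))

Z = low_rank_adi(A, B, shifts=[-1.0, -3.0, -10.0, -30.0, -100.0])
X = Z @ Z.T
res = A @ X + X @ A.T + B @ B.T
print("factor rank:", Z.shape[1],
      " relative residual:", np.linalg.norm(res) / np.linalg.norm(B @ B.T))
```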

    Rational Krylov for Stieltjes matrix functions: convergence and pole selection

    Get PDF
    Evaluating the action of a matrix function on a vector, that is x = f(M)v, is a ubiquitous task in applications. When M is large, one usually relies on Krylov projection methods. In this paper, we provide effective choices for the poles of the rational Krylov method for approximating x when f(z) is either Cauchy–Stieltjes or Laplace–Stieltjes (or, equivalently, completely monotonic) and M is a positive definite matrix. Relying on the same tools used to analyze the generic situation, we then focus on the case M = I ⊗ A - Bᵀ ⊗ I, with v obtained by vectorizing a low-rank matrix; this finds application, for instance, in solving fractional diffusion equations on two-dimensional tensor grids. We show how to leverage tensorized Krylov subspaces to exploit the Kronecker structure, and we introduce an error analysis for the numerical approximation of x. Pole selection strategies with explicit convergence bounds are given also in this case.
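
    The Kronecker-structured case rests on an identity that is easy to check numerically: with column-major vectorisation, M = I ⊗ A - Bᵀ ⊗ I applied to vec(X) equals vec(AX - XB), i.e. M acts as a Sylvester-type operator on the low-rank matrix X, which is what makes tensorized Krylov subspaces applicable. A brief sketch (test matrices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, r = 40, 30, 2
C = rng.standard_normal((n, n)); A = C @ C.T + n * np.eye(n)      # A positive definite
D = rng.standard_normal((m, m)); B = -(D @ D.T) - m * np.eye(m)   # B negative definite
U = rng.standard_normal((n, r)); V = rng.standard_normal((m, r))
X = U @ V.T                        # low-rank matrix whose vectorisation gives v

vec = lambda Y: Y.reshape(-1, order="F")            # column-major vec
M = np.kron(np.eye(m), A) - np.kron(B.T, np.eye(n))

# M vec(X) coincides with vec(A X - X B): the Sylvester-operator structure
print(np.linalg.norm(M @ vec(X) - vec(A @ X - X @ B)))
# with A positive definite and B negative definite, M is positive definite as well
print(np.linalg.eigvalsh((M + M.T) / 2).min() > 0)
```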

    Faster Algorithms for Structured Linear and Kernel Support Vector Machines

    Full text link
    Quadratic programming is a ubiquitous prototype in convex programming. Many combinatorial optimizations on graphs and machine learning problems can be formulated as quadratic programming; for example, Support Vector Machines (SVMs). Linear and kernel SVMs have been among the most popular models in machine learning over the past three decades, prior to the deep learning era. Generally, a quadratic program has an input size of Θ(n²), where n is the number of variables. Assuming the Strong Exponential Time Hypothesis (SETH), it is known that no O(n^{2-o(1)}) algorithm exists (Backurs, Indyk, and Schmidt, NIPS'17). However, problems such as SVMs usually feature much smaller input sizes: one is given n data points, each of dimension d, with d ≪ n. Furthermore, SVMs are quadratic-programming variants with only O(1) linear constraints. This suggests that faster algorithms are feasible, provided the program exhibits certain underlying structures. In this work, we design the first nearly-linear time algorithm for solving quadratic programs whenever the quadratic objective has small treewidth or admits a low-rank factorization, and the number of linear constraints is small. Consequently, we obtain a variety of results for SVMs:
    * For linear SVM, where the quadratic constraint matrix has treewidth τ, we can solve the corresponding program in time Õ(n τ^{(ω+1)/2} log(1/ε));
    * For linear SVM, where the quadratic constraint matrix admits a low-rank factorization of rank k, we can solve the corresponding program in time Õ(n k^{(ω+1)/2} log(1/ε));
    * For Gaussian kernel SVM, where the data dimension d = Θ(log n) and the squared dataset radius is small, we can solve it in time O(n^{1+o(1)} log(1/ε)).
    We also prove that when the squared dataset radius is large, then Ω(n^{2-o(1)}) time is required. (Comment: New results: almost-linear time algorithm for Gaussian kernel SVM and complementary lower bounds. Abstract shortened to meet arXiv requirements.)
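
    A small sketch of the structural point exploited in the low-rank case: when the kernel matrix admits a factorization K = FFᵀ (for instance a linear kernel with d ≪ n), the quadratic form in the SVM dual objective can be evaluated in O(nd) instead of O(n²) time, without ever forming K. This only evaluates the objective on toy data and is not the authors' solver; the names are illustrative.

```python
import numpy as np
import time

def dual_objective_dense(alpha, y, K):
    """SVM dual objective sum(alpha) - 0.5 * (alpha*y)^T K (alpha*y), dense n x n kernel."""
    v = alpha * y
    return alpha.sum() - 0.5 * v @ (K @ v)

def dual_objective_lowrank(alpha, y, F):
    """Same value using the factorization K = F F^T, never forming K: O(n d) work."""
    w = F.T @ (alpha * y)                  # d-dimensional intermediate
    return alpha.sum() - 0.5 * w @ w

rng = np.random.default_rng(3)
n, d = 4000, 20
X = rng.standard_normal((n, d))            # linear kernel: K = X X^T has rank at most d
y = np.sign(rng.standard_normal(n))
alpha = rng.uniform(0.0, 1.0, n)

t0 = time.perf_counter()
K = X @ X.T
dense = dual_objective_dense(alpha, y, K)
t1 = time.perf_counter()
fast = dual_objective_lowrank(alpha, y, X)
t2 = time.perf_counter()
print(dense, fast)                          # agree up to round-off
print(f"dense path: {t1 - t0:.3f}s   low-rank path: {t2 - t1:.6f}s")
```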

    Efficient projection space updates for the approximation of iterative solutions to linear systems with successive right hand sides

    Get PDF
    Accurate initial guesses to the solution can dramatically speed convergence of iterative solvers. In the case of successive right hand sides, it has been shown that accurate initial solutions may be obtained by projecting the newest right hand side vector onto a column space of recent prior solutions. We propose a technique to efficiently update the column space of prior solutions. We find this technique can modestly improve solver performance, though its potential is likely limited by the problem step size and the accuracy of the solver.
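
    A minimal sketch of the projection idea, under the assumption (made here purely for illustration) that the initial guess is taken as x0 = Sy, where y is the least-squares fit of ASy to the new right-hand side b over the stored solution columns S. The paper is concerned with updating such a column space efficiently, which this sketch does not attempt.

```python
import numpy as np

def cg(A, b, x0, tol=1e-10, maxiter=1000):
    """Plain conjugate gradient returning the solution and the iteration count."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for k in range(maxiter):
        if np.sqrt(rs) <= tol * np.linalg.norm(b):
            return x, k
        Ap = A @ p
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, maxiter

def projected_initial_guess(A, b, prior_solutions):
    """x0 = S y, with y the least-squares fit of A S y to b over stored solutions S."""
    S = np.column_stack(prior_solutions)
    y, *_ = np.linalg.lstsq(A @ S, b, rcond=None)
    return S @ y

rng = np.random.default_rng(4)
n = 500
A = rng.standard_normal((n, n)); A = A @ A.T + n * np.eye(n)     # SPD test matrix

base = rng.standard_normal(n)
prior = []
for k in range(6):                                # slowly varying right-hand sides
    b = base + 0.05 * k * rng.standard_normal(n)
    x0 = projected_initial_guess(A, b, prior) if prior else np.zeros(n)
    x, iters = cg(A, b, x0)
    prior.append(x)
    print(f"solve {k}: CG iterations = {iters}")
```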

    Optimization-Based Parametric Model Order Reduction via H₂ ⊗ L₂ First-Order Necessary Conditions

    Get PDF

    Randomised preconditioning for the forcing formulation of weak constraint 4D‐Var

    Get PDF
    There is growing awareness that errors in the model equations cannot be ignored in data assimilation methods such as four-dimensional variational assimilation (4D-Var). If allowed for, more information can be extracted from observations, longer time windows are possible, and the minimisation process is easier, at least in principle. Weak constraint 4D-Var estimates the model error and minimises a series of linear least-squares cost functions, which can be achieved using the conjugate gradient (CG) method; minimising each cost function is called an inner loop. CG needs preconditioning to improve its performance. In previous work, limited memory preconditioners (LMPs) have been constructed using approximations of the eigenvalues and eigenvectors of the Hessian in the previous inner loop. If the Hessian changes significantly in consecutive inner loops, the LMP may be of limited usefulness. To circumvent this, we propose using randomised methods for low-rank eigenvalue decomposition and use these approximations to cheaply construct LMPs using information from the current inner loop. Three randomised methods are compared. Numerical experiments in idealized systems show that the resulting LMPs perform better than the existing LMPs. Using these methods may allow more efficient and robust implementations of incremental weak constraint 4D-Var.
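
    A hedged sketch of the two ingredients described above: a randomised approximate eigendecomposition of a symmetric positive definite Hessian (random range finder plus Rayleigh-Ritz, in the spirit of Halko, Martinsson and Tropp), and a spectral limited memory preconditioner P = I + V(Λ⁻¹ - I)Vᵀ built from the resulting eigenpairs. It is a generic illustration on a toy Hessian, not the specific randomised methods compared in the paper.

```python
import numpy as np

def randomized_eig(H, k, oversample=10, rng=None):
    """Randomised approximation of the k leading eigenpairs of a symmetric matrix H:
    random range finder followed by a small Rayleigh-Ritz problem."""
    rng = rng or np.random.default_rng()
    n = H.shape[0]
    Omega = rng.standard_normal((n, k + oversample))
    Q, _ = np.linalg.qr(H @ Omega)            # orthonormal basis for an approximate range of H
    lam, S = np.linalg.eigh(Q.T @ H @ Q)      # Ritz values/vectors of the projected problem
    return lam[::-1][:k], Q @ S[:, ::-1][:, :k]

def spectral_lmp(lam, V):
    """Spectral LMP  P = I + V (diag(1/lam) - I) V^T, applied matrix-free to a vector."""
    return lambda x: x + V @ ((1.0 / lam - 1.0) * (V.T @ x))

# toy SPD "Hessian": identity plus a rank-15 dominant part
rng = np.random.default_rng(5)
n = 400
G = rng.standard_normal((n, 15))
H = np.eye(n) + G @ G.T

lam, V = randomized_eig(H, k=15, rng=rng)
apply_P = spectral_lmp(lam, V)

# deflating the leading eigenvalues clusters the preconditioned spectrum near 1
P = np.eye(n) + (V * (1.0 / lam - 1.0)) @ V.T
print("cond(H)   =", np.linalg.cond(H))
print("cond(P H) =", np.linalg.cond(P @ H))
print("matrix-free application agrees:", np.allclose(apply_P(H[:, 0]), (P @ H)[:, 0]))
```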