104 research outputs found
A Riemannian Trust Region Method for the Canonical Tensor Rank Approximation Problem
The canonical tensor rank approximation problem (TAP) consists of
approximating a real-valued tensor by one of low canonical rank, which is a
challenging non-linear, non-convex, constrained optimization problem, where the
constraint set forms a non-smooth semi-algebraic set. We introduce a Riemannian
Gauss-Newton method with trust region for solving small-scale, dense TAPs. The
novelty of our approach is threefold. First, we parametrize the constraint set
as the Cartesian product of Segre manifolds, hereby formulating the TAP as a
Riemannian optimization problem, and we argue why this parametrization is among
the theoretically best possible. Second, an original ST-HOSVD-based retraction
operator is proposed. Third, we introduce a hot restart mechanism that
efficiently detects when the optimization process is tending to an
ill-conditioned tensor rank decomposition and which often yields a quick escape
path from such spurious decompositions. Numerical experiments show improvements
of up to three orders of magnitude in terms of the expected time to compute a
successful solution over existing state-of-the-art methods
A literature survey of low-rank tensor approximation techniques
During the last years, low-rank tensor approximation has been established as
a new tool in scientific computing to address large-scale linear and
multilinear algebra problems, which would be intractable by classical
techniques. This survey attempts to give a literature overview of current
developments in this area, with an emphasis on function-related tensors
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone separate texts, or indeed as a conjoint
comprehensive review of the exciting field of low-rank tensor networks and
tensor decompositions.Comment: 232 page
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone separate texts, or indeed as a conjoint
comprehensive review of the exciting field of low-rank tensor networks and
tensor decompositions.Comment: 232 page
Preconditioned low-rank Riemannian optimization for linear systems with tensor product structure
The numerical solution of partial differential equations on high-dimensional
domains gives rise to computationally challenging linear systems. When using
standard discretization techniques, the size of the linear system grows
exponentially with the number of dimensions, making the use of classic
iterative solvers infeasible. During the last few years, low-rank tensor
approaches have been developed that allow to mitigate this curse of
dimensionality by exploiting the underlying structure of the linear operator.
In this work, we focus on tensors represented in the Tucker and tensor train
formats. We propose two preconditioned gradient methods on the corresponding
low-rank tensor manifolds: A Riemannian version of the preconditioned
Richardson method as well as an approximate Newton scheme based on the
Riemannian Hessian. For the latter, considerable attention is given to the
efficient solution of the resulting Newton equation. In numerical experiments,
we compare the efficiency of our Riemannian algorithms with other established
tensor-based approaches such as a truncated preconditioned Richardson method
and the alternating linear scheme. The results show that our approximate
Riemannian Newton scheme is significantly faster in cases when the application
of the linear operator is expensive.Comment: 24 pages, 8 figure
- …