15 research outputs found
Convolutional Analysis Operator Learning: Dependence on Training Data
Convolutional analysis operator learning (CAOL) enables the unsupervised
training of (hierarchical) convolutional sparsifying operators or autoencoders
from large datasets. One can use many training images for CAOL, but a precise
understanding of the impact of doing so has remained an open question. This
paper presents a series of results that lend insight into the impact of dataset
size on the filter update in CAOL. The first result is a general deterministic
bound on errors in the estimated filters, and is followed by a bound on the
expected errors as the number of training samples increases. The second result
provides a high probability analogue. The bounds depend on properties of the
training data, and we investigate their empirical values with real data. Taken
together, these results provide evidence for the potential benefit of using
more training data in CAOL. Comment: 5 pages, 2 figures.
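To make the role of the training data in the filter update concrete, here is a minimal sketch (not the authors' implementation; the hard-threshold rule, the tight-frame constraint D D^T = (1/R) I, and all parameter values are assumptions) of one CAOL-style block-coordinate step on a patch matrix X. The data enter the filter update only through the cross-correlation X Z^T, which is where a larger number of training samples enters.

```python
# Hedged sketch of one CAOL-style block-coordinate step (not the authors' code):
# hard-thresholded sparse codes, then a Procrustes-type filter update under an
# assumed tight-frame constraint D D^T = (1/R) I.
import numpy as np

def caol_step(X, D, alpha):
    """One sparse-code + filter update on the patch matrix X (R x N)."""
    R, K = D.shape
    # Sparse codes: hard-threshold the filter responses (assumed l0 penalty).
    Z = D.T @ X
    Z[Z**2 < 2 * alpha] = 0.0
    # Filter update: min ||D^T X - Z||_F^2 s.t. D D^T = (1/R) I, solved from the
    # SVD (polar factor) of X Z^T -- this is the term that depends on the data.
    U, _, Vt = np.linalg.svd(X @ Z.T, full_matrices=False)
    D_new = (1.0 / np.sqrt(R)) * (U @ Vt)
    return D_new, Z

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 10_000))                      # 8x8 training patches
D = np.linalg.qr(rng.standard_normal((64, 64)))[0] / 8.0   # feasible initial filters
D, Z = caol_step(X, D, alpha=0.1)
```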
The effect of perturbations of linear operators on their polar decomposition
The effect of matrix perturbations on the polar decomposition has been
studied by several authors and various results are known. However, for
operators between infinite-dimensional spaces the problem has not been
considered so far. Here, we prove in particular that the partial isometry in
the polar decomposition of an operator is stable under perturbations, given
that the kernels and ranges of the original and perturbed operators satisfy a certain
condition. In the matrix case, this condition is weaker than the usually
imposed equal-rank condition. It includes the case of semi-Fredholm operators
with agreeing nullities and deficiencies, respectively. In addition, we prove a
similar perturbation result where the ranges or the kernels of the two
operators are assumed to be sufficiently close to each other in the gap metric. Comment: 13 pages.
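In the matrix case the stability statement can be checked numerically. The sketch below (a minimal illustration using SciPy's polar routine, with a randomly chosen full-rank matrix and a rank-preserving perturbation) compares the isometric factors of a matrix and of its perturbation; the observed change is of the order of the perturbation.

```python
# Minimal numerical illustration (matrix case): the isometric factor of the
# polar decomposition A = U P changes by roughly the size of a small,
# rank-preserving perturbation when A is well conditioned.
import numpy as np
from scipy.linalg import polar

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))           # full column rank with probability 1
E = 1e-6 * rng.standard_normal((6, 4))    # small perturbation, rank is preserved

U, P = polar(A)                           # A = U @ P, U has orthonormal columns
U_pert, P_pert = polar(A + E)

print(np.linalg.norm(U_pert - U))         # small, comparable to ||E|| here
```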
Lossless digraph signal processing via polar decomposition
In this paper, we present a signal processing framework for directed graphs.
Unlike in the undirected case, a graph shift operator such as the adjacency matrix
of a directed graph usually does not admit an orthogonal
eigenbasis. This makes it challenging to define the Fourier transform. Our
methodology leverages the polar decomposition to define two distinct
eigendecompositions, each associated with different matrices derived from this
decomposition. We propose to extend the frequency domain and introduce a
Fourier transform that jointly encodes the spectral response of a signal for
the two eigenbases from the polar decomposition. This allows us to define
convolution in the standard way. Our approach has two key features: it is
lossless, since the shift operator can be fully recovered from the factors of its polar
decomposition, and it subsumes traditional graph signal processing when
the graph is undirected. We present numerical results to show how the framework
can be applied.
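A rough numerical sketch of the idea follows (an illustration under our own reading of the construction; the paper's exact definitions of the frequency domain and the transform may differ). The directed shift operator is factored as A = U P; the symmetric factor P and the orthogonal factor U each admit an orthonormal eigenbasis, giving two spectral representations of a signal, while A itself is exactly recoverable from the two factors.

```python
# Hedged sketch: polar decomposition of a directed shift operator and spectral
# responses of a signal with respect to the eigenbases of the two factors.
import numpy as np
from scipy.linalg import polar

rng = np.random.default_rng(2)
A = (rng.random((8, 8)) < 0.3).astype(float)   # adjacency of a random digraph
np.fill_diagonal(A, 0.0)

U, P = polar(A)
assert np.allclose(U @ P, A)                   # lossless: the shift is recovered

lam_P, V_P = np.linalg.eigh(P)                 # P symmetric PSD: orthonormal basis
lam_U, V_U = np.linalg.eig(U)                  # U orthogonal: unit-circle spectrum

x = rng.standard_normal(8)                     # a graph signal
x_hat_P = V_P.T @ x                            # spectral response w.r.t. P
x_hat_U = np.linalg.solve(V_U, x + 0j)         # spectral response w.r.t. U
```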
Generalized Orthogonal Procrustes Problem under Arbitrary Adversaries
The generalized orthogonal Procrustes problem (GOPP) plays a fundamental role
in several scientific disciplines including statistics, imaging science and
computer vision. Despite its tremendous practical importance, it is generally
an NP-hard problem to find the least squares estimator. We study the
semidefinite relaxation (SDR) and an iterative method named generalized power
method (GPM) to find the least squares estimator, and investigate the
performance under a signal-plus-noise model. We show that the SDR recovers the
least squares estimator exactly, and moreover that the generalized power method with
a proper initialization converges linearly to the global minimizer of the SDR,
provided that the signal-to-noise ratio is large. The main technique is to show
that the nonlinear mapping involved in the GPM is essentially a local contraction,
so that the proof follows from the well-known Banach fixed-point theorem. In
addition, we analyze the low-rank factorization algorithm and show that the
corresponding optimization landscape is free of spurious local minimizers under
nearly the same conditions that enable the success of the SDR approach. The
highlight of our work is that the theoretical guarantees are purely algebraic and
do not assume any statistical priors on the additive adversaries, and thus they
apply to various interesting settings. Comment: The first draft was posted in
2021; this version of the manuscript has gone through a significant revision. The
proof has been completely rewritten and shortened to make it more readable.
A new scaling for Newton's iteration for the polar decomposition and its backward stability
This is the published version, also available here: http://dx.doi.org/10.1137/070699895. We propose a scaling scheme for Newton's iteration for calculating the polar decomposition. The scaling factors are generated by a simple scalar iteration in which the initial value depends only on estimates of the extreme singular values of the original matrix, which can, for example, be the Frobenius norms of the matrix and its inverse. In exact arithmetic, for matrices with condition number no greater than 10^16, with this scaling scheme no more than 9 iterations are needed for convergence to the unitary polar factor with a convergence tolerance roughly equal to 10^-16. It is proved that if matrix inverses computed in finite precision arithmetic satisfy a backward-forward error model, then the numerical method is backward stable. It is also proved that Newton's method with Higham's scaling or with Frobenius norm scaling is backward stable.
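For reference, the classical Frobenius-norm scaling mentioned in the abstract is easy to state and implement; the sketch below applies it to a nonsingular real matrix (the paper's new scalar scaling scheme and its convergence and stability analysis are not reproduced here).

```python
# Sketch of Newton's iteration for the polar decomposition of a nonsingular
# square A, with the classical Frobenius-norm scaling.
import numpy as np

def newton_polar(A, tol=1e-12, max_iter=50):
    """Return (U, H) with A = U @ H, U orthogonal, H symmetric PSD."""
    X = A.astype(float).copy()
    for _ in range(max_iter):
        X_inv = np.linalg.inv(X)
        # Frobenius-norm scaling factor for the current iterate.
        zeta = np.sqrt(np.linalg.norm(X_inv, 'fro') / np.linalg.norm(X, 'fro'))
        X_next = 0.5 * (zeta * X + X_inv.T / zeta)
        if np.linalg.norm(X_next - X, 'fro') <= tol * np.linalg.norm(X_next, 'fro'):
            X = X_next
            break
        X = X_next
    U = X
    H = U.T @ A
    return U, 0.5 * (H + H.T)        # symmetrize to clean up rounding

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
U, H = newton_polar(A)
print(np.linalg.norm(U @ H - A), np.linalg.norm(U.T @ U - np.eye(5)))
```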
Stability of polar decompositions
Certain continuity properties of the factors in generalized polar decompositions of real and complex matrices are studied. A complete characterization is given of those generalized polar decompositions that persist under small perturbations in the matrix and in the scalar product. Connections are made with quadratic matrix equations, and with stability properties of certain invariant subspaces.
A Cheeger Inequality for the Graph Connection Laplacian
The O(d) Synchronization problem consists of estimating a set of unknown
orthogonal transformations O_i from noisy measurements of a subset of the
pairwise ratios O_iO_j^{-1}. We formulate and prove a Cheeger-type inequality
that relates a measure of how well it is possible to solve the O(d)
synchronization problem with the spectrum of an operator, the graph Connection
Laplacian. We also show how this inequality provides a worst-case performance
guarantee for a spectral method to solve this problem. Comment: To appear in the SIAM Journal on Matrix Analysis and Applications (SIMAX).
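The spectral method whose worst-case performance the inequality controls can be sketched in a few lines: assemble the graph Connection Laplacian from the measured pairwise ratios, take its d eigenvectors of smallest eigenvalue, and round each d x d block back to O(d). The unit edge weights, the rounding step, and the toy data below are illustrative assumptions rather than the paper's exact setup.

```python
# Hedged sketch of a spectral method for O(d) synchronization via the graph
# Connection Laplacian.
import numpy as np

def sync_spectral(ratios, n, d):
    """ratios: dict {(i, j): d x d measurement of O_i O_j^{-1}} with i < j."""
    L = np.zeros((n * d, n * d))
    deg = np.zeros(n)
    for (i, j), R in ratios.items():
        L[i*d:(i+1)*d, j*d:(j+1)*d] = -R
        L[j*d:(j+1)*d, i*d:(i+1)*d] = -R.T
        deg[i] += 1
        deg[j] += 1
    for i in range(n):
        L[i*d:(i+1)*d, i*d:(i+1)*d] = deg[i] * np.eye(d)
    # d eigenvectors of the Connection Laplacian with the smallest eigenvalues.
    _, vecs = np.linalg.eigh(L)
    V = vecs[:, :d]
    # Round each d x d block to the nearest orthogonal matrix.
    estimates = []
    for i in range(n):
        U, _, Wt = np.linalg.svd(V[i*d:(i+1)*d, :])
        estimates.append(U @ Wt)
    return estimates   # recovers the O_i up to a global orthogonal transform

# Tiny usage: exact measurements on a complete graph with 4 nodes, d = 2.
rng = np.random.default_rng(7)
O_true = [np.linalg.qr(rng.standard_normal((2, 2)))[0] for _ in range(4)]
ratios = {(i, j): O_true[i] @ O_true[j].T for i in range(4) for j in range(i + 1, 4)}
O_hat = sync_spectral(ratios, n=4, d=2)
```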
How and why to solve the operator equation AX - XB = Y
The entities A, B, X, Y in the title are operators, by which we mean either linear transformations on a finite-dimensional vector space (matrices) or bounded (= continuous) linear transformations on a Banach space. (All scalars will be complex numbers.) The definitions and statements below are valid in both the finite-dimensional and the infinite-dimensional cases, unless the contrary is stated.
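A quick numerical companion for the matrix case: AX - XB = Y has a unique solution exactly when the spectra of A and B are disjoint, and it can then be solved by the Bartels-Stewart algorithm. SciPy exposes a solver for the form AX + XB = Q, so we pass -B; the matrices below are illustrative.

```python
# Solving A X - X B = Y for matrices with disjoint spectra (Bartels-Stewart).
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4)) + 3 * np.eye(4)   # shift to separate the spectra
B = rng.standard_normal((3, 3)) - 3 * np.eye(3)
Y = rng.standard_normal((4, 3))

X = solve_sylvester(A, -B, Y)                     # solves A X + X (-B) = Y
print(np.linalg.norm(A @ X - X @ B - Y))          # residual near machine precision
```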
Decentralized Complete Dictionary Learning via ℓ4-Norm Maximization
With the rapid development of information technologies, centralized data
processing is subject to many limitations, such as computational overheads,
communication delays, and data privacy leakage. Decentralized data processing
over networked terminal nodes becomes an important technology in the era of big
data. Dictionary learning is a powerful representation learning method that
exploits the low-dimensional structure of high-dimensional data. By
exploiting this structure, the storage and processing
overhead of data can be effectively reduced. In this paper, we propose a novel
decentralized complete dictionary learning algorithm, which is based on
ℓ4-norm maximization. Comprehensive numerical experiments show that, compared
with existing decentralized dictionary learning algorithms, the proposed
algorithm has significant advantages in terms of per-iteration computational
complexity, communication cost, and convergence rate in many scenarios.
Moreover, a rigorous theoretical analysis shows that the dictionaries learned
by the proposed algorithm can converge to the one learned by a centralized
dictionary learning algorithm at a linear rate with high probability under
certain conditions.
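As a hedged sketch, assuming the objective is ℓ4-norm maximization over the orthogonal group (the reading adopted above; the decentralized, networked computation described in the paper is not reproduced), one centralized ascent step projects the gradient of ||D Y||_4^4 back onto the orthogonal group via its polar factor:

```python
# Hedged sketch of one l4-norm-maximization step for complete dictionary
# learning: maximize ||D Y||_4^4 over orthogonal D by projecting the gradient
# back onto the orthogonal group. Toy data and parameters are illustrative.
import numpy as np

def l4_max_step(D, Y):
    """One ascent step for max ||D Y||_4^4 s.t. D orthogonal (Y: n x p data)."""
    G = (D @ Y) ** 3 @ Y.T            # gradient direction (up to a factor of 4)
    U, _, Vt = np.linalg.svd(G)       # projection onto the orthogonal group
    return U @ Vt

rng = np.random.default_rng(6)
n, p = 16, 2000
D_true = np.linalg.qr(rng.standard_normal((n, n)))[0]
X = rng.standard_normal((n, p)) * (rng.random((n, p)) < 0.1)   # sparse codes
Y = D_true @ X                                                  # observed data

D = np.linalg.qr(rng.standard_normal((n, n)))[0]
for _ in range(30):
    D = l4_max_step(D, Y)
# If the method succeeds, D @ D_true is close to a signed permutation matrix,
# i.e. the dictionary is recovered up to sign and ordering of its atoms.
```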