Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions.
Comment: 232 pages
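To make the compression idea concrete, the following is a minimal NumPy sketch of a TT-SVD-style construction of the tensor train format via successive truncated SVDs. It is our own illustration under a single uniform rank cap (the names tt_decompose, tt_reconstruct and the max_rank parameter are ours), not code from the monograph.

```python
import numpy as np

def tt_decompose(tensor, max_rank):
    """Split a dense tensor into tensor-train (TT) cores by
    successive truncated SVDs (the classical TT-SVD scheme)."""
    dims = tensor.shape
    cores, r_prev = [], 1
    mat = tensor.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(S))  # truncate to the target TT rank
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        mat = (S[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a full tensor (for error checks)."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))
```

For example, tt_decompose(np.random.rand(8, 8, 8, 8), max_rank=4) stores about 320 numbers in place of 8^4 = 4096, and np.linalg.norm(tt_reconstruct(cores) - tensor) gives the reconstruction error; this truncation is precisely the kind of low-rank compression that keeps downstream contractions tractable.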
Structured sparsity via optimal interpolation norms
We study norms that can be used as penalties in machine learning problems. In particular, we consider norms that are defined by an optimal interpolation problem and whose additional structure can be used to encourage specific characteristics, such as sparsity, in the solution to a learning problem. We first study a norm that is defined as an infimum of quadratics parameterized over a convex set. We show that this formulation includes the k-support norm for sparse vector learning, and its Moreau envelope, the box-norm. These extend naturally to spectral regularizers for matrices, and we introduce the spectral k-support norm and spectral box-norm. We study their properties and apply the penalties to low-rank matrix and multitask learning problems. We next introduce two generalizations of the k-support norm. The first is the (k, p)-support norm; in the matrix setting, the additional parameter p allows us to better learn the curvature of the spectrum of the underlying solution. The second arises in multilinear algebra: by considering the ranks of a tensor's matricizations, we obtain a k-support norm that can be applied to learn a low-rank tensor. For each of these norms we provide an optimization method to solve the underlying learning problem, and we present numerical experiments. Finally, we present a general framework for optimal interpolation norms. We focus on a specific formulation that involves an infimal convolution coupled with a linear operator, and which captures several of the penalties discussed in this thesis. We conclude with an algorithm to solve regularization problems with norms of this type, and we provide numerical experiments to illustrate the method.
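The k-support norm has a closed form over the sorted magnitudes of its argument (Argyriou, Foygel and Srebro, 2012): the largest entries contribute quadratically and the tail contributes through an average. The sketch below is our own NumPy illustration of that formula, not code from the thesis; a quick sanity check is that k = 1 recovers the l1 norm and k = d the l2 norm.

```python
import numpy as np

def k_support_norm(w, k):
    """k-support norm via the closed form over sorted magnitudes:
    find the unique r in {0, ..., k-1} splitting the sorted entries
    into a quadratic head and an averaged (l1-like) tail."""
    z = np.sort(np.abs(w))[::-1]  # magnitudes, descending
    assert 1 <= k <= len(z)
    for r in range(k):
        tail = z[k - r - 1:].sum()  # 1-indexed entries k-r, ..., d
        avg = tail / (r + 1)
        head_ok = (k - r - 1 == 0) or (z[k - r - 2] > avg)  # |w|_0 = inf
        if head_ok and avg >= z[k - r - 1]:
            return np.sqrt(np.sum(z[:k - r - 1] ** 2) + tail ** 2 / (r + 1))
    return np.linalg.norm(z)  # numerical safeguard; unreachable in exact arithmetic
```

For instance, k_support_norm([3.0, 1.0], k=1) returns 4.0 (the l1 norm) and k=2 returns sqrt(10) (the l2 norm); between the two extremes the penalty trades sparsity against the grouping behavior of the l2 norm.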
Advanced Multilinear Data Analysis and Sparse Representation Approaches and Their Applications
Multifactor analysis plays an important role in data analysis, since most real-world datasets arise from a combination of numerous factors. These factors are usually not independent but interdependent, so it is a mistake for a method to consider only one aspect of the input data while ignoring the others. Although widely used, Multilinear PCA (MPCA), one of the leading multilinear analysis methods, still suffers from three major drawbacks. Firstly, it is very sensitive to outliers and noise and unable to cope with missing values. Secondly, since MPCA deals with huge multidimensional datasets, it is usually computationally expensive. Finally, it loses the original local geometry structure due to its averaging process. This thesis sheds new light on the tensor decomposition problem via the ideas of fast low-rank approximation by random projection and tensor completion from compressed sensing. We propose a novel approach, Compressed Submanifold Multifactor Analysis (CSMA), to address the three problems above. It deals with missing values and outliers via a novel sparse Higher-order Singular Value Decomposition, named the HOSVD-L1 decomposition, while random projection is used to obtain a fast low-rank approximation of a given multifactor dataset. In addition, our method preserves the geometry of the original data.
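The fast low-rank approximation step can be illustrated with the standard randomized SVD of Halko, Martinsson and Tropp: sketch the range of the (matricized) data with a Gaussian random projection, then run an exact SVD on the much smaller projected matrix. This is a generic sketch of the technique the abstract invokes, not the thesis's CSMA code; the function name and the oversampling default are our own choices.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, rng=None):
    """Rank-`rank` approximation of A via random projection."""
    rng = np.random.default_rng(rng)
    # Gaussian sketch of the column space of A
    Omega = rng.standard_normal((A.shape[1], rank + oversample))
    Q, _ = np.linalg.qr(A @ Omega)  # orthonormal basis for the sketched range
    # Exact SVD of the small matrix Q^T A, lifted back by Q
    U_small, S, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :rank], S[:rank], Vt[:rank]
```

The cost is dominated by two passes over A rather than a full SVD, which is what makes random projection attractive for the huge matricized tensors that MPCA-style methods must factor.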
In the second part of this thesis, we present a novel pattern classification approach named Sparse Class-dependent Feature Analysis (SCFA), which combines the advantages of sparse representation in an overcomplete dictionary with a powerful nonlinear classifier. The classifier is based on the estimation of class-specific optimal filters, obtained by solving an L1-norm optimization problem using the Alternating Direction Method of Multipliers (ADMM). Our method, as well as its Reproducing Kernel Hilbert Space (RKHS) version, is tolerant to noise and other variations in an image. Our proposed methods achieve very high classification accuracies in face recognition on challenging face databases: the CMU Pose, Illumination and Expression (PIE) database and the Extended YALE-B database, which exhibit pose and illumination variations, and the AR database, which contains occluded images. They also exhibit robustness in other evaluation modalities, such as object classification on the Caltech101 database. Our methods outperform state-of-the-art methods on all these databases, demonstrating their applicability to general computer vision and pattern recognition problems.
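The thesis's class-specific filter estimation is not reproduced here, but the ADMM machinery it relies on can be shown on the canonical L1-regularized least-squares problem min_x 0.5*||Ax - b||^2 + lam*||x||_1, following the standard Boyd et al. splitting. This is a generic sketch: the name lasso_admm, the penalty lam, and the fixed step rho are our own illustrative choices.

```python
import numpy as np

def lasso_admm(A, b, lam, rho=1.0, n_iter=200):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 with ADMM (x = z splitting)."""
    n = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b
    # Cache the Cholesky factor reused by every x-update
    L = np.linalg.cholesky(AtA + rho * np.eye(n))
    z, u = np.zeros(n), np.zeros(n)
    soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
    for _ in range(n_iter):
        # x-update: ridge-like linear solve with the cached factorization
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        z = soft(x + u, lam / rho)  # z-update: soft-thresholding
        u += x - z                  # dual update on the constraint x = z
    return z
```

Caching the factorization is the usual design choice here: each iteration then costs only two triangular solves plus a thresholding, which keeps repeated per-class filter fits of the kind the abstract describes cheap.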