Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone separate texts, or indeed as a conjoint
comprehensive review of the exciting field of low-rank tensor networks and
tensor decompositions. Comment: 232 pages
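The tensor train (TT) format emphasized above can be illustrated with a minimal sketch of the standard TT-SVD procedure: sequentially matricize the remainder, truncate its SVD, and peel off one core at a time. The function names and the toy tensor below are illustrative, not taken from the monograph:

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a dense tensor into tensor-train (TT) cores via sequential
    truncated SVDs of matricizations (the classic TT-SVD sketch)."""
    dims = tensor.shape
    d = len(dims)
    cores = []
    rank_prev = 1
    rest = tensor.reshape(rank_prev * dims[0], -1)
    for k in range(d - 1):
        U, s, Vt = np.linalg.svd(rest, full_matrices=False)
        r = min(max_rank, len(s))
        # Core k has shape (r_{k-1}, n_k, r_k).
        cores.append(U[:, :r].reshape(rank_prev, dims[k], r))
        rest = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        rank_prev = r
    cores.append(rest.reshape(rank_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a full tensor."""
    result = cores[0]
    for core in cores[1:]:
        # Contract the trailing bond index with the next core's leading one.
        result = np.tensordot(result, core, axes=([-1], [0]))
    return result.reshape([c.shape[1] for c in cores])

# With a sufficiently large max_rank, TT-SVD is exact up to round-off.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 6, 3))
cores = tt_svd(X, max_rank=30)
err = np.linalg.norm(tt_reconstruct(cores) - X) / np.linalg.norm(X)
```

Truncating `max_rank` below the exact TT ranks turns this into the low-rank approximation that underlies the distributed computations described in the abstract.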
Bayesian Robust Tensor Ring Model for Incomplete Multiway Data
Robust tensor completion (RTC) aims to recover a low-rank tensor from its
incomplete observation with outlier corruption. The recently proposed tensor
ring (TR) model has demonstrated superiority in solving the RTC problem.
However, existing methods either require a pre-assigned TR rank or
aggressively pursue the minimum TR rank, and therefore often yield biased
solutions in the presence of noise. In this paper, a Bayesian robust tensor
ring decomposition (BRTR) method is proposed to solve the RTC problem more
accurately while avoiding delicate selection of the TR rank and penalty
parameters. A variational Bayesian (VB) algorithm is developed to infer the
posterior distributions of the model parameters. During learning, BRTR
prunes core-tensor slices associated with marginal components, yielding
automatic TR rank determination. Extensive experiments show that BRTR
achieves significantly better performance than state-of-the-art methods.
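The automatic rank detection described above can be caricatured with a toy sketch: in a tensor-ring factorization, a shared bond index whose slices carry negligible energy on both adjacent cores can be pruned from the model. This is only a simplified stand-in for the variational Bayesian updates of BRTR; all names and the toy cores below are hypothetical:

```python
import numpy as np

def prune_tr_bond(core_left, core_right, tol=1e-8):
    """Prune one shared bond between two tensor-ring cores.

    A bond index j is dropped when the j-th lateral slice of the left core
    and the j-th horizontal slice of the right core jointly carry negligible
    energy -- a crude proxy for a "marginal component" in BRTR.
    """
    # Per-bond-index energy, combining both sides of the shared bond.
    energy = (np.linalg.norm(core_left, axis=(0, 1))
              * np.linalg.norm(core_right, axis=(1, 2)))
    keep = energy > tol
    return core_left[:, :, keep], core_right[keep, :, :]

# Two toy cores sharing a bond of size 5, of which two indices are "dead".
rng = np.random.default_rng(1)
G1 = rng.standard_normal((3, 4, 5))
G2 = rng.standard_normal((5, 4, 3))
G1[:, :, 3:] = 0.0          # zero out bond indices 3 and 4 on the left side
G1p, G2p = prune_tr_bond(G1, G2)
```

In the actual BRTR model the pruning decision comes from the inferred posteriors rather than from a hard norm threshold, but the structural effect on the cores is the same: the shared TR rank shrinks automatically.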
Low-rank estimation and embedding learning: theory and applications
In many real-world applications of data mining, datasets can be represented as matrices, where rows correspond to objects (or data instances) and columns to features (or attributes). Often these datasets lie in a high-dimensional feature space. For example, in the vector space model of text data, the feature dimension is the vocabulary size; if a social network is represented by an adjacency matrix, the feature dimension equals the number of objects in the network. Many other datasets fall into this category as well, such as genetic datasets, images, and medical datasets. Even though the feature dimension is enormous, a common observation is that high-dimensional datasets may (approximately) lie in a subspace of much smaller dimensionality, due to dependency or correlation among features. This thesis studies the problem of automatically identifying the low-dimensional space in which high-dimensional datasets (approximately) lie, using two families of dimension reduction models: low-rank estimation models and embedding learning models. For data matrices, low-rank estimation recovers an underlying data matrix subject to the constraint that the matrix is of reduced rank; this analysis is also generalized to high-dimensional higher-order tensor data. Embedding learning models, in contrast, directly project the observed data into a low-dimensional vector space.
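The observation that correlated high-dimensional data (approximately) lie in a low-dimensional subspace can be checked numerically. The sketch below uses synthetic data, not anything from the thesis: 100-dimensional observations are generated from only 3 latent factors, and the subspace dimension is read off the singular value spectrum:

```python
import numpy as np

# 500 observations in ambient dimension 100, driven by 3 latent factors,
# so the data matrix approximately lies in a 3-dimensional subspace.
rng = np.random.default_rng(3)
latent = rng.standard_normal((500, 3))
mixing = rng.standard_normal((3, 100))
X = latent @ mixing + 0.01 * rng.standard_normal((500, 100))

# Cumulative fraction of energy captured by the leading singular values.
s = np.linalg.svd(X, compute_uv=False)
explained = np.cumsum(s**2) / np.sum(s**2)
k = int(np.searchsorted(explained, 0.99)) + 1   # smallest rank with 99% energy
```

Three singular values dominate the spectrum, so `k` recovers the latent dimension; this is the kind of effective low-rankness that both model families in the thesis exploit.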
In the first part, the theoretical analysis of low-rank estimation models is established in the regime of high-dimensional statistics. For matrices, the low-rank structure corresponds to sparsity of the singular values; for tensors, the low-rank model can be defined through the low-rankness of the tensor's unfolding matrices. To obtain low-rank solutions, two categories of regularization are imposed. First, the problem of robust tensor decomposition with gross corruption is considered. To recover the underlying true tensor and corruption of large magnitude, structural assumptions of low-rankness and sparsity are imposed on the tensor and the corruption, respectively, and the Schatten-1 norm is applied as a convex regularizer for the low-rank structure. Second, the problem of matrix estimation is considered with a nonconvex penalty. Compared with convex regularization, a nonconvex penalty leaves large singular values essentially unpenalized, which leads to a faster statistical convergence rate and the oracle property under a mild condition on the magnitude of the singular values. For both problems, efficient optimization algorithms are proposed, and extensive numerical experiments corroborate the efficacy of the proposed algorithms and the theoretical analysis.
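The Schatten-1 (nuclear) norm regularization mentioned above has a well-known closed-form proximal operator: singular value soft-thresholding, the basic step inside most convex low-rank estimation algorithms. A minimal sketch (illustrative, not the thesis's specific algorithm):

```python
import numpy as np

def svt(M, tau):
    """Singular value soft-thresholding: the proximal operator of the
    Schatten-1 (nuclear) norm penalty tau * ||X||_*.

    Every singular value is shrunk toward zero by tau; values below tau
    vanish entirely, which is how the penalty induces low rank.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

# A rank-2 matrix plus small noise: one thresholding step recovers a
# low-rank estimate because the noise singular values fall below tau.
rng = np.random.default_rng(2)
L = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
noisy = L + 0.01 * rng.standard_normal((30, 20))
denoised = svt(noisy, tau=0.5)
rank = np.linalg.matrix_rank(denoised, tol=1e-6)
```

A nonconvex penalty such as SCAD or MCP replaces the uniform shrinkage `s - tau` with a shrinkage rule that leaves large singular values nearly untouched, which is the advantage the thesis analyzes.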
In the second part, embedding learning models for real-world applications are presented. The high-dimensional data are projected into a low-dimensional vector space in a way that preserves the proximity among objects; each object is represented by a low-dimensional vector, called an embedding or distributed representation. In the first application, the heterogeneity of the objects is considered. Based on the observation that several interactions among strongly-typed objects happen simultaneously as an event, the embeddings of the objects in each event are learned jointly; in other words, the model preserves the proximity among all participating objects in each event. Experimental results provide evidence that the learned embeddings are more effective, while being robust to data sparsity and noise, across various classification tasks. In the second application, the task of expert finding is studied: ranking candidates with appropriate expertise for a given query. To capture the subtle semantic information of specific queries with narrow semantic meanings, locally-trained embedding learning guided by a concept hierarchy is proposed for query expansion. The locally-trained embeddings preserve the proximity among terms constrained to a sub-corpus; compared with a global embedding trained on the whole dataset, a locally-trained embedding has stronger representation power. Experimental results show that the proposed embedding learning method achieves high precision on the expert finding task.
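Proximity-preserving embedding can be sketched in its simplest spectral form: factor a symmetric proximity matrix so that inner products of the learned vectors approximate the observed proximities. This generic toy is a stand-in for the event-based and locally-trained models described above, which use different objectives; all names below are hypothetical:

```python
import numpy as np

def spectral_embedding(A, dim):
    """Embed objects from a symmetric proximity (e.g. adjacency) matrix.

    A truncated eigendecomposition assigns each object a low-dimensional
    vector such that inner products of embeddings approximate A -- the
    simplest instance of proximity-preserving embedding learning.
    """
    w, V = np.linalg.eigh(A)
    idx = np.argsort(w)[::-1][:dim]          # keep the top-`dim` eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Two tight clusters of objects: within-cluster proximity 1, none across.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
E = spectral_embedding(A, dim=2)

# Objects in the same cluster land on (nearly) the same embedding vector,
# while objects in different clusters are well separated.
same = np.linalg.norm(E[0] - E[1])
diff = np.linalg.norm(E[0] - E[3])
```

The thesis's models replace this global eigendecomposition with event-level joint learning and with locally-trained embeddings on sub-corpora, but the goal is the same: nearby objects receive nearby vectors.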
To summarize, this thesis provides important results on low-rank estimation and embedding learning models for high-dimensional data analysis and real-world applications.