Learning Tensor Latent Features
We study the problem of learning latent feature models (LFMs) for tensor data
commonly observed in science and engineering such as hyperspectral imagery.
However, the problem is challenging not only because of the non-convex formulation
and the combinatorial nature of the constraints in LFMs, but also because of the
high-order correlations in the data. In this work, we formulate a tensor latent feature
learning problem by representing the data as a mixture of high-order latent
features and binary codes, which are memory efficient and easy to interpret. To
make the learning tractable, we propose a novel optimization procedure, Binary
matching pursuit (BMP), that iteratively searches for binary bases via a
MAXCUT-like boolean quadratic solver. Such a procedure is guaranteed to achieve
an $\epsilon$-suboptimal solution in $O(1/\epsilon)$ greedy steps, resulting in a
trade-off between accuracy and sparsity. When evaluated on both synthetic and
real datasets, our experiments show superior performance over baseline methods.
Comment: 10 pages, 3 figures
Higher-Order Low-Rank Regression
This paper proposes an efficient algorithm (HOLRR) to handle regression tasks
where the outputs have a tensor structure. We formulate the regression problem
as the minimization of a least square criterion under a multilinear rank
constraint, a difficult non-convex problem. HOLRR efficiently computes an
approximate solution of this problem, with solid theoretical guarantees. A
kernel extension is also presented. Experiments on synthetic and real data show
that HOLRR outperforms multivariate and multilinear regression methods and is
considerably faster than existing tensor methods.
Comment: submitted to ICML 201
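The constrained least-squares problem described in this abstract can be written schematically as below. The notation is an assumption made here for illustration and is not taken from the abstract: $X \in \mathbb{R}^{N \times d_0}$ stacks the $N$ inputs, $\mathcal{Y} \in \mathbb{R}^{N \times d_1 \times \cdots \times d_p}$ stacks the tensor-valued outputs, $\mathcal{W}$ is the regression tensor with modes indexed $0, \dots, p$, $\times_0$ contracts mode 0 of $\mathcal{W}$ with the columns of $X$, $W_{(n)}$ is the mode-$n$ unfolding, and $R_0, \dots, R_p$ are target ranks. The paper's exact objective may contain further terms, such as regularization.

```latex
\min_{\mathcal{W} \in \mathbb{R}^{d_0 \times d_1 \times \cdots \times d_p}}
  \; \bigl\lVert \mathcal{W} \times_0 X - \mathcal{Y} \bigr\rVert_F^2
\quad \text{subject to} \quad
\operatorname{rank}\bigl(W_{(n)}\bigr) \le R_n, \qquad n = 0, \dots, p .
```

The rank constraints on every unfolding are what make the feasible set non-convex, which is why the abstract speaks of an approximate solution.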
Tensor Regression Networks with various Low-Rank Tensor Approximations
Tensor regression networks achieve a high compression rate for neural networks
with only a slight impact on performance. They do so by imposing a low tensor
rank structure on the weight matrices of fully connected layers. In recent
years, tensor regression networks have been investigated from the perspective
of their compressive power; however, the regularization effect of enforcing
low-rank tensor structure has not been investigated enough. We study tensor
regression networks using various low-rank tensor approximations, aiming to
compare the compressive and regularization power of different low-rank
constraints. We evaluate the compressive and regularization performances of the
proposed model with both deep and shallow convolutional neural networks. The
outcome of our experiment suggests the superiority of the Global Average Pooling
layer over the Tensor Regression Layer when applied to a deep convolutional neural
network on the CIFAR-10 dataset. In contrast, shallow convolutional neural
networks with a tensor regression layer and dropout achieved lower test error
than both Global Average Pooling and a fully-connected layer with dropout
when trained with a small number of samples.
Learning Depthwise Separable Graph Convolution from Data Manifold
Convolutional neural networks (CNNs) have achieved tremendous success in computer
vision tasks thanks to their outstanding ability to capture local latent features.
Recently, there has been an increasing interest in extending convolution
operations to non-Euclidean geometries. Although various types of convolution
operations have been proposed for graphs or manifolds, their connections with
traditional convolution over grid-structured data are not well understood. In
this paper, we show that depthwise separable convolution can be successfully
generalized for the unification of both graph-based and grid-based convolution
methods. Based on this insight we propose a novel Depthwise Separable Graph
Convolution (DSGC) approach, which is compatible with the traditional convolution
network and subsumes existing convolution methods as special cases. It
combines strengths in model expressiveness, compatibility
(relatively small number of parameters), modularity and computational
efficiency in training. Extensive experiments show the outstanding performance
of DSGC in comparison with strong baselines on multi-domain benchmark datasets.
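To make the "relatively small number of parameters" remark concrete, here is a minimal sketch that counts weights for an ordinary grid convolution and for its depthwise separable counterpart. It illustrates only the standard grid-based case, not DSGC itself; the function names and the example layer sizes are hypothetical, and biases are ignored.

```python
def standard_conv_params(kernel_size, channels_in, channels_out):
    """Weights in a standard kernel_size x kernel_size convolution (no bias)."""
    return kernel_size * kernel_size * channels_in * channels_out


def depthwise_separable_params(kernel_size, channels_in, channels_out):
    """Weights in a depthwise separable convolution (no bias).

    One spatial filter per input channel, followed by a 1 x 1 pointwise
    convolution that mixes channels.
    """
    depthwise = kernel_size * kernel_size * channels_in
    pointwise = channels_in * channels_out
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
    print(standard_conv_params(3, 64, 128))        # 73728
    print(depthwise_separable_params(3, 64, 128))  # 8768
```

For this hypothetical layer the separable form needs roughly one eighth of the weights, because the spatial filter is no longer replicated for every input-output channel pair.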
Tensor Regression Meets Gaussian Processes
Low-rank tensor regression, a new model class that learns high-order
correlation from data, has recently received considerable attention. At the
same time, Gaussian processes (GP) are well-studied machine learning models for
structure learning. In this paper, we demonstrate interesting connections
between the two, especially for multi-way data analysis. We show that low-rank
tensor regression is essentially learning a multi-linear kernel in Gaussian
processes, and the low-rank assumption translates to a constrained Bayesian
inference problem. We prove an oracle inequality and derive the average-case
learning curve for the equivalent GP model. Our findings imply that low-rank
tensor regression, though empirically successful, is highly dependent on the
eigenvalues of covariance functions as well as variable correlations.
Comment: 17 pages
FasTer: Fast Tensor Completion with Nonconvex Regularization
The low-rank tensor completion problem aims to recover a tensor from limited
observations and has many real-world applications. Because it is easy to
optimize, the convex overlapping nuclear norm has been widely used for
tensor completion. However, it over-penalizes top singular values and leads to
biased estimates. In this paper, we propose to use a nonconvex regularizer,
which penalizes large singular values less, instead of the convex one for
tensor completion. However, because the new regularizer is nonconvex and its
terms overlap with each other, existing algorithms are either too slow or
suffer from huge memory cost. To address these issues, we develop an efficient
and scalable algorithm for real-world problems, based on the proximal average
(PA) algorithm. Compared with direct use of the PA algorithm, the
proposed algorithm runs orders of magnitude faster and needs orders of
magnitude less space. We further speed up the proposed algorithm with an
acceleration technique and show that convergence to critical points is still
guaranteed. Experimental comparisons of
the proposed approach are made with various other tensor completion approaches.
Empirical results show that the proposed algorithm is very fast and can produce
much better recovery performance.
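The claim that a nonconvex regularizer penalizes large singular values less than the nuclear norm can be illustrated with a small sketch. The abstract does not say which nonconvex regularizer is used, so the log-sum and capped-l1 penalties below are assumptions chosen as common examples; the function names and the parameter `theta` are hypothetical. The sketch takes the singular values of one unfolding as given and does not implement the overlapped sum over modes or the completion algorithm.

```python
import math


def nuclear_penalty(singular_values):
    """Convex nuclear-norm penalty: every singular value is charged in full."""
    return sum(singular_values)


def log_sum_penalty(singular_values, theta=1.0):
    """Nonconvex log-sum penalty: grows only logarithmically for large values."""
    return sum(math.log(1.0 + s / theta) for s in singular_values)


def capped_l1_penalty(singular_values, theta=1.0):
    """Nonconvex capped-l1 penalty: the charge per singular value stops at theta."""
    return sum(min(s, theta) for s in singular_values)


if __name__ == "__main__":
    # Hypothetical spectrum with one dominant singular value.
    spectrum = [10.0, 1.0, 0.1]
    for penalty in (nuclear_penalty, log_sum_penalty, capped_l1_penalty):
        print(penalty.__name__, penalty(spectrum))
```

With a dominant singular value in the spectrum, most of the nuclear-norm penalty comes from that one value, while both nonconvex penalties flatten out for large inputs and so shrink it less.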
Time-varying Autoregression with Low Rank Tensors
We present a windowed technique to learn parsimonious time-varying
autoregressive models from multivariate timeseries. This unsupervised method
uncovers interpretable spatiotemporal structure in data via non-smooth and
non-convex optimization. In each time window, we assume the data follow a
linear model parameterized by a system matrix, and we model this stack of
potentially different system matrices as a low rank tensor. Because of its
structure, the model is scalable to high-dimensional data and can easily
incorporate priors such as smoothness over time. We find the components of the
tensor using alternating minimization and prove that any stationary point of
this algorithm is a local minimum. We demonstrate on a synthetic example that
our method identifies the true rank of a switching linear system in the
presence of noise. We illustrate our model's utility and superior scalability
over extant methods when applied to several synthetic and real-world examples:
two types of time-varying linear systems, worm behavior, sea surface
temperature, and monkey brain datasets.
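The windowing step described above (one linear system matrix per time window, stacked into a three-way array) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: it fits each window independently by ordinary least squares and stops before the low-rank tensor factorization and alternating minimization. All function names are hypothetical, and windows are taken as non-overlapping.

```python
def mat_inverse(M):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(M)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def fit_window(states):
    """Least-squares A with x[t+1] ~ A x[t] for consecutive state vectors."""
    d = len(states[0])
    past, future = states[:-1], states[1:]
    # C = sum_t x[t+1] x[t]^T and G = sum_t x[t] x[t]^T, so A = C G^{-1}.
    C = [[sum(f[i] * p[j] for p, f in zip(past, future)) for j in range(d)]
         for i in range(d)]
    G = [[sum(p[i] * p[j] for p in past) for j in range(d)] for i in range(d)]
    return matmul(C, mat_inverse(G))


def windowed_system_tensor(states, window):
    """Stack per-window system matrices; result has shape (windows, d, d)."""
    return [fit_window(states[start:start + window + 1])
            for start in range(0, len(states) - window, window)]
```

For a series generated by a system that switches between two matrices, the stacked array would contain two distinct slices repeated over time, which is the kind of redundancy a low-rank tensor model can exploit.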
Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis
Efficient and interpretable spatial analysis is crucial in many fields such as geology, sports, and climate science. Large-scale spatial data often contains complex higher-order correlations across features and locations. While tensor latent factor models can describe higher-order correlations, they are inherently computationally expensive to train. Furthermore, for spatial analysis, these models should be not only predictive but also spatially coherent. However, latent factor models are sensitive to initialization and can yield inexplicable results. We develop a novel Multi-resolution Tensor Learning (MRTL) algorithm for efficiently learning interpretable spatial patterns. MRTL initializes the latent factors from an approximate full-rank tensor model for improved interpretability and progressively learns from a coarse resolution to a fine resolution for an enormous computational speedup. We also prove the theoretical convergence and computational complexity of MRTL. When applied to two real-world datasets, MRTL demonstrates a 4 to 5 times speedup compared to a fixed resolution while yielding accurate and interpretable models.
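One way to read "progressively learns from a coarse resolution to the fine resolution" is that parameters fitted on a coarse spatial grid are used to initialize the next finer grid. The sketch below shows only that hand-off. It is an assumption for illustration: the abstract does not say how weights are transferred between resolutions, so nearest-neighbour copying is a stand-in, and `upsample_nearest` is a hypothetical helper.

```python
def upsample_nearest(grid, factor=2):
    """Refine a 2-D grid of weights by copying each cell factor x factor times."""
    fine = []
    for row in grid:
        # Repeat every value along the horizontal axis.
        fine_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row along the vertical axis (fresh copy each time).
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


if __name__ == "__main__":
    coarse_weights = [[1, 2],
                      [3, 4]]
    for row in upsample_nearest(coarse_weights):
        print(row)
    # [1, 1, 2, 2]
    # [1, 1, 2, 2]
    # [3, 3, 4, 4]
    # [3, 3, 4, 4]
```

Training would then continue on the finer grid from this initialization, so most of the optimization effort is spent at the cheap coarse levels.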
Theoretical and Experimental Analyses of Tensor-Based Regression and Classification
We theoretically and experimentally investigate tensor-based regression and
classification. Our focus is regularization with various tensor norms,
including the overlapped trace norm, the latent trace norm, and the scaled
latent trace norm. We first give dual optimization methods using the
alternating direction method of multipliers, which is computationally efficient
when the number of training samples is moderate. We then theoretically derive
an excess risk bound for each tensor norm and clarify their behavior. Finally,
we perform extensive experiments using simulated and real data and demonstrate
the superiority of tensor-based learning methods over vector- and matrix-based
learning methods.
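For reference, the three tensor norms named in this abstract are commonly defined as follows in the tensor-norm literature. The notation is an assumption made here: $\mathcal{W}$ is a $K$-mode tensor of size $n_1 \times \cdots \times n_K$, $W_{(k)}$ is its mode-$k$ unfolding, and $\lVert \cdot \rVert_{\mathrm{tr}}$ is the matrix trace (nuclear) norm. The paper's own normalization may differ.

```latex
\begin{align*}
\lVert \mathcal{W} \rVert_{\mathrm{overlap}}
  &= \sum_{k=1}^{K} \bigl\lVert W_{(k)} \bigr\rVert_{\mathrm{tr}}, \\
\lVert \mathcal{W} \rVert_{\mathrm{latent}}
  &= \inf_{\mathcal{W}^{(1)} + \cdots + \mathcal{W}^{(K)} = \mathcal{W}}
     \; \sum_{k=1}^{K} \bigl\lVert W^{(k)}_{(k)} \bigr\rVert_{\mathrm{tr}}, \\
\lVert \mathcal{W} \rVert_{\mathrm{scaled}}
  &= \inf_{\mathcal{W}^{(1)} + \cdots + \mathcal{W}^{(K)} = \mathcal{W}}
     \; \sum_{k=1}^{K} \frac{1}{\sqrt{n_k}}
     \bigl\lVert W^{(k)}_{(k)} \bigr\rVert_{\mathrm{tr}} .
\end{align*}
```

The overlapped norm asks every unfolding of the same tensor to be low rank, whereas the latent variants split the tensor into $K$ components and ask each component to be low rank in only one mode.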
Long-Short Term Spatiotemporal Tensor Prediction for Passenger Flow Profile
Spatiotemporal data is very common in many applications, such as
manufacturing systems and transportation systems. It is typically difficult to
predict accurately because of its intrinsic complex spatial and temporal
correlations. Most existing methods, based on various statistical models
and regularization terms, fail to preserve innate features in the data alongside
their complex correlations. In this paper, we focus on tensor-based
prediction and propose several practical techniques to improve prediction. For
long-term prediction specifically, we propose the "Tensor Decomposition +
2-Dimensional Auto-Regressive Moving Average (2D-ARMA)" model and an effective
way to update predictions in real time. For short-term prediction, we propose to
conduct tensor completion based on tensor clustering to avoid oversimplification
and ensure accuracy. A case study based on metro passenger flow data is
conducted to demonstrate the improved performance.