5,616 research outputs found
Learning Efficient Tensor Representations with Ring Structure Networks
Tensor train (TT) decomposition is a powerful representation for high-order
tensors, which has been successfully applied to various machine learning tasks
in recent years. However, since the tensor product is not commutative,
permutation of data dimensions makes solutions and TT-ranks of TT decomposition
inconsistent. To alleviate this problem, we propose a permutation symmetric
network structure by employing circular multilinear products over a sequence of
low-order core tensors. This network structure can be graphically interpreted
as a cyclic interconnection of tensors, and thus we call it tensor ring (TR)
representation. We develop several efficient algorithms to learn TR
representation with adaptive TR-ranks by employing low-rank approximations.
Furthermore, mathematical properties are investigated, which enable us to
perform basic operations in a computationally efficient way by using TR
representations. Experimental results on synthetic signals and real-world
datasets demonstrate that the proposed TR network is more expressive and
consistently informative than existing TT networks.
Comment: arXiv admin note: substantial text overlap with arXiv:1606.0553
Towards Efficient Large-Scale Graph Neural Network Computing
Recent deep learning models have moved beyond low-dimensional regular grids
such as image, video, and speech, to high-dimensional graph-structured data,
such as social networks, brain connections, and knowledge graphs. This
evolution has led to large graph-based irregular and sparse models that go
beyond what existing deep learning frameworks are designed for. Further, these
models are not easily amenable to efficient, at-scale acceleration on parallel
hardware (e.g., GPUs). We introduce NGra, the first parallel processing
framework for graph-based deep neural networks (GNNs). NGra presents a new
SAGA-NN model for expressing deep neural networks as vertex programs with each
layer in well-defined (Scatter, ApplyEdge, Gather, ApplyVertex) graph operation
stages. This model not only allows GNNs to be expressed intuitively, but also
facilitates the mapping to an efficient dataflow representation. NGra addresses
the scalability challenge transparently through automatic graph partitioning
and chunk-based stream processing out of GPU core or over multiple GPUs, which
carefully considers data locality, data movement, and overlapping of parallel
processing and data movement. NGra further achieves efficiency through highly
optimized Scatter/Gather operators on GPUs despite its sparsity. Our evaluation
shows that NGra scales to large real graphs that none of the existing
frameworks can handle directly, while achieving up to about 4 times speedup
even at small scales over the multiple-baseline design on TensorFlow.
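The four stages named in the abstract above can be illustrated with a minimal NumPy sketch. This is not NGra's API: the function name `saga_layer`, the edge-list arguments `src` and `dst`, the linear-plus-ReLU edge and vertex functions, and the sum used in the Gather stage are all assumptions chosen for illustration.

```python
import numpy as np

def saga_layer(src, dst, x, w_edge, w_vertex):
    """One GNN layer written as Scatter -> ApplyEdge -> Gather -> ApplyVertex.

    src, dst : integer arrays of shape (num_edges,), one directed edge per entry
    x        : vertex features of shape (num_vertices, d_in)
    """
    # Scatter: copy each source vertex's features onto its outgoing edges.
    edge_in = x[src]                                  # (num_edges, d_in)
    # ApplyEdge: per-edge transformation (assumed here: linear map plus ReLU).
    edge_out = np.maximum(edge_in @ w_edge, 0.0)      # (num_edges, d_hidden)
    # Gather: sum the edge messages arriving at each destination vertex.
    accum = np.zeros((x.shape[0], edge_out.shape[1]))
    np.add.at(accum, dst, edge_out)                   # unbuffered scatter-add
    # ApplyVertex: per-vertex update from the accumulated messages.
    return np.maximum(accum @ w_vertex, 0.0)

# Toy graph with edges 0->1, 1->2, 2->0, 2->1 and one-hot vertex features.
src = np.array([0, 1, 2, 2])
dst = np.array([1, 2, 0, 1])
x = np.eye(3)
w_edge = np.ones((3, 2))
w_vertex = np.eye(2)
out = saga_layer(src, dst, x, w_edge, w_vertex)
```

In this toy graph vertex 1 has two incoming edges, so its row of `out` is twice that of the other vertices. `np.add.at` is used instead of `accum[dst] += edge_out` because the latter silently drops repeated destination indices.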
Tensor Ring Decomposition
Tensor networks have in recent years emerged as powerful tools for
solving large-scale optimization problems. One of the most popular tensor
networks is the tensor train (TT) decomposition, which acts as a building block for
more complicated tensor networks. However, the TT decomposition highly depends
on permutations of tensor dimensions, due to its strictly sequential
multilinear products over latent cores, which leads to difficulties in finding
the optimal TT representation. In this paper, we introduce a fundamental tensor
decomposition model that represents a large-dimensional tensor by circular
multilinear products over a sequence of low-dimensional cores. The model can be
graphically interpreted as a cyclic interconnection of 3rd-order tensors and
is thus termed tensor ring (TR) decomposition. The key advantage of the TR model is
its circular dimensional permutation invariance, which is gained by employing
the trace operation and treating the latent cores equivalently. The TR model can be
viewed as a linear combination of TT decompositions, thus obtaining
powerful and generalized representation abilities. For optimization of latent
cores, we present four different algorithms based on sequential SVDs, the ALS
scheme, and block-wise ALS techniques. Furthermore, the mathematical properties
of the TR model are investigated, showing that basic multilinear algebra
can be performed efficiently using TR representations and that classical
tensor decompositions can be conveniently transformed into the TR
representation. Finally, experiments on both synthetic signals and
real-world datasets were conducted to evaluate the performance of the different
algorithms.
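The trace-based definition and the circular permutation invariance described in the abstract above can be checked numerically. The sketch below assumes 3rd-order cores stored as arrays of shape (r_k, n_k, r_{k+1}) with the last rank equal to the first; the helper names `tr_to_full` and `tr_element` and the chosen ranks and dimensions are illustrative only.

```python
import numpy as np

def tr_to_full(cores):
    """Contract TR cores of shape (r_k, n_k, r_{k+1}) into the full tensor."""
    full = cores[0]                                    # (r_1, n_1, r_2)
    for core in cores[1:]:
        # Merge neighbouring cores along their shared rank index.
        full = np.tensordot(full, core, axes=([full.ndim - 1], [0]))
    # full now has shape (r_1, n_1, ..., n_d, r_1); the trace closes the ring.
    return np.trace(full, axis1=0, axis2=full.ndim - 1)

def tr_element(cores, index):
    """Single entry: trace of the product of the selected core slices."""
    mat = np.eye(cores[0].shape[0])
    for core, i in zip(cores, index):
        mat = mat @ core[:, i, :]
    return np.trace(mat)

rng = np.random.default_rng(0)
ranks = [2, 3, 4]            # assumed TR-ranks r_1, r_2, r_3 (r_4 wraps to r_1)
dims = [3, 4, 5]             # assumed mode sizes n_1, n_2, n_3
cores = [rng.standard_normal((ranks[k], dims[k], ranks[(k + 1) % 3]))
         for k in range(3)]

full = tr_to_full(cores)                       # shape (3, 4, 5)
shifted = tr_to_full(cores[1:] + cores[:1])    # cores in the order 2, 3, 1
```

Circularly shifting the cores yields the same tensor with its modes circularly shifted, so `shifted` equals `np.transpose(full, (1, 2, 0))` up to floating-point error. This follows from the cyclic property of the trace.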
Higher-dimension Tensor Completion via Low-rank Tensor Ring Decomposition
The problem of incomplete data is common in signal processing and machine
learning. Tensor completion algorithms aim to recover the incomplete data from
its partially observed entries. In this paper, taking advantage of the high
compressibility and flexibility of the recently proposed tensor ring (TR)
decomposition, we propose a new tensor completion approach named tensor ring
weighted optimization (TR-WOPT). It finds the latent factors of the incomplete
tensor by a gradient descent algorithm; the latent factors are then employed to
predict the missing entries of the tensor. We conduct various tensor completion
experiments on synthetic data and real-world data. The simulation results show
that TR-WOPT performs well on various high-dimensional tensors. Furthermore,
image completion results show that our proposed algorithm outperforms
state-of-the-art algorithms in many situations. In particular, when the missing
rate of the test images is high (e.g., over 0.9), the performance of
TR-WOPT is significantly better than that of the compared algorithms.
Comment: APSIPA2018 conference paper. arXiv admin note: substantial text overlap with arXiv:1805.0846
Tensor Ring Decomposition with Rank Minimization on Latent Space: An Efficient Approach for Tensor Completion
In tensor completion tasks, the traditional low-rank tensor decomposition
models suffer from the laborious model selection problem due to their high
model sensitivity. In particular, for tensor ring (TR) decomposition, the
number of model possibilities grows exponentially with the tensor order, which
makes it rather challenging to find the optimal TR decomposition. In this
paper, by exploiting the low-rank structure of the TR latent space, we propose
a novel tensor completion method which is robust to model selection. In
contrast to imposing the low-rank constraint on the data space, we introduce
nuclear norm regularization on the latent TR factors, resulting in the
optimization step using singular value decomposition (SVD) being performed at a
much smaller scale. By leveraging the alternating direction method of
multipliers (ADMM) scheme, the latent TR factors with optimal rank and the
recovered tensor can be obtained simultaneously. Our proposed algorithm is
shown to effectively alleviate the burden of TR-rank selection, thereby greatly
reducing the computational cost. Extensive experimental results on both
synthetic and real-world data demonstrate the superior performance and
efficiency of the proposed approach against state-of-the-art algorithms.
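Nuclear norm regularization inside an ADMM scheme is commonly handled with singular value thresholding, the proximal operator of the nuclear norm. The sketch below shows that standard operator applied to an unfolded TR core. It is not the authors' algorithm: which unfolding of the core is regularized, the helper names `svt` and `unfold_core`, and the threshold value are all assumptions.

```python
import numpy as np

def svt(matrix, tau):
    """Singular value thresholding: prox of tau * (nuclear norm) at `matrix`."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)     # soft-threshold the singular values
    return (u * s_shrunk) @ vt              # scale the columns of u, then recombine

def unfold_core(core):
    """Assumed unfolding of a core (r_k, n_k, r_{k+1}) into (n_k, r_k * r_{k+1})."""
    return np.moveaxis(core, 1, 0).reshape(core.shape[1], -1)

# Small check on a matrix with known singular values 3, 1 and 0.2.
demo = np.diag([3.0, 1.0, 0.2])
shrunk = svt(demo, tau=0.5)

# The SVD acts on an unfolded core, which is small compared with the data tensor.
rng = np.random.default_rng(0)
core = rng.standard_normal((4, 10, 4))      # hypothetical core with TR-rank 4
core_mat = unfold_core(core)                # shape (10, 16)
core_low_rank = svt(core_mat, tau=1.0)
```

Thresholding at 0.5 maps the singular values 3, 1 and 0.2 to 2.5, 0.5 and 0, so the rank drops from 3 to 2. Because the SVD is taken of a 10 x 16 matrix rather than an unfolding of the full data tensor, each step stays cheap, which matches the abstract's point about performing SVD at a much smaller scale.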
Provably Powerful Graph Networks
Recently, the Weisfeiler-Lehman (WL) graph isomorphism test was used to
measure the expressive power of graph neural networks (GNN). It was shown that
the popular message passing GNN cannot distinguish between graphs that are
indistinguishable by the 1-WL test (Morris et al. 2018; Xu et al. 2019).
Unfortunately, many simple instances of graphs are indistinguishable by the
1-WL test.
In search of more expressive graph learning models, we build upon the recent
k-order invariant and equivariant graph neural networks (Maron et al. 2019a,b)
and present two results:
First, we show that such k-order networks can distinguish between
non-isomorphic graphs as well as the k-WL tests, which are provably stronger
than the 1-WL test for k>2. This makes these models strictly stronger than
message passing models. Unfortunately, the higher expressiveness of these
models comes with a computational cost of processing high order tensors.
Second, setting our goal at building a provably stronger, simple and scalable
model, we show that a reduced 2-order network containing just a scaled identity
operator, augmented with a single quadratic operation (matrix multiplication),
has a provable 3-WL expressive power. Put differently, we suggest a simple
model that interleaves applications of a standard Multilayer-Perceptron (MLP)
applied to the feature dimension and matrix multiplication. We validate this
model by presenting state-of-the-art results on popular graph classification
and regression tasks. To the best of our knowledge, this is the first practical
invariant/equivariant model with guaranteed 3-WL expressiveness, strictly
stronger than message passing models.
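The interleaving of a feature-wise MLP with matrix multiplication described in the abstract above can be sketched in NumPy on an n x n x d tensor of pairwise features. This is a simplified illustration rather than the authors' exact block: the function names, the single-layer ReLU MLPs, the concatenation-based skip connection, and all sizes are assumptions.

```python
import numpy as np

def mlp(x, weights):
    """Apply the same small MLP to the last (feature) axis of every entry."""
    for w in weights:
        x = np.maximum(x @ w, 0.0)
    return x

def matmul_block(x, w_a, w_b, w_out):
    """Map pairwise features x of shape (n, n, d_in) to shape (n, n, d_out)."""
    a = mlp(x, w_a)                                  # (n, n, d_hid)
    b = mlp(x, w_b)                                  # (n, n, d_hid)
    # Matrix multiplication of the two n x n matrices in each feature channel.
    prod = np.einsum("ikc,kjc->ijc", a, b)           # (n, n, d_hid)
    # Assumed skip connection: concatenate the input with the product.
    return mlp(np.concatenate([x, prod], axis=-1), w_out)

rng = np.random.default_rng(0)
n, d_in, d_hid, d_out = 5, 3, 4, 2                   # hypothetical sizes
x = rng.standard_normal((n, n, d_in))
w_a = [rng.standard_normal((d_in, d_hid))]
w_b = [rng.standard_normal((d_in, d_hid))]
w_out = [rng.standard_normal((d_in + d_hid, d_out))]
y = matmul_block(x, w_a, w_b, w_out)

# Equivariance check: relabelling the nodes relabels the output the same way.
perm = rng.permutation(n)
y_perm = matmul_block(x[perm][:, perm], w_a, w_b, w_out)
```

Permuting rows and columns of the input by the same node permutation permutes the output identically, so `y_perm` matches `y[perm][:, perm]` up to floating-point error. Both the feature-wise MLP and the channel-wise matrix product commute with node relabelling, which is what makes the block equivariant.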
Compressing Recurrent Neural Networks with Tensor Ring for Action Recognition
Recurrent Neural Networks (RNNs) and their variants, such as Long Short-Term
Memory (LSTM) networks, and Gated Recurrent Unit (GRU) networks, have achieved
promising performance in sequential data modeling. The hidden layers in RNNs
can be regarded as the memory units, which are helpful in storing information
in sequential contexts. However, when dealing with high dimensional input data,
such as video and text, the input-to-hidden linear transformation in RNNs
brings high memory usage and huge computational cost. This makes the training
of RNNs unscalable and difficult. To address this challenge, we propose a novel
compact LSTM model, named TR-LSTM, which uses the low-rank tensor ring
decomposition (TRD) to reformulate the input-to-hidden transformation. Compared
with other tensor decomposition methods, TR-LSTM is more stable. In addition,
TR-LSTM can be trained end-to-end and also provides a fundamental
building block for RNNs in handling large input data. Experiments on real-world
action recognition datasets have demonstrated the promising performance of the
proposed TR-LSTM compared with the tensor train LSTM and other state-of-the-art
competitors.
Comment: 9 page
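The memory saving from reformulating an input-to-hidden map with tensor ring cores can be illustrated by counting parameters. The sketch below assumes one core of shape (rank, mode, rank) per input and output mode and a uniform TR-rank; the mode factorizations and the rank are hypothetical values chosen for illustration, not figures reported by the authors, and only a single input-to-hidden matrix is counted.

```python
from math import prod

def dense_params(in_modes, out_modes):
    """Parameters of a dense weight matrix of size prod(in) x prod(out)."""
    return prod(in_modes) * prod(out_modes)

def tr_params(in_modes, out_modes, rank):
    """Parameters of TR cores, one (rank, mode, rank) core per mode."""
    return sum(rank * m * rank for m in list(in_modes) + list(out_modes))

in_modes = (8, 20, 20, 18)   # hypothetical factorization of a 57,600-dim input
out_modes = (4, 4, 4, 4)     # hypothetical factorization of a 256-dim hidden state
rank = 5                     # hypothetical uniform TR-rank

n_dense = dense_params(in_modes, out_modes)
n_tr = tr_params(in_modes, out_modes, rank)
compression = n_dense / n_tr
print(f"dense: {n_dense:,}  tensor ring: {n_tr:,}  ratio: {compression:,.0f}x")
```

The dense count grows with the product of the mode sizes, while the TR count grows with their sum times the squared rank, which is why the gap widens as the input dimension increases.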
From probabilistic graphical models to generalized tensor networks for supervised learning
Tensor networks have found a wide use in a variety of applications in physics
and computer science, recently leading to both theoretical insights and
practical algorithms in machine learning. In this work we explore the
connection between tensor networks and probabilistic graphical models, and show
that it motivates the definition of generalized tensor networks where
information from a tensor can be copied and reused in other parts of the
network. We discuss the relationship between generalized tensor network
architectures used in quantum physics, such as string-bond states, and
architectures commonly used in machine learning. We provide an algorithm to
train these networks in a supervised-learning context and show that they
overcome the limitations of regular tensor networks in higher dimensions, while
keeping the computation efficient. A method to combine neural networks and
tensor networks as part of a common deep learning architecture is also
introduced. We benchmark our algorithm for several generalized tensor network
architectures on the task of classifying images and sounds, and show that they
outperform previously introduced tensor-network algorithms. The models we
consider also have a natural implementation on a quantum computer and may guide
the development of near-term quantum machine learning architectures.
Comment: 15 pages, 18 figures, improved version with additional explanation
Tensor Decompositions for Modeling Inverse Dynamics
Modeling inverse dynamics is crucial for accurate feedforward robot control.
The model computes the joint torques necessary to perform a desired movement.
The highly non-linear inverse function of the dynamical system can be
approximated using regression techniques. As a regression method, we propose a
tensor decomposition model that exploits the inherent three-way interaction of
positions × velocities × accelerations. Most work in tensor factorization has
addressed the decomposition of dense tensors. In this paper, we build upon the
decomposition of sparse tensors, which have only a small number of nonzero entries.
The decomposition of sparse tensors has successfully been used in relational
learning, e.g., the modeling of large knowledge graphs. Recently, the approach
has been extended to multi-class classification with discrete input variables.
Representing the data in high dimensional sparse tensors enables the
approximation of complex highly non-linear functions. In this paper we show how
the decomposition of sparse tensors can be applied to regression problems.
Furthermore, we extend the method to continuous inputs, by learning a mapping
from the continuous inputs to the latent representations of the tensor
decomposition, using basis functions. We evaluate our proposed model on a
dataset with trajectories from a seven-degrees-of-freedom SARCOS robot arm. Our
experimental results show superior performance of the proposed functional
tensor model compared to challenging state-of-the-art methods.
Multi-Branch Tensor Network Structure for Tensor-Train Discriminant Analysis
Higher-order data with high dimensionality arise in a diverse set of
application areas such as computer vision, video analytics and medical imaging.
Tensors provide a natural tool for representing these types of data. Although
there has been a lot of work in the area of tensor decomposition and low-rank
tensor approximation, extensions to supervised learning, feature extraction and
classification are still limited. Moreover, most of the existing supervised
tensor learning approaches are based on the orthogonal Tucker model. However,
this model has some limitations for large tensors including high memory and
computational costs. In this paper, we introduce a supervised learning approach
for tensor classification based on the tensor-train model. In particular, we
introduce a multi-branch tensor network structure for efficient implementation
of tensor-train discriminant analysis (TTDA). The proposed approach takes
advantage of the flexibility of the tensor train structure to implement various
computationally efficient versions of TTDA. This approach is then evaluated on
image and video classification tasks with respect to computation time, storage
cost and classification accuracy and is compared to both vector- and tensor-based
discriminant analysis methods.