1,195 research outputs found
Exponential Machines
Modeling interactions between features improves the performance of machine
learning solutions in many domains (e.g. recommender systems or sentiment
analysis). In this paper, we introduce Exponential Machines (ExM), a predictor
that models all interactions of every order. The key idea is to represent an
exponentially large tensor of parameters in a factorized format called Tensor
Train (TT). The Tensor Train format regularizes the model and provides control
over the number of underlying parameters. To train the model, we develop a
stochastic Riemannian optimization procedure, which allows us to fit tensors
with 2^160 entries. We show that the model achieves state-of-the-art
performance on synthetic data with high-order interactions and that it works on
par with high-order factorization machines on the MovieLens 100K recommender
system dataset.
Comment: ICLR-2017 workshop track paper
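The key mechanism, contracting a TT-factorized weight tensor with per-feature vectors [1, x_k], can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; the sizes `d`, `r` and the random cores are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 4  # number of features and TT-rank (illustrative values)

# TT cores G_k of shape (r_{k-1}, 2, r_k); boundary ranks are 1. Together
# they encode a 2^d-entry weight tensor with one coefficient per feature
# subset, i.e. every interaction of every order.
cores = [rng.standard_normal((1 if k == 0 else r, 2,
                              r if k < d - 1 else 1)) * 0.1
         for k in range(d)]

def predict(x):
    """Score = TT weight tensor contracted with the vectors [1, x_k].

    Mixing the two slices of mode k with [1, x_k] sums over all
    interaction terms at cost O(d * r^2) instead of O(2^d).
    """
    v = np.ones(1)
    for k, G in enumerate(cores):
        v = v @ (G[:, 0, :] + x[k] * G[:, 1, :])
    return float(v[0])

x = rng.standard_normal(d)
print(predict(x))
```

For d = 160 the same contraction implicitly touches a tensor with 2^160 entries while storing only O(d r^2) numbers, which is what makes fitting such tensors feasible at all.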
On computing high-dimensional Riemann theta functions
Riemann theta functions play a crucial role in the field of nonlinear Fourier analysis, where they are used to realize inverse nonlinear Fourier transforms for periodic signals. The practical applicability of this approach has however been limited since Riemann theta functions are multi-dimensional Fourier series whose computation suffers from the curse of dimensionality. In this paper, we investigate several new approaches to compute Riemann theta functions with the goal of unlocking their practical potential. Our first contributions are novel theoretical lower and upper bounds on the series truncation error. These bounds allow us to rule out several of the existing approaches for the high-dimension regime. We then propose to consider low-rank tensor and hyperbolic cross based techniques. We first examine a tensor-train based algorithm which utilizes the popular scaling and squaring approach. We show theoretically that this approach cannot break the curse of dimensionality. Finally, we investigate two other tensor-train based methods numerically and compare them to hyperbolic cross based methods. Using finite-genus solutions of the Korteweg–de Vries (KdV) and nonlinear Schrödinger (NLS) equations, we demonstrate the accuracy of the proposed algorithms. The tensor-train based algorithms are shown to work well for low genus solutions with real arguments but are limited by memory for higher genera. The hyperbolic cross based algorithm also achieves high accuracy for low genus solutions. Its novelty is the ability to feasibly compute moderately accurate solutions (a relative error of magnitude 0.01) for high dimensions (up to 60). It therefore enables the computation of complex inverse nonlinear Fourier transforms that were so far out of reach.
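As a point of reference for the curse of dimensionality at issue here, the straightforward truncated evaluation of a genus-g Riemann theta function can be sketched as follows. This is a minimal NumPy illustration using the standard series definition, not one of the paper's proposed algorithms; the truncation radius `N` is an assumed parameter.

```python
import itertools
import numpy as np

def theta_naive(z, Omega, N=8):
    """Truncated Riemann theta function
        theta(z | Omega) = sum_n exp(pi*i n^T Omega n + 2*pi*i n^T z),
    summed over the integer cube max_k |n_k| <= N, with Im(Omega)
    positive definite for convergence. The cost (2N + 1)^g grows
    exponentially in the genus g -- the curse of dimensionality.
    """
    g = len(z)
    total = 0.0 + 0.0j
    for n in itertools.product(range(-N, N + 1), repeat=g):
        n = np.asarray(n)
        total += np.exp(1j * np.pi * (n @ Omega @ n) + 2j * np.pi * (n @ z))
    return total

# Example: genus-2 theta with a purely imaginary diagonal period matrix.
val = theta_naive(np.zeros(2), 1j * np.eye(2))
print(val)
```

For a diagonal period matrix the series factorizes into a product of one-dimensional thetas, which gives a cheap sanity check on any implementation.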
A literature survey of low-rank tensor approximation techniques
In recent years, low-rank tensor approximation has been established as
a new tool in scientific computing to address large-scale linear and
multilinear algebra problems, which would be intractable by classical
techniques. This survey attempts to give a literature overview of current
developments in this area, with an emphasis on function-related tensors.
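One workhorse among the techniques such a survey covers is the TT-SVD algorithm, which compresses a full tensor into tensor-train cores by a sweep of truncated SVDs. A minimal NumPy sketch follows; the truncation tolerance `eps` is an assumed knob, and the blocked or randomized variants the literature also discusses are not attempted here.

```python
import numpy as np

def tt_svd(T, eps=1e-10):
    """Decompose a full tensor T into TT cores via sequential truncated
    SVDs. Singular values below eps times the largest one are discarded,
    which controls the TT-ranks and hence the memory footprint.
    """
    dims = T.shape
    cores, r = [], 1
    M = T.reshape(r * dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        rank = max(1, int(np.sum(s > eps * s[0])))
        cores.append(U[:, :rank].reshape(r, dims[k], rank))
        M = (s[:rank, None] * Vt[:rank]).reshape(rank * dims[k + 1], -1)
        r = rank
    cores.append(M.reshape(r, dims[-1], 1))
    return cores
```

For a tensor that is exactly low-rank the sweep recovers it with TT-ranks equal to the true ranks; otherwise `eps` trades accuracy for compression.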
Nearest-Neighbor Interaction Systems in the Tensor-Train Format
Low-rank tensor approximation approaches have become an important tool in the
scientific computing community. The aim is to enable the simulation and
analysis of high-dimensional problems which cannot be solved using conventional
methods due to the so-called curse of dimensionality. This requires
techniques to handle linear operators defined on extremely large state spaces
and to solve the resulting systems of linear equations or eigenvalue problems.
In this paper, we present a systematic tensor-train decomposition for
nearest-neighbor interaction systems which is applicable to a host of different
problems. With the aid of this decomposition, it is possible to reduce the
memory consumption as well as the computational costs significantly.
Furthermore, it can be shown that in some cases the rank of the tensor
decomposition does not depend on the network size. The format is thus feasible
even for high-dimensional systems. We will illustrate the results with several
guiding examples such as the Ising model, a system of coupled oscillators, and
a CO oxidation model.
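The size-independence of the rank can be made concrete for the Ising example: the Hamiltonian H = -J Σ Z_k Z_{k+1} - h Σ Z_k admits a TT-operator (MPO) representation of internal rank 3 for any chain length. Below is a minimal NumPy sketch of the standard rank-3 MPO construction, not necessarily the paper's general decomposition; `J` and `h` are illustrative parameters.

```python
import numpy as np

Z = np.diag([1.0, -1.0])  # Pauli-Z
I = np.eye(2)

def ising_mpo(d, J=1.0, h=0.5):
    """TT-operator cores for H = -J * sum Z_k Z_{k+1} - h * sum Z_k.
    The internal rank is 3 regardless of the chain length d, which is
    the size-independence property mentioned above.
    """
    W = np.zeros((3, 2, 2, 3))
    W[0, :, :, 0] = I
    W[0, :, :, 1] = Z
    W[0, :, :, 2] = -h * Z
    W[1, :, :, 2] = -J * Z
    W[2, :, :, 2] = I
    first, last = W[0:1], W[:, :, :, 2:3]  # boundary cores, rank 1
    return [first] + [W] * (d - 2) + [last]

def mpo_to_matrix(cores):
    """Contract the MPO into the dense 2^d x 2^d matrix (tiny d only)."""
    M = cores[0]
    for G in cores[1:]:
        M = np.einsum('aijb,bklc->aikjlc', M, G)
        M = M.reshape(1, M.shape[1] * M.shape[2],
                      M.shape[3] * M.shape[4], M.shape[-1])
    return M[0, :, :, 0]

H3 = mpo_to_matrix(ising_mpo(3))  # 8 x 8 Hamiltonian for 3 spins
```

Storing the cores costs O(d) numbers with a constant rank, whereas the dense matrix costs 4^d; the dense contraction above exists only to check the construction on tiny chains.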
Long-term Forecasting using Tensor-Train RNNs
We present Tensor-Train RNN (TT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics. Long-term forecasting in such systems is highly challenging, since there exist long-term temporal dependencies, higher-order correlations and sensitivity to error propagation. Our proposed tensor recurrent architecture addresses these issues by learning the nonlinear dynamics directly using higher order moments and high-order state transition functions. Furthermore, we decompose the higher-order structure using the tensor-train (TT) decomposition to reduce the number of parameters while preserving the model performance. We theoretically establish the approximation properties of Tensor-Train RNNs for general sequence inputs, guarantees that are not available for standard RNNs. We also demonstrate significant long-term prediction improvements over general RNN and LSTM architectures on a range of simulated environments with nonlinear dynamics, as well as on real-world climate and traffic data.
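The central idea, a higher-order state transition whose weight tensor is stored in TT form, can be sketched as follows. This is a minimal reading of the idea with illustrative sizes, order P = 3 and untrained random cores, not the authors' published architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
H, D, r, P = 6, 3, 4, 3  # hidden size, input size, TT-rank, order
S = 1 + H + D            # augmented state: bias term, hidden, input

# TT cores of the order-P transition tensor W (shape H x S x ... x S):
# one output mode followed by P state modes (illustrative ranks).
cores = ([rng.standard_normal((1, H, r)) * 0.3] +
         [rng.standard_normal((r, S, r)) * 0.3 for _ in range(P - 1)] +
         [rng.standard_normal((r, S, 1)) * 0.3])

def step(h, x):
    """One TT-RNN step: h' = tanh(W contracted P times with s = [1; h; x]).

    Contracting in TT form costs O(P * S * r^2) per output instead of
    materializing the O(S^P) higher-order transition tensor.
    """
    s = np.concatenate(([1.0], h, x))
    M = cores[0][0]                           # (H, r): output core
    for G in cores[1:]:
        M = M @ np.einsum('asb,s->ab', G, s)  # absorb one state mode
    return np.tanh(M[:, 0])

h = np.zeros(H)
x = rng.standard_normal(D)
h = step(h, x)
print(h)
```

Including the constant 1 in the augmented state means the contraction contains all interaction orders up to P, not only the pure degree-P terms.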
Quantized Fourier and Polynomial Features for more Expressive Tensor Network Models
In the context of kernel machines, polynomial and Fourier features are
commonly used to provide a nonlinear extension to linear models by mapping the
data to a higher-dimensional space. Unless one considers the dual formulation
of the learning problem, which renders exact large-scale learning infeasible,
the exponential growth of the number of model parameters with the
dimensionality of the data, caused by the tensor-product structure of these
features, prohibits the treatment of high-dimensional problems. One of the
possible approaches to circumvent this exponential scaling
is to exploit the tensor structure present in the features by constraining the
model weights to be an underparametrized tensor network. In this paper we
quantize, i.e. further tensorize, polynomial and Fourier features. Based on
this feature quantization we propose to quantize the associated model weights,
yielding quantized models. We show that, for the same number of model
parameters, the resulting quantized models admit a higher bound on the
VC-dimension than their non-quantized counterparts, at no additional
computational cost while learning from identical features. We verify
experimentally how this additional tensorization regularizes the learning
problem by prioritizing the most salient features in the data and how it
provides models with increased generalization capabilities. Finally, we
benchmark our approach on a large-scale regression task, achieving
state-of-the-art results on a laptop computer.
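The quantization of polynomial features can be made concrete: because every exponent k < 2^q has a binary expansion k = Σ_j b_j 2^j, the length-2^q monomial feature vector factorizes into q Kronecker factors of length 2. A minimal sketch of this identity follows; the paper additionally quantizes Fourier features and the tensor-network model weights, which is not shown here.

```python
import numpy as np

def poly_features(x, q):
    """Monomial features (1, x, x^2, ..., x^{2^q - 1}) as one vector."""
    return np.array([x ** k for k in range(2 ** q)])

def quantized_poly_features(x, q):
    """The same vector as a Kronecker product of q length-2 factors
    (1, x^{2^j}): since k = sum_j b_j 2^j in binary, x^k factorizes as
    the product of x^{2^j} over the set bits of k. Storing q factors of
    size 2 instead of one length-2^q vector is the quantization step.
    """
    out = np.array([1.0])
    for j in reversed(range(q)):
        out = np.kron(out, np.array([1.0, x ** (2 ** j)]))
    return out
```

Constraining the weights that contract against these q small factors to a tensor network is what turns the factorization into actual parameter savings.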