
    Exponential Machines

    Modeling interactions between features improves the performance of machine learning solutions in many domains (e.g. recommender systems or sentiment analysis). In this paper, we introduce Exponential Machines (ExM), a predictor that models all interactions of every order. The key idea is to represent an exponentially large tensor of parameters in a factorized format called Tensor Train (TT). The Tensor Train format regularizes the model and makes it possible to control the number of underlying parameters. To train the model, we develop a stochastic Riemannian optimization procedure, which allows us to fit tensors with 2^160 entries. We show that the model achieves state-of-the-art performance on synthetic data with high-order interactions and that it performs on par with high-order factorization machines on the MovieLens 100K recommender system dataset. Comment: ICLR-2017 workshop track paper.
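
    To make the key idea concrete, the following minimal sketch (plain NumPy, not the authors' implementation) stores a weight tensor over all 2^d feature interactions in Tensor Train form and evaluates a prediction by sweeping once through the d cores; the rank r, the random initialization, and the helper name predict are illustrative assumptions, not taken from the paper.

        import numpy as np

        # A weight tensor over all 2^d feature interactions, stored as d TT cores.
        # Evaluation costs O(d * r^2) per sample; the same sweep works for d = 160
        # (2^160 virtual parameters), which is the regime quoted in the abstract.
        rng = np.random.default_rng(0)
        d, r = 30, 4

        # cores[k] has shape (r_prev, 2, r_next); mode index 0 = "feature absent", 1 = "feature present"
        cores = [rng.normal(scale=0.1, size=(1 if k == 0 else r, 2, 1 if k == d - 1 else r))
                 for k in range(d)]

        def predict(x, cores):
            """Contract the TT weight tensor with the rank-one feature tensor (1, x_1) ⊗ ... ⊗ (1, x_d)."""
            v = np.ones((1, 1))
            for x_k, G in zip(x, cores):
                v = v @ (G[:, 0, :] + x_k * G[:, 1, :])   # pick up 1 or x_k at every mode
            return float(v[0, 0])

        x = rng.normal(size=d)
        print(predict(x, cores))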

    On computing high-dimensional Riemann theta functions

    Riemann theta functions play a crucial role in the field of nonlinear Fourier analysis, where they are used to realize inverse nonlinear Fourier transforms for periodic signals. The practical applicability of this approach has, however, been limited, since Riemann theta functions are multi-dimensional Fourier series whose computation suffers from the curse of dimensionality. In this paper, we investigate several new approaches to compute Riemann theta functions with the goal of unlocking their practical potential. Our first contributions are novel theoretical lower and upper bounds on the series truncation error. These bounds allow us to rule out several of the existing approaches for the high-dimension regime. We then propose to consider low-rank tensor and hyperbolic cross based techniques. We first examine a tensor-train based algorithm which utilizes the popular scaling and squaring approach. We show theoretically that this approach cannot break the curse of dimensionality. Finally, we investigate two other tensor-train based methods numerically and compare them to hyperbolic cross based methods. Using finite-genus solutions of the Korteweg–de Vries (KdV) and nonlinear Schrödinger (NLS) equations, we demonstrate the accuracy of the proposed algorithms. The tensor-train based algorithms are shown to work well for low-genus solutions with real arguments but are limited by memory for higher genera. The hyperbolic cross based algorithm also achieves high accuracy for low-genus solutions. Its novelty is the ability to feasibly compute moderately accurate solutions (a relative error of magnitude 0.01) for high dimensions (up to 60). It therefore enables the computation of complex inverse nonlinear Fourier transforms that have so far been out of reach.
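
    For reference, the sketch below (an assumption on our part, not code from the paper) evaluates the Riemann theta function by the naive truncated sum, using the common convention theta(z, Omega) = sum over n in Z^g of exp(2*pi*i*(n^T Omega n / 2 + n^T z)); the (2K+1)^g terms of the truncation are exactly the curse of dimensionality that the tensor-train and hyperbolic cross methods aim to avoid. The truncation radius K and the genus-2 example matrix are illustrative.

        import numpy as np
        from itertools import product

        def theta_naive(z, Omega, K=10):
            # brute-force truncated Fourier series over n in [-K, K]^g: (2K+1)^g terms
            g = len(z)
            total = 0.0 + 0.0j
            for n in product(range(-K, K + 1), repeat=g):
                n = np.asarray(n, dtype=float)
                total += np.exp(2j * np.pi * (0.5 * n @ Omega @ n + n @ z))
            return total

        # genus-2 example; Omega must be symmetric with positive-definite imaginary part
        Omega = np.array([[1.0j, 0.5], [0.5, 2.0j]])
        z = np.array([0.1, 0.2])
        print(theta_naive(z, Omega, K=8))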

    A literature survey of low-rank tensor approximation techniques

    In recent years, low-rank tensor approximation has become established as a new tool in scientific computing for addressing large-scale linear and multilinear algebra problems that would be intractable with classical techniques. This survey attempts to give a literature overview of current developments in this area, with an emphasis on function-related tensors.

    Nearest-Neighbor Interaction Systems in the Tensor-Train Format

    Low-rank tensor approximation approaches have become an important tool in the scientific computing community. The aim is to enable the simulation and analysis of high-dimensional problems which can no longer be solved with conventional methods due to the so-called curse of dimensionality. This requires techniques to handle linear operators defined on extremely large state spaces and to solve the resulting systems of linear equations or eigenvalue problems. In this paper, we present a systematic tensor-train decomposition for nearest-neighbor interaction systems which is applicable to a host of different problems. With the aid of this decomposition, it is possible to reduce the memory consumption as well as the computational costs significantly. Furthermore, it can be shown that in some cases the rank of the tensor decomposition does not depend on the network size, so the format remains feasible even for high-dimensional systems. We illustrate the results with several guiding examples such as the Ising model, a system of coupled oscillators, and a CO oxidation model.
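
    As an illustration of why the rank can stay independent of the network size, the sketch below (our own minimal example, not the paper's code) builds the tensor-train (MPO) cores of an Ising-type nearest-neighbor coupling J * sum_i sz_i sz_{i+1}; the bond dimension is 3 for any chain length, and the helper names mpo_core, mpo_to_matrix, and dense_ising are illustrative.

        import numpy as np

        I2 = np.eye(2)
        sz = np.diag([1.0, -1.0])
        J = 1.0

        def mpo_core():
            # core W[a, b, s, t]: a, b are the TT (bond) indices of size 3, s, t the physical ones
            W = np.zeros((3, 3, 2, 2))
            W[0, 0] = I2          # no interaction opened yet: carry the identity
            W[0, 1] = sz          # open an interaction: sz acts on the current site
            W[1, 2] = J * sz      # close the interaction on the next site with weight J
            W[2, 2] = I2          # all terms completed: carry the identity
            return W

        def mpo_to_matrix(N):
            # contract the chain into the dense 2^N x 2^N operator (only for tiny N,
            # purely to verify the construction; the point of the TT format is to avoid this)
            left = np.array([1.0, 0.0, 0.0])
            right = np.array([0.0, 0.0, 1.0])
            W = mpo_core()
            T = np.einsum('a,abst->bst', left, W)           # (bond, row, col)
            for _ in range(N - 1):
                T = np.einsum('ast,abuv->bsutv', T, W).reshape(3, T.shape[1] * 2, T.shape[2] * 2)
            return np.einsum('ast,a->st', T, right)

        def dense_ising(N):
            # explicit sum J * sum_i sz_i sz_{i+1} for comparison
            H = np.zeros((2 ** N, 2 ** N))
            for i in range(N - 1):
                ops = [I2] * N
                ops[i], ops[i + 1] = sz, J * sz
                term = ops[0]
                for op in ops[1:]:
                    term = np.kron(term, op)
                H += term
            return H

        N = 4
        print(np.allclose(mpo_to_matrix(N), dense_ising(N)))   # True: bond dimension 3 for any N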

    Long-term Forecasting using Tensor-Train RNNs

    We present Tensor-Train RNN (TT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics. Long-term forecasting in such systems is highly challenging, since there exist long-term temporal dependencies, higher-order correlations, and sensitivity to error propagation. Our proposed tensor recurrent architecture addresses these issues by learning the nonlinear dynamics directly, using higher-order moments and high-order state transition functions. Furthermore, we decompose the higher-order structure using the tensor-train (TT) decomposition to reduce the number of parameters while preserving the model performance. We theoretically establish the approximation properties of Tensor-Train RNNs for general sequence inputs; such guarantees are not available for standard RNNs. We also demonstrate significant long-term prediction improvements over general RNN and LSTM architectures on a range of simulated environments with nonlinear dynamics, as well as on real-world climate and traffic data.
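
    The sketch below is a hedged illustration of the general idea rather than the paper's exact architecture: a P-th order recurrent transition whose degree-P weight tensor over the augmented state [1; x_t; h_t] is stored in tensor-train form, so the parameter count grows linearly in P instead of exponentially. The sizes H, D, P, the rank r, and the function name step are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        H, D, P, r = 8, 3, 3, 4            # hidden size, input size, interaction order, TT rank
        n = 1 + D + H                      # size of the augmented state s_t = [1; x_t; h_t]

        # P - 1 "input" cores plus a final core that also carries the output mode of size H
        cores = [rng.normal(scale=0.3, size=(1 if p == 0 else r, n, r)) for p in range(P - 1)]
        last = rng.normal(scale=0.3, size=(r, n, H))
        bias = np.zeros(H)

        def step(h, x):
            s = np.concatenate(([1.0], x, h))             # augmented state
            v = np.ones((1, 1))
            for G in cores:                               # contract one copy of s into each core
                v = v @ np.einsum('i,aib->ab', s, G)
            pre = np.einsum('a,i,aih->h', v[0], s, last)  # last core also produces the output
            return np.tanh(pre + bias)

        h = np.zeros(H)
        for t in range(5):                                # unroll on a toy input sequence
            h = step(h, rng.normal(size=D))
        print(h)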

    Quantized Fourier and Polynomial Features for more Expressive Tensor Network Models

    In the context of kernel machines, polynomial and Fourier features are commonly used to provide a nonlinear extension to linear models by mapping the data to a higher-dimensional space. Unless one considers the dual formulation of the learning problem, which renders exact large-scale learning infeasible, the tensor-product structure of these features makes the number of model parameters grow exponentially with the data dimensionality, which prohibits tackling high-dimensional problems. One possible approach to circumvent this exponential scaling is to exploit the tensor structure present in the features by constraining the model weights to be an underparametrized tensor network. In this paper we quantize, i.e. further tensorize, polynomial and Fourier features. Based on this feature quantization we propose to quantize the associated model weights, yielding quantized models. We show that, for the same number of model parameters, the resulting quantized models have a higher bound on the VC-dimension than their non-quantized counterparts, at no additional computational cost while learning from identical features. We verify experimentally how this additional tensorization regularizes the learning problem by prioritizing the most salient features in the data and how it provides models with increased generalization capabilities. We finally benchmark our approach on a large-scale regression task, achieving state-of-the-art results on a laptop computer.
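
    To make the quantization step tangible, the short sketch below (our illustration, not the paper's code) shows the fact it relies on for polynomial features: the length-2^q monomial feature map (1, x, x^2, ..., x^(2^q - 1)) factorizes exactly into q Kronecker factors of size 2, so the associated model weights can be tensorized over these q extra modes as well; the helper names monomials and quantized_factors are assumptions.

        import numpy as np

        def monomials(x, q):
            # the full polynomial feature map of length 2^q
            return np.array([x ** k for k in range(2 ** q)])

        def quantized_factors(x, q):
            # factor j carries (1, x^(2^j)); their Kronecker product reproduces every monomial
            return [np.array([1.0, x ** (2 ** j)]) for j in range(q)]

        x, q = 0.7, 3
        full = monomials(x, q)
        factors = quantized_factors(x, q)

        kron = factors[-1]                 # combine factors from the slowest-varying one down
        for f in reversed(factors[:-1]):
            kron = np.kron(kron, f)

        print(np.allclose(full, kron))     # True: the quantized factors reproduce all monomials in order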