64 research outputs found

    Dynamic Tensor Product Regression

    In this work, we initiate the study of \emph{Dynamic Tensor Product Regression}. One has matrices $A_1\in \mathbb{R}^{n_1\times d_1},\ldots,A_q\in \mathbb{R}^{n_q\times d_q}$ and a label vector $b\in \mathbb{R}^{n_1\ldots n_q}$, and the goal is to solve the regression problem with the design matrix $A$ being the tensor product of the matrices $A_1, A_2, \dots, A_q$, i.e. $\min_{x\in \mathbb{R}^{d_1\ldots d_q}}~\|(A_1\otimes \ldots\otimes A_q)x-b\|_2$. At each time step, one matrix $A_i$ receives a sparse change, and the goal is to maintain a sketch of the tensor product $A_1\otimes\ldots \otimes A_q$ so that the regression solution can be updated quickly. Recomputing the solution from scratch for each round is very slow, and so it is important to develop algorithms which can quickly update the solution with the new design matrix. Our main result is a dynamic tree data structure where any update to a single matrix can be propagated quickly throughout the tree. We show that our data structure can be used to solve dynamic versions of not only Tensor Product Regression, but also Tensor Product Spline regression (which is a generalization of ridge regression), and for maintaining Low Rank Approximations for the tensor product. Comment: NeurIPS 202
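    To make the problem concrete, here is a minimal NumPy sketch of the static $q=2$ case under assumed toy dimensions; it is not the paper's dynamic data structure. It solves the least-squares problem once, both by explicitly forming $A_1\otimes A_2$ and by exploiting the Kronecker pseudoinverse identity $(A_1\otimes A_2)^+ = A_1^+\otimes A_2^+$ to avoid materializing the design matrix; the paper's contribution is avoiding even this recomputation when a single $A_i$ receives a sparse update.

```python
import numpy as np

# Minimal sketch (assumed setup, q = 2 factors): solve
#   min_x || (A1 (x) A2) x - b ||_2
# The naive baseline recomputes the solution from scratch with an explicit
# Kronecker product; the paper's point is to avoid exactly this by
# maintaining a sketch of A1 (x) ... (x) Aq under sparse updates.
rng = np.random.default_rng(0)
n1, d1, n2, d2 = 30, 4, 20, 3
A1 = rng.standard_normal((n1, d1))
A2 = rng.standard_normal((n2, d2))
b = rng.standard_normal(n1 * n2)

# Naive solve: forms the (n1*n2) x (d1*d2) design matrix explicitly.
A = np.kron(A1, A2)
x_naive, *_ = np.linalg.lstsq(A, b, rcond=None)

# Structured solve without forming A: the pseudoinverse factorizes,
#   (A1 (x) A2)^+ = A1^+ (x) A2^+,
# so x = vec( A2^+ . mat(b) . (A1^+)^T ) with column-major vec/mat.
B = b.reshape(n2, n1, order="F")                    # mat(b), shape n2 x n1
X = np.linalg.pinv(A2) @ B @ np.linalg.pinv(A1).T   # shape d2 x d1
x_struct = X.reshape(-1, order="F")

assert np.allclose(x_naive, x_struct)
```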

    Quantized Fourier and Polynomial Features for more Expressive Tensor Network Models

    In the context of kernel machines, polynomial and Fourier features are commonly used to provide a nonlinear extension to linear models by mapping the data to a higher-dimensional space. Unless one considers the dual formulation of the learning problem, which renders exact large-scale learning infeasible, the exponential growth of the number of model parameters with the dimensionality of the data, caused by the tensor-product structure of these features, makes high-dimensional problems intractable. One possible approach to circumvent this exponential scaling is to exploit the tensor structure present in the features by constraining the model weights to be an underparametrized tensor network. In this paper we quantize, i.e. further tensorize, polynomial and Fourier features. Based on this feature quantization we propose to quantize the associated model weights as well, yielding quantized models. We show that, for the same number of model parameters, the resulting quantized models have a higher bound on the VC-dimension than their non-quantized counterparts, at no additional computational cost while learning from identical features. We verify experimentally how this additional tensorization regularizes the learning problem by prioritizing the most salient features in the data and how it provides models with increased generalization capabilities. We finally benchmark our approach on a large regression task, achieving state-of-the-art results on a laptop computer.
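    As a concrete illustration of what quantizing a feature map can mean, the NumPy sketch below (illustrative only; the function names and the pure-power polynomial features are assumptions, not the paper's construction) factorizes a length-$2^k$ monomial feature vector $(1, t, t^2, \ldots, t^{2^k-1})$ into $k$ length-2 factors whose Kronecker product reproduces it exactly, so the same features can be carried by many tiny tensor factors instead of one long vector.

```python
import numpy as np

# Minimal sketch (assumption: pure-power polynomial features of length
# m = 2^k per input dimension; names are illustrative). The monomial
# feature vector phi(t) = (1, t, t^2, ..., t^{m-1}) factorizes exactly as
#   phi(t) = (1, t^{2^{k-1}}) (x) ... (x) (1, t^2) (x) (1, t),
# i.e. the "further tensorization" the abstract refers to.

def poly_features(t: float, m: int) -> np.ndarray:
    """Dense monomial features (1, t, ..., t^{m-1})."""
    return t ** np.arange(m)

def quantized_poly_factors(t: float, k: int) -> list[np.ndarray]:
    """k length-2 factors whose Kronecker product equals poly_features(t, 2**k)."""
    return [np.array([1.0, t ** (2 ** j)]) for j in range(k)]

t, k = 0.7, 3                                   # m = 2**k = 8 features
dense = poly_features(t, 2 ** k)

quantized = np.array([1.0])
for factor in reversed(quantized_poly_factors(t, k)):   # highest power first
    quantized = np.kron(quantized, factor)

assert np.allclose(dense, quantized)
```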

    Streaming Semidefinite Programs: $O(\sqrt{n})$ Passes, Small Space and Fast Runtime

    We study the problem of solving semidefinite programs (SDP) in the streaming model. Specifically, $m$ constraint matrices and a target matrix $C$, all of size $n\times n$, together with a vector $b\in \mathbb{R}^m$, are streamed to us one by one. The goal is to find a matrix $X\in \mathbb{R}^{n\times n}$ such that $\langle C, X\rangle$ is maximized, subject to $\langle A_i, X\rangle=b_i$ for all $i\in [m]$ and $X\succeq 0$. Previous algorithmic studies of SDP primarily focus on \emph{time-efficiency}, and all of them require a prohibitively large $\Omega(mn^2)$ space in order to store \emph{all the constraints}. Such space consumption is necessary for fast algorithms, as it is the size of the input. In this work, we design an interior point method (IPM) that uses $\widetilde O(m^2+n^2)$ space, which is strictly sublinear in the regime $n\gg m$. Our algorithm takes $O(\sqrt n\log(1/\epsilon))$ passes, which is standard for IPM. Moreover, when $m$ is much smaller than $n$, our algorithm also matches the time complexity of the state-of-the-art SDP solvers. To achieve such a sublinear space bound, we design a novel sketching method that enables one to compute a spectral approximation to the Hessian matrix in $O(m^2)$ space. To the best of our knowledge, this is the first method that successfully applies sketching techniques to improve SDP algorithms in terms of space (and also time).
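    The following NumPy toy (an illustrative sketch only, not the paper's IPM or its sketching construction) conveys the small-space idea: each streamed constraint matrix is compressed to an $s$-dimensional sketch on arrival and then discarded, and an $m\times m$ matrix, here the Gram matrix $G_{ij}=\langle A_i, A_j\rangle$ standing in for the IPM Hessian, is assembled from the sketches alone.

```python
import numpy as np

# Illustrative toy (not the paper's algorithm): one pass over the streamed
# constraint matrices A_1, ..., A_m, keeping only an s-dimensional sketch of
# each vec(A_i) instead of the full n x n matrix. From the m sketches we form
# an m x m approximation of the Gram matrix G_ij = <A_i, A_j>, a stand-in
# here for the Hessian that the paper spectrally approximates.
rng = np.random.default_rng(1)
n, m, s = 60, 10, 800               # s is the sketch size

# Sketching map applied to each constraint as it arrives. A dense Gaussian
# map is materialized here only for simplicity; a genuinely small-space
# implementation would use a sparse or pseudorandom sketch, and the paper
# designs one with provable spectral guarantees.
S = rng.standard_normal((s, n * n)) / np.sqrt(s)

stream = [rng.standard_normal((n, n)) for _ in range(m)]   # toy constraint stream
sketches = np.empty((m, s))

for i, A_i in enumerate(stream):    # single pass; A_i can be discarded afterwards
    sketches[i] = S @ A_i.ravel()

G_approx = sketches @ sketches.T    # m x m, built from the sketches only

# Exact Gram matrix for comparison (not computable in the streaming setting
# without storing all constraints).
G_exact = np.array([[np.sum(Ai * Aj) for Aj in stream] for Ai in stream])

rel_err = np.linalg.norm(G_approx - G_exact) / np.linalg.norm(G_exact)
print(f"relative Frobenius error of sketched Gram matrix: {rel_err:.3f}")
```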
    • …