Convolutional Dictionary Learning through Tensor Factorization
Tensor methods have emerged as a powerful paradigm for consistent learning of
many latent variable models such as topic models, independent component
analysis and dictionary learning. Model parameters are estimated via CP
decomposition of the observed higher order input moments. However, in many
domains, additional invariances such as shift invariances exist, enforced via
models such as convolutional dictionary learning. In this paper, we develop
novel tensor decomposition algorithms for parameter estimation of convolutional
models. Our algorithm is based on the popular alternating least squares method,
but with efficient projections onto the space of stacked circulant matrices.
Our method is embarrassingly parallel and consists of simple operations such as
fast Fourier transforms and matrix multiplications. Our algorithm converges to
the dictionary much faster and more accurately compared to the alternating
minimization over filters and activation maps
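The key primitive the abstract mentions is a projection onto the space of circulant matrices. A minimal sketch of that idea for a single matrix is below: the Frobenius-nearest circulant matrix is obtained by averaging each wrapped diagonal (the paper's stacked, FFT-based variant generalizes this; function names here are illustrative, not the authors').

```python
import numpy as np

def project_to_circulant(A):
    """Frobenius-norm projection of a square matrix onto the set of
    circulant matrices: average each wrapped diagonal of A."""
    n = A.shape[0]
    # index matrix: a circulant C with first column c has C[i, j] = c[(i - j) % n]
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    # c[k] = mean of all entries of A lying on the k-th wrapped diagonal
    c = np.array([A[idx == k].mean() for k in range(n)])
    return c[idx]
```

Projecting a matrix that is already circulant returns it unchanged, and applying the projection twice gives the same result as applying it once, as expected of an orthogonal projection.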
Recycling Randomness with Structure for Sublinear time Kernel Expansions
We propose a scheme for recycling Gaussian random vectors into structured
matrices to approximate various kernel functions in sublinear time via random
embeddings. Our framework includes the Fastfood construction as a special case,
but also extends to Circulant, Toeplitz and Hankel matrices, and the broader
family of structured matrices that are characterized by the concept of
low-displacement rank. We introduce notions of coherence and graph-theoretic
structural constants that control the approximation quality, and prove
unbiasedness and low-variance properties of random feature maps that arise
within our framework. For the case of low-displacement matrices, we show how
the degree of structure and randomness can be controlled to reduce statistical
variance at the cost of increased computation and storage requirements.
Empirical results strongly support our theory and justify the use of a broader
family of structured matrices for scaling up kernel methods using random
features.
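To make the "recycling" idea concrete, here is a minimal illustrative sketch (not the paper's exact construction): a single Gaussian vector defines a circulant projection applied in O(n log n) via the FFT, and the resulting random Fourier features approximate a Gaussian kernel with O(n) storage instead of an n x n dense matrix. All names and the Rademacher sign-flip step are assumptions for illustration.

```python
import numpy as np

def circulant_multiply(g, x):
    """Compute C(g) @ x, where C(g) is the circulant matrix with first
    column g, as a circular convolution via the FFT: O(n log n)."""
    return np.real(np.fft.ifft(np.fft.fft(g) * np.fft.fft(x)))

def make_circulant_rff(n, sigma=1.0, seed=0):
    """Feature map phi with phi(x) @ phi(y) ~= exp(-||x-y||^2 / (2 sigma^2)),
    using one recycled Gaussian vector instead of an n x n Gaussian matrix."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal(n)            # single Gaussian vector, recycled across rows
    s = rng.choice([-1.0, 1.0], size=n)   # Rademacher sign flips to decorrelate rows
    b = rng.uniform(0.0, 2.0 * np.pi, n)  # random phases for the Fourier features
    def phi(x):
        w = circulant_multiply(g, s * x) / sigma
        return np.sqrt(2.0 / n) * np.cos(w + b)
    return phi
```

Each row of the implicit projection matrix is a shifted copy of the same Gaussian vector, which is exactly the structure/randomness trade-off the abstract describes: less independent randomness and storage, at the cost of correlated feature coordinates.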
- …