Restricted Isometry Property under High Correlations
Matrices satisfying the Restricted Isometry Property (RIP) play an important
role in the areas of compressed sensing and statistical learning. RIP matrices
with optimal parameters are mainly obtained via probabilistic arguments, as
explicit constructions seem hard. It is therefore interesting to ask whether a
fixed matrix can be incorporated into a construction of restricted isometries.
In this paper, we construct a new broad ensemble of random matrices with
dependent entries that satisfy the restricted isometry property. Our
construction starts with a fixed (deterministic) matrix $B$ satisfying a
simple stable rank condition, and we show that the matrix $BA$, where $A$ is a
random matrix drawn from various popular probabilistic models (including
subgaussian, sparse, low-randomness, and matrices satisfying the convex
concentration property),
satisfies the RIP with high probability. These theorems have various
applications in signal recovery, random matrix theory, dimensionality
reduction, etc. Additionally, motivated by an application for understanding the
effectiveness of word vector embeddings popular in natural language processing
and machine learning applications, we investigate the RIP of the matrix
$BA^{(\ell)}$, where $A^{(\ell)}$ is formed by taking all possible (disregarding
order) $\ell$-way entrywise products of the columns of a random matrix $A$.

Comment: 30 pages, fixed minor typo
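To make the construction concrete, here is a small numerical sketch. It is an illustration under simplifying assumptions, not the paper's method verbatim: the dimensions, the Gaussian model for $A$, and the Monte Carlo check over sampled sparse vectors (the actual RIP quantifies over all sparse vectors) are all choices made for the demo. It draws a fixed $B$, reports its stable rank, forms the normalized product $BA$, probes the near-isometry on random sparse unit vectors, and builds the $\ell$-way entrywise column products.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

m, n, N = 64, 128, 512   # B: m x n (fixed), A: n x N (random); sizes illustrative
s = 5                    # sparsity level for the empirical check

# A fixed matrix B: one draw, then held fixed. Any B with large stable rank
# ||B||_F^2 / ||B||_2^2 would play the same role in this demo.
B = rng.standard_normal((m, n))
stable_rank = np.linalg.norm(B, "fro") ** 2 / np.linalg.norm(B, 2) ** 2
print(f"stable rank of B: {stable_rank:.1f}")

# Random matrix A with i.i.d. subgaussian (here standard Gaussian) entries.
A = rng.standard_normal((n, N))

# Normalize so that E ||M x||^2 = ||x||^2 for every fixed unit vector x.
M = (B @ A) / np.linalg.norm(B, "fro")

# Empirical RIP-style probe: ||M x||^2 should concentrate near 1 on s-sparse
# unit vectors (a Monte Carlo check, not a proof).
devs = []
for _ in range(2000):
    x = np.zeros(N)
    support = rng.choice(N, size=s, replace=False)
    x[support] = rng.standard_normal(s)
    x /= np.linalg.norm(x)
    devs.append(abs(np.linalg.norm(M @ x) ** 2 - 1.0))
print(f"max deviation over sampled s-sparse vectors: {max(devs):.3f}")

# ell-way entrywise products of columns (order disregarded), as in the
# word-embedding application: one column per multiset of ell column indices.
ell, N0 = 2, 40          # restrict to the first N0 columns to keep this small
pairs = itertools.combinations_with_replacement(range(N0), ell)
A_ell = np.column_stack([np.prod(A[:, list(c)], axis=1) for c in pairs])
print(f"A^(ell) shape: {A_ell.shape}")   # (n, C(N0 + ell - 1, ell)) = (128, 820)
```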
A Relaxation Argument for Optimization in Neural Networks and Non-Convex Compressed Sensing
It has been observed in practical applications and in theoretical analysis
that over-parametrization helps to find good minima in neural network training.
Similarly, in this article we study widening and deepening neural networks by a
relaxation argument so that the enlarged networks are rich enough to run
copies of parts of the original network in parallel, without necessarily
achieving zero training error as in over-parametrized scenarios. The partial
copies can be combined in combinatorially many ways, with the number of
combinations growing with the layer width. Therefore, the enlarged networks
can potentially achieve the best training error of correspondingly many random
initializations, but it is not immediately clear if
this can be realized via gradient descent or similar training methods.
The same construction can be applied to other optimization problems by
introducing a similar layered structure. We apply this idea to non-convex
compressed sensing, where we show that in some scenarios we can realize the
increased chance of obtaining a global optimum by solving a single convex
optimization problem of enlarged dimension.
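A minimal numpy sketch of the parallel-copies idea, under simplifying assumptions: a one-hidden-layer ReLU network, with the symbols d, n, r and the block-stacking scheme chosen for illustration rather than taken from the paper's exact construction. It shows that a widened hidden layer of width r*n can stack r independent initializations, and that a suitable choice of output weights makes the widened network reproduce any single copy, so its best training error is at most the best over the r initializations.

```python
import numpy as np

rng = np.random.default_rng(1)

d, n, r = 8, 16, 4       # input dim, original hidden width, number of copies

def forward(W, v, X):
    """One-hidden-layer ReLU network: x -> v . relu(W x), applied row-wise."""
    return np.maximum(X @ W.T, 0.0) @ v

X = rng.standard_normal((32, d))     # a batch of inputs

# r independent random initializations of the original width-n network.
Ws = [rng.standard_normal((n, d)) for _ in range(r)]
vs = [rng.standard_normal(n) for _ in range(r)]

# Widened network of hidden width r*n: stack the r hidden layers.
W_wide = np.vstack(Ws)               # shape (r*n, d)

# Output weights that zero out all but the k-th block make the widened
# network compute exactly the k-th original network, so its best training
# error is at most the best achieved over the r initializations.
k = 2
v_wide = np.concatenate([vs[i] if i == k else np.zeros(n) for i in range(r)])
assert np.allclose(forward(W_wide, v_wide, X), forward(Ws[k], vs[k], X))

# Mixing blocks of output weights combines partial copies, which hints at the
# combinatorial freedom the relaxation argument exploits.
```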