
    Restricted Isometry Property under High Correlations

    Matrices satisfying the Restricted Isometry Property (RIP) play an important role in the areas of compressed sensing and statistical learning. RIP matrices with optimal parameters are mainly obtained via probabilistic arguments, as explicit constructions seem hard. It is therefore interesting to ask whether a fixed matrix can be incorporated into a construction of restricted isometries. In this paper, we construct a new broad ensemble of random matrices with dependent entries that satisfy the restricted isometry property. Our construction starts with a fixed (deterministic) matrix X satisfying a simple stable rank condition, and we show that the matrix XR, where R is a random matrix drawn from various popular probabilistic models (including subgaussian, sparse, low-randomness, and matrices satisfying the convex concentration property), satisfies the RIP with high probability. These theorems have various applications in signal recovery, random matrix theory, dimensionality reduction, etc. Additionally, motivated by an application to understanding the effectiveness of word vector embeddings popular in natural language processing and machine learning, we investigate the RIP of the matrix XR^(l), where R^(l) is formed by taking all possible (disregarding order) l-way entrywise products of the columns of a random matrix R.
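
    As an informal illustration of the construction described above (not code from the paper), the sketch below draws a Gaussian R, forms the normalized product XR for an arbitrary fixed X, and Monte-Carlo estimates how far it is from an isometry on random sparse vectors. The sizes, the Frobenius-norm normalization, and the random-support sampling are assumptions made for this toy example; sampling supports at random only lower-bounds the true restricted isometry constant, which is a worst case over all supports.

    # Toy sketch: empirically probing near-isometry of A = X R / ||X||_F on sparse vectors.
    import numpy as np

    rng = np.random.default_rng(0)

    m, n, N, s = 60, 120, 400, 5                      # toy sizes and sparsity level (assumptions)
    X = rng.standard_normal((m, n))                   # stand-in for the fixed matrix X
    stable_rank = np.linalg.norm(X, "fro")**2 / np.linalg.norm(X, 2)**2
    R = rng.standard_normal((n, N))                   # subgaussian (here Gaussian) random matrix R
    A = (X @ R) / np.linalg.norm(X, "fro")            # scaling so that E||A x||^2 = ||x||^2 for Gaussian R

    def sparse_vector(N, s, rng):
        """Random unit-norm s-sparse test vector."""
        x = np.zeros(N)
        idx = rng.choice(N, size=s, replace=False)
        x[idx] = rng.standard_normal(s)
        return x / np.linalg.norm(x)

    # Monte Carlo estimate of the deviation |  ||A x||^2 - 1  | over random s-sparse unit vectors.
    ratios = [np.linalg.norm(A @ sparse_vector(N, s, rng))**2 for _ in range(2000)]
    delta_hat = max(abs(t - 1.0) for t in ratios)
    print(f"stable rank of X: {stable_rank:.1f}")
    print(f"empirical RIP deviation on random {s}-sparse vectors: {delta_hat:.3f}")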

    A Relaxation Argument for Optimization in Neural Networks and Non-Convex Compressed Sensing

    It has been observed in practical applications and in theoretical analysis that over-parametrization helps to find good minima in neural network training. Similarly, in this article we study widening and deepening neural networks by a relaxation argument, so that the enlarged networks are rich enough to run r copies of parts of the original network in parallel, without necessarily achieving zero training error as in over-parametrized scenarios. The partial copies can be combined in r^θ possible ways for layer width θ. Therefore, the enlarged networks can potentially achieve the best training error of r^θ random initializations, but it is not immediately clear if this can be realized via gradient descent or similar training methods. The same construction can be applied to other optimization problems by introducing a similar layered structure. We apply this idea to non-convex compressed sensing, where we show that in some scenarios we can realize the r^θ times increased chance of obtaining a global optimum by solving a convex optimization problem of dimension rθ.
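
    The counting behind the claim can be made concrete with a toy experiment (my own simplification, not the paper's construction): with r independent initializations of each of θ layers, mixing one copy per layer yields r^θ candidate networks, and an enlarged network able to run the copies in parallel can in principle match the best of their training losses. The layer sizes, data, and ReLU architecture below are arbitrary assumptions for illustration.

    # Toy sketch: best training loss over all r^theta "mix one copy per layer" combinations.
    import numpy as np
    from itertools import product

    rng = np.random.default_rng(1)

    theta, r, d, n_samples = 3, 3, 6, 32                  # toy sizes (assumptions)
    X = rng.standard_normal((d, n_samples))               # toy inputs
    y = rng.standard_normal(n_samples)                    # toy regression targets

    # r independent random initializations for each of the theta hidden layers, plus one readout.
    layers = [[rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(r)]
              for _ in range(theta)]
    w_out = rng.standard_normal(d) / np.sqrt(d)

    def loss(choice):
        """Training loss of the network assembled from copy choice[l] of layer l."""
        h = X
        for l, c in enumerate(choice):
            h = np.maximum(layers[l][c] @ h, 0.0)         # ReLU layer
        return np.mean((w_out @ h - y) ** 2)

    # Enumerate all r^theta combinations and take the best, which is the loss an
    # enlarged network running the copies in parallel could potentially reach.
    losses = {choice: loss(choice) for choice in product(range(r), repeat=theta)}
    best = min(losses, key=losses.get)
    print(f"{len(losses)} = r^theta combinations; best choice {best} with loss {losses[best]:.4f}")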