    Combinatorial Methods in Coding Theory

    This thesis is devoted to a range of questions in applied mathematics and signal processing motivated by applications in error correction, compressed sensing, and writing on non-volatile memories. The underlying thread of our results is the use of diverse combinatorial methods originating in coding theory and computer science. The thesis addresses three groups of problems. The first of them is aimed at the construction and analysis of codes for error correction. Here we examine properties of codes that are constructed using random and structured graphs and hypergraphs, with the main purpose of devising new decoding algorithms as well as estimating the distribution of Hamming weights in the resulting codes. Some of the results obtained give the best known estimates of the number of correctable errors for codes whose decoding relies on local operations on the graph. In the second part we address the question of constructing sampling operators for the compressed sensing problem. This topic has been the subject of a large body of work in the literature. We propose general constructions of sampling matrices based on ideas from coding theory that act as near-isometric maps on almost all sparse signals. These matrices can be used for dimensionality reduction and compressed sensing. In the third part we study the problem of reliable storage of information in non-volatile memories such as flash drives. This problem gives rise to a writing scheme that relies on relative magnitudes of neighboring cells, known as rank modulation. We establish the exact asymptotic behavior of the size of codes for rank modulation and suggest a number of new general constructions of such codes based on properties of finite fields as well as combinatorial considerations.

    Global Geometric Conditions on Sensing Matrices for the Success of L1 Minimization Algorithm

    Compressed Sensing concerns a new class of linear data acquisition protocols that are more efficient than sampling according to the classical Shannon sampling theorem when targeting signals with sparse structure. In this thesis, we study the stability of a Statistical Restricted Isometry Property and show how this property can be further relaxed while maintaining its sufficiency for the Basis Pursuit algorithm to recover sparse signals. We then look at the dictionary extension of Compressed Sensing, where signals are sparse under a redundant dictionary and reconstruction is achieved by the $\ell_1$ synthesis method. By establishing a necessary and sufficient condition for the stability of $\ell_1$ synthesis, we are able to predict this algorithm's performance under different dictionaries. Last, we construct a class of deterministic sensing matrices for the Dirac-Fourier joint dictionary.

    Algorithmic advances in learning from large dimensional matrices and scientific data

    University of Minnesota Ph.D. dissertation. May 2018. Major: Computer Science. Advisor: Yousef Saad. 1 computer file (PDF); xi, 196 pages. This thesis is devoted to answering a range of questions in machine learning and data analysis related to large dimensional matrices and scientific data. Two key research objectives connect the different parts of the thesis: (a) development of fast, efficient, and scalable algorithms for machine learning which handle large matrices and high dimensional data; and (b) design of learning algorithms for scientific data applications. The work combines ideas from multiple, often non-traditional, fields leading to new algorithms, new theory, and new insights in different applications. The first of the three parts of this thesis explores numerical linear algebra tools to develop efficient algorithms for machine learning with reduced computation cost and improved scalability. Here, we first develop inexpensive algorithms combining various ideas from linear algebra and approximation theory for matrix spectrum related problems such as numerical rank estimation, matrix function trace estimation including log-determinants, Schatten norms, and other spectral sums. We also propose a new method which simultaneously estimates the dimension of the dominant subspace of covariance matrices and obtains an approximation to the subspace. Next, we consider matrix approximation problems such as low rank approximation, column subset selection, and graph sparsification. We present a new approach based on multilevel coarsening to compute these approximations for large sparse matrices and graphs. Lastly, on the linear algebra front, we devise a novel algorithm based on rank shrinkage for the dictionary learning problem, learning a small set of dictionary columns which best represent the given data.
The second part of this thesis focuses on exploring novel non-traditional applications of information theory and codes, particularly in solving problems related to machine learning and high dimensional data analysis. Here, we first propose new matrix sketching methods using codes for obtaining low rank approximations of matrices and solving least squares regression problems. Next, we demonstrate that codewords from certain coding schemes perform exceptionally well for the group testing problem. Lastly, we present a novel machine learning application for coding theory, that of solving large scale multilabel classification problems. We propose a new algorithm for multilabel classification which is based on group testing and codes. The algorithm has a simple inexpensive prediction method, and the error correction capabilities of codes are exploited for the first time to correct prediction errors. The third part of the thesis focuses on devising robust and stable learning algorithms, which yield results that are interpretable from the viewpoint of specific scientific applications. We present Union of Intersections (UoI), a flexible, modular, and scalable framework for statistical-machine learning problems. We then adapt this framework to develop new algorithms for matrix decomposition problems such as nonnegative matrix factorization (NMF) and CUR decomposition. We apply these new methods to data from neuroscience applications in order to obtain insights into the functionality of the brain. Finally, we consider the application of materials informatics, learning from materials data. Here, we deploy regression techniques on materials data to predict physical properties of materials.