519 research outputs found

    Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

    Get PDF
    Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data

    Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions

    Get PDF
    Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed---either explicitly or implicitly---to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis

    Mini-batch stochastic approaches for accelerated multiplicative updates in nonnegative matrix factorisation with beta-divergence

    Get PDF
    International audienceNonnegative matrix factorisation (NMF) with β-divergence is a popular method to decompose real world data. In this paper we propose mini-batch stochastic algorithms to perform NMF efficiently on large data matrices. Besides the stochastic aspect, the mini-batch approach allows exploiting intensive computing devices such as general purpose graphical processing units to decrease the processing time and in some cases outperform coordinate descent approach

    HPC algorithms for nonnegative decompositions

    Full text link
    Muchos problemas procedentes de aplicaciones del mundo real pueden ser modelados como problemas matemáticos con magnitudes no negativas, y por tanto, las soluciones de estos problemas matemáticos solo tienen sentido si son no negativas. Estas magnitudes no negativas pueden ser, por ejemplo, las frecuencias en una señal sonora, las intensidades de los pixeles de una imagen, etc. Algunos de estos problemas pueden ser modelados utilizando un sistema de ecuaciones lineales sobredeterminado. Cuando la solución de dicho problema debe ser restringida a valores no negativos, aparece un problema llamado problema de mínimos cuadrados no negativos (NNLS por sus siglas en inglés). La solución de dicho problema tiene múltiples aplicaciones en ciencia e ingeniería. Otra descomposición no negativa importante es la Factorización de Matrices No negativas (NMF por sus siglas en inglés). La NMF es una herramienta muy popular utilizada en varios campos, como por ejemplo: clasificación de documentos, aprendizaje automático, análisis de imagen o separación de señales sonoras. Esta factorización intenta aproximar una matriz no negativa con el producto de dos matrices no negativas de menor tamaño, creando habitualmente representaciones por partes de los datos originales. Los algoritmos diseñados para calcular la solución de estos dos problemas no negativos tienen un elevado coste computacional, y debido a ese elevado coste, estas descomposiciones pueden beneficiarse mucho del uso de técnicas de Computación de Altas Prestaciones (HPC por sus siglas en inglés). Estos sistemas computacionales de altas prestaciones incluyen desde los modernos computadores multinucleo a lo último en aceleradores de calculo (Unidades de Procesamiento Gráfico (GPU), Intel Many Integrated Core (MIC), etc.). Para obtener el máximo rendimiento de estos sistemas, los desarrolladores deben utilizar tecnologías software tales como la programación paralela, la vectoración o el uso de librerías de computación altas prestaciones. A pesar de que existen diversos algoritmos para calcular la NMF y resolver el problema NNLS, no todos ellos disponen de una implementación paralela y eficiente. Además, es muy interesante reunir diversos algoritmos con propiedades diferentes en una sola librería computacional. Esta tesis presenta una librería computacional de altas prestaciones que contiene implementaciones paralelas y eficientes de los mejores algoritmos existentes actualmente para calcular la NMF. Además la tesis también incluye una comparación experimental entre las diferentes implementaciones presentadas. Esta librería centrada en el cálculo de la NMF soporta múltiples arquitecturas tales como CPUs multinucleo, GPUs e Intel MIC. El objetivo de esta librería es ofrecer un abanico de algoritmos eficientes para ayudar a científicos, ingenieros o cualquier tipo de profesionales que necesitan hacer uso de la NMF. Otro problema abordado en esta tesis es la actualización de las factorizaciones no negativas. El problema de la actualización se ha estudiado tanto para la solución del problema NNLS como para el calculo de la NMF. Existen problemas no negativos cuya solución es próxima a otros problemas que ya han sido resueltos, el problema de la actualización consiste en aprovechar la solución de un problema A que ya ha sido resuelto, para obtener la solución de un problema B cercano al problema A. Utilizando esta aproximación, el problema B puede ser resuelto más rápido que si se tuviera que resolver sin aprovechar la solución conocida del problema A. En esta tesis se presenta una metodología algorítmica para resolver ambos problemas de actualización: la actualización de la solución del problema NNLS y la actualización de la NMF. Además se presentan evaluaciones empíricas de las soluciones presentadas para ambos problemas. Los resultados de estas evaluaciones muestran que los algoritmos propuestos son más rápidos que resoMolts problemes procedents de aplicacions del mon real poden ser modelats com problemes matemàtics en magnituts no negatives, i per tant, les solucions de estos problemes matemàtics només tenen sentit si son no negatives. Estes magnituts no negatives poden ser, per eixemple, la concentració dels elements en un compost químic, les freqüències en una senyal sonora, les intensitats dels pixels de una image, etc. Alguns d'estos problemes poden ser modelats utilisant un sistema d'equacions llineals sobredeterminat. Quant la solució de este problema deu ser restringida a valors no negatius, apareix un problema nomenat problema de mínims quadrats no negatius (NNLS per les seues sigles en anglés). La solució de este problema te múltiples aplicacions en ciències i ingenieria. Un atra descomposició no negativa important es la Factorisació de Matrius No negatives(NMF per les seues sigles en anglés). La NMF es una ferramenta molt popular utilisada en diversos camps, com per eixemple: classificacio de documents, aprenentage automàtic, anàlisis de image o separació de senyals sonores. Esta factorisació intenta aproximar una matriu no negativa en el producte de dos matrius no negatives de menor tamany, creant habitualment representacions a parts de les dades originals. Els algoritmes dissenyats per a calcular la solució de estos dos problemes no negatius tenen un elevat cost computacional, i degut a este elevat cost, estes descomposicions poden beneficiar-se molt del us de tècniques de Computació de Altes Prestacions (HPC per les seues sigles en anglés). Estos sistemes de computació de altes prestacions inclouen des dels moderns computadors multinucli a lo últim en acceleradors de càlcul (Unitats de Processament Gràfic (GPU), Intel Many Core (MIC), etc.). Per a obtindre el màxim rendiment de estos sistemes, els desenrolladors deuen utilisar tecnologies software tals com la programació paralela, la vectorisació o el us de llibreries de computació de altes prestacions. A pesar de que existixen diversos algoritmes per a calcular la NMF i resoldre el problema NNLS, no tots ells disponen de una implementació paralela i eficient. Ademés, es molt interessant reunir diversos algoritmes en propietats diferents en una sola llibreria computacional. Esta tesis presenta una llibreria computacional de altes prestacions que conté implementacions paraleles i eficients dels millors algoritmes existents per a calcular la NMF. Ademés, la tesis també inclou una comparació experimental entre les diferents implementacions presentades. Esta llibreria centrada en el càlcul de la NMF soporta diverses arquitectures tals com CPUs multinucli, GPUs i Intel MIC. El objectiu de esta llibreria es oferir una varietat de algoritmes eficients per a ajudar a científics, ingeniers o qualsevol tipo de professionals que necessiten utilisar la NMF. Un atre problema abordat en esta tesis es la actualisació de les factorisacions no negatives. El problema de la actualisació se ha estudiat tant per a la solució del problema NNLS com per a el càlcul de la NMF. Existixen problemes no negatius la solució dels quals es pròxima a atres problemes no negatius que ya han sigut resolts, el problema de la actualisació consistix en aprofitar la solució de un problema A que ya ha sigut resolt, per a obtindre la solució de un problema B pròxim al problema A. Utilisant esta aproximació, el problema B pot ser resolt molt mes ràpidament que si tinguera que ser resolt des de 0 sense aprofitar la solució coneguda del problema A. En esta tesis es presenta una metodologia algorítmica per a resoldre els dos problemes de actualisació: la actualisació de la solució del problema NNLS i la actualisació de la NMF. Ademés es presenten evaluacions empíriques de les solucions presentades per als dos problemes. Els resultats de estes evaluacions mostren que els algoritmes proposts son més ràpits que resoldre el problema des de 0 en tots elsMany real world-problems can be modelled as mathematical problems with nonnegative magnitudes, and, therefore, the solutions of these problems are meaningful only if their values are nonnegative. Examples of these nonnegative magnitudes are the concentration of components in a chemical compound, frequencies in an audio signal, pixel intensities on an image, etc. Some of these problems can be modelled to an overdetermined system of linear equations. When the solution of this system of equations should be constrained to nonnegative values, a new problem arises. This problem is called the Nonnegative Least Squares (NNLS) problem, and its solution has multiple applications in science and engineering, especially for solving optimization problems with nonnegative restrictions. Another important nonnegativity constrained decomposition is the Nonnegative Matrix Factorization (NMF). The NMF is a very popular tool in many fields such as document clustering, data mining, machine learning, image analysis, chemical analysis, and audio source separation. This factorization tries to approximate a nonnegative data matrix with the product of two smaller nonnegative matrices, usually creating parts based representations of the original data. The algorithms that are designed to compute the solution of these two nonnegative problems have a high computational cost. Due to this high cost, these decompositions can benefit from the extra performance obtained using High Performance Computing (HPC) techniques. Nowadays, there are very powerful computational systems that offer high performance and can be used to solve extremely complex problems in science and engineering. From modern multicore CPUs to the newest computational accelerators (Graphics Processing Units(GPU), Intel Many Integrated Core(MIC), etc.), the performance of these systems keeps increasing continuously. To make the most of the hardware capabilities of these HPC systems, developers should use software technologies such as parallel programming, vectorization, or high performance computing libraries. While there are several algorithms for computing the NMF and for solving the NNLS problem, not all of them have an efficient parallel implementation available. Furthermore, it is very interesting to group several algorithms with different properties into a single computational library. This thesis presents a high-performance computational library with efficient parallel implementations of the best algorithms to compute the NMF in the current state of the art. In addition, an experimental comparison between the different implementations is presented. This library is focused on the computation of the NMF supporting multiple architectures like multicore CPUs, GPUs and Intel MIC. The goal of the library is to offer a full suit of algorithms to help researchers, engineers or professionals that need to use the NMF. Another problem that is dealt with in this thesis is the updating of nonnegative decompositions. The updating problem has been studied for both the solution of the NNLS problem and the NMF. Sometimes there are nonnegative problems that are close to other nonnegative problems that have already been solved. The updating problem tries to take advantage of the solution of a problem A, that has already been solved in order to obtain a solution of a new problem B, which is closely related to problem A. With this approach, problem B can be solved faster than solving it from scratch and not taking advantage of the already known solution of problem A. In this thesis, an algorithmic scheme is proposed for both the updating of the solution of NNLS problems and the updating of the NMF. Empirical evaluations for both updating problems are also presented. The results show that the proposed algorithms are faster than solving the problems from scratch in all of the tested cases.San Juan Sebastián, P. (2018). HPC algorithms for nonnegative decompositions [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/11306

    Computational Methods for Matrix/Tensor Factorization and Deep Learning Image Denoising

    Get PDF
    Feature learning is a technique to automatically extract features from raw data. It is widely used in areas such as computer vision, image processing, data mining and natural language processing. In this thesis, we are interested in the computational aspects of feature learning. We focus on rank matrix and tensor factorization and deep neural network models for image denoising. With respect to matrix and tensor factorization, we first present a technique to speed up alternating least squares (ALS) and gradient descent (GD) − two commonly used strategies for tensor factorization. We introduce an efficient, scalable and distributed algorithm that addresses the data explosion problem. Instead of a computationally challenging sub-step of ALS and GD, we implement the algorithm on parallel machines by using only two sparse matrix-vector products. Not only is the algorithm scalable but it is also on average 4 to 10 times faster than competing algorithms on various data sets. Next, we discuss our results of non-negative matrix factorization for hyperspectral image data in the presence of noise. We introduce a spectral total variation regularization and derive four variants of the alternating direction method of multiplier algorithm. While all four methods belong to the same family of algorithms, some perform better than others. Thus, we compare the algorithms using stimulated Raman spectroscopic image will be demonstrated. For deep neural network models, we focus on its application to image denoising. We first demonstrate how an optimal procedure leveraging deep neural networks and convex optimization can combine a given set of denoisers to produce an overall better result. The proposed framework estimates the mean squared error (MSE) of individual denoised outputs using a deep neural network; optimally combines the denoised outputs via convex optimization; and recovers lost details of the combined images using another deep neural network. The framework consistently improves denoising performance for both deterministic denoisers and neural network denoisers. Next, we apply the deep neural network to solve the image reconstruction issues of the Quanta Image Sensor (QIS), which is a single-photon image sensor that oversamples the light field to generate binary measures

    Scalable Data Mining via Constrained Low Rank Approximation

    Get PDF
    Matrix and tensor approximation methods are recognised as foundational tools for modern data analytics. Their strength lies in their long history of rigorous and principled theoretical foundations, judicious formulations via various constraints, along with the availability of fast computer programs. Multiple Constrained Low Rank Approximation (CLRA) formulations exist for various commonly encountered tasks like clustering, dimensionality reduction, anomaly detection, amongst others. The primary challenge in modern data analytics is the sheer volume of data to be analysed, often requiring multiple machines to just hold the dataset in memory. This dissertation presents CLRA as a key enabler of scalable data mining in distributed-memory parallel machines. Nonnegative Matrix Factorisation (NMF) is the primary CLRA method studied in this dissertation. NMF imposes nonnegativity constraints on the factor matrices and is a well studied formulation known for its simplicity, interpretability, and clustering prowess. The major bottleneck in most NMF algorithms is a distributed matrix-multiplication kernel. We develop the Parallel Low rank Approximation with Nonnegativity Constraints (PLANC) software package, building on the earlier MPI-FAUN library, which includes an efficient matrix-multiplication kernel tailored to the CLRA case. It employs carefully designed parallel algorithms and data distributions to avoid unnecessary computation and communication. We extend PLANC to include several optimised Nonnegative Least-Squares (NLS) solvers and symmetric constraints, effectively employing the optimised matrix-multiplication kernel. We develop a parallel inexact Gauss-Newton algorithm for Symmetric Nonnegative Matrix Factorisation (SymNMF). In particular PLANC is able to efficiently utilise second-order information when imposing symmetry constraints without incurring the prohibitive memory and computational costs associated with these methods. We are able to observe 70% efficiency while scaling up these methods. We develop new parallel algorithms for fusing and analysing data with multiple modalities in the Joint Nonnegative Matrix Factorisation (JointNMF) context. JointNMF is capable of knowledge discovery when both feature-data and data-data information is present in a data source. We extend PLANC to handle this case of simultaneously approximating two different large input matrices and study the various trade-offs encountered in the bottleneck matrix-multiplication kernel. We show that these ideas translate naturally to the multilinear setting when data is presented in the form of a tensor. A bottleneck computation analogous to the matrix-multiply, the Matricised-Tensor Times Khatri-Rao Product (MTTKRP) kernel, is implemented. We conclude by describing some avenues for future research which extend the work and ideas in this dissertation. In particular, we consider the notion of structured sparsity, where the user has some control over the nonzero pattern, which appears in computations for various tasks like cross-validation, working with missing values, robust CLRA models, and in the semi-supervised setting.Ph.D

    Novel gradient-based methods for data distribution and privacy in data science

    Get PDF
    With an increase in the need of storing data at different locations, designing algorithms that can analyze distributed data is becoming more important. In this thesis, we present several gradient-based algorithms, which are customized for data distribution and privacy. First, we propose a provably convergent, second order incremental and inherently parallel algorithm. The proposed algorithm works with distributed data. By using a local quadratic approximation, we achieve to speed-up the convergence with the help of curvature information. We also illustrate that the parallel implementation of our algorithm performs better than a parallel stochastic gradient descent method to solve a large-scale data science problem. This first algorithm solves the problem of using data that resides at different locations. However, this setting is not necessarily enough for data privacy. To guarantee the privacy of the data, we propose differentially private optimization algorithms in the second part of the thesis. The first one among them employs a smoothing approach which is based on using the weighted averages of the history of gradients. This approach helps to decrease the variance of the noise. This reduction in the variance is important for iterative optimization algorithms, since increasing the amount of noise in the algorithm can harm the performance. We also present differentially private version of a recent multistage accelerated algorithm. These extensions use noise related parameter selection and the proposed stepsizes are proportional to the variance of the noisy gradient. The numerical experiments show that our algorithms show a better performance than some well-known differentially private algorithm
    corecore