Search CORE

13 research outputs found

HPC algorithms for nonnegative decompositions

Author: San Juan Sebastián Pablo
Publication venue: 'Universitat Politecnica de Valencia'
Publication date: 26/11/2018
Field of study

Muchos problemas procedentes de aplicaciones del mundo real pueden ser modelados como problemas matemáticos con magnitudes no negativas, y por tanto, las soluciones de estos problemas matemáticos solo tienen sentido si son no negativas. Estas magnitudes no negativas pueden ser, por ejemplo, las frecuencias en una señal sonora, las intensidades de los pixeles de una imagen, etc. Algunos de estos problemas pueden ser modelados utilizando un sistema de ecuaciones lineales sobredeterminado. Cuando la solución de dicho problema debe ser restringida a valores no negativos, aparece un problema llamado problema de mínimos cuadrados no negativos (NNLS por sus siglas en inglés). La solución de dicho problema tiene múltiples aplicaciones en ciencia e ingeniería. Otra descomposición no negativa importante es la Factorización de Matrices No negativas (NMF por sus siglas en inglés). La NMF es una herramienta muy popular utilizada en varios campos, como por ejemplo: clasificación de documentos, aprendizaje automático, análisis de imagen o separación de señales sonoras. Esta factorización intenta aproximar una matriz no negativa con el producto de dos matrices no negativas de menor tamaño, creando habitualmente representaciones por partes de los datos originales. Los algoritmos diseñados para calcular la solución de estos dos problemas no negativos tienen un elevado coste computacional, y debido a ese elevado coste, estas descomposiciones pueden beneficiarse mucho del uso de técnicas de Computación de Altas Prestaciones (HPC por sus siglas en inglés). Estos sistemas computacionales de altas prestaciones incluyen desde los modernos computadores multinucleo a lo último en aceleradores de calculo (Unidades de Procesamiento Gráfico (GPU), Intel Many Integrated Core (MIC), etc.). Para obtener el máximo rendimiento de estos sistemas, los desarrolladores deben utilizar tecnologías software tales como la programación paralela, la vectoración o el uso de librerías de computación altas prestaciones. A pesar de que existen diversos algoritmos para calcular la NMF y resolver el problema NNLS, no todos ellos disponen de una implementación paralela y eficiente. Además, es muy interesante reunir diversos algoritmos con propiedades diferentes en una sola librería computacional. Esta tesis presenta una librería computacional de altas prestaciones que contiene implementaciones paralelas y eficientes de los mejores algoritmos existentes actualmente para calcular la NMF. Además la tesis también incluye una comparación experimental entre las diferentes implementaciones presentadas. Esta librería centrada en el cálculo de la NMF soporta múltiples arquitecturas tales como CPUs multinucleo, GPUs e Intel MIC. El objetivo de esta librería es ofrecer un abanico de algoritmos eficientes para ayudar a científicos, ingenieros o cualquier tipo de profesionales que necesitan hacer uso de la NMF. Otro problema abordado en esta tesis es la actualización de las factorizaciones no negativas. El problema de la actualización se ha estudiado tanto para la solución del problema NNLS como para el calculo de la NMF. Existen problemas no negativos cuya solución es próxima a otros problemas que ya han sido resueltos, el problema de la actualización consiste en aprovechar la solución de un problema A que ya ha sido resuelto, para obtener la solución de un problema B cercano al problema A. Utilizando esta aproximación, el problema B puede ser resuelto más rápido que si se tuviera que resolver sin aprovechar la solución conocida del problema A. En esta tesis se presenta una metodología algorítmica para resolver ambos problemas de actualización: la actualización de la solución del problema NNLS y la actualización de la NMF. Además se presentan evaluaciones empíricas de las soluciones presentadas para ambos problemas. Los resultados de estas evaluaciones muestran que los algoritmos propuestos son más rápidos que resoMolts problemes procedents de aplicacions del mon real poden ser modelats com problemes matemàtics en magnituts no negatives, i per tant, les solucions de estos problemes matemàtics només tenen sentit si son no negatives. Estes magnituts no negatives poden ser, per eixemple, la concentració dels elements en un compost químic, les freqüències en una senyal sonora, les intensitats dels pixels de una image, etc. Alguns d'estos problemes poden ser modelats utilisant un sistema d'equacions llineals sobredeterminat. Quant la solució de este problema deu ser restringida a valors no negatius, apareix un problema nomenat problema de mínims quadrats no negatius (NNLS per les seues sigles en anglés). La solució de este problema te múltiples aplicacions en ciències i ingenieria. Un atra descomposició no negativa important es la Factorisació de Matrius No negatives(NMF per les seues sigles en anglés). La NMF es una ferramenta molt popular utilisada en diversos camps, com per eixemple: classificacio de documents, aprenentage automàtic, anàlisis de image o separació de senyals sonores. Esta factorisació intenta aproximar una matriu no negativa en el producte de dos matrius no negatives de menor tamany, creant habitualment representacions a parts de les dades originals. Els algoritmes dissenyats per a calcular la solució de estos dos problemes no negatius tenen un elevat cost computacional, i degut a este elevat cost, estes descomposicions poden beneficiar-se molt del us de tècniques de Computació de Altes Prestacions (HPC per les seues sigles en anglés). Estos sistemes de computació de altes prestacions inclouen des dels moderns computadors multinucli a lo últim en acceleradors de càlcul (Unitats de Processament Gràfic (GPU), Intel Many Core (MIC), etc.). Per a obtindre el màxim rendiment de estos sistemes, els desenrolladors deuen utilisar tecnologies software tals com la programació paralela, la vectorisació o el us de llibreries de computació de altes prestacions. A pesar de que existixen diversos algoritmes per a calcular la NMF i resoldre el problema NNLS, no tots ells disponen de una implementació paralela i eficient. Ademés, es molt interessant reunir diversos algoritmes en propietats diferents en una sola llibreria computacional. Esta tesis presenta una llibreria computacional de altes prestacions que conté implementacions paraleles i eficients dels millors algoritmes existents per a calcular la NMF. Ademés, la tesis també inclou una comparació experimental entre les diferents implementacions presentades. Esta llibreria centrada en el càlcul de la NMF soporta diverses arquitectures tals com CPUs multinucli, GPUs i Intel MIC. El objectiu de esta llibreria es oferir una varietat de algoritmes eficients per a ajudar a científics, ingeniers o qualsevol tipo de professionals que necessiten utilisar la NMF. Un atre problema abordat en esta tesis es la actualisació de les factorisacions no negatives. El problema de la actualisació se ha estudiat tant per a la solució del problema NNLS com per a el càlcul de la NMF. Existixen problemes no negatius la solució dels quals es pròxima a atres problemes no negatius que ya han sigut resolts, el problema de la actualisació consistix en aprofitar la solució de un problema A que ya ha sigut resolt, per a obtindre la solució de un problema B pròxim al problema A. Utilisant esta aproximació, el problema B pot ser resolt molt mes ràpidament que si tinguera que ser resolt des de 0 sense aprofitar la solució coneguda del problema A. En esta tesis es presenta una metodologia algorítmica per a resoldre els dos problemes de actualisació: la actualisació de la solució del problema NNLS i la actualisació de la NMF. Ademés es presenten evaluacions empíriques de les solucions presentades per als dos problemes. Els resultats de estes evaluacions mostren que els algoritmes proposts son més ràpits que resoldre el problema des de 0 en tots elsMany real world-problems can be modelled as mathematical problems with nonnegative magnitudes, and, therefore, the solutions of these problems are meaningful only if their values are nonnegative. Examples of these nonnegative magnitudes are the concentration of components in a chemical compound, frequencies in an audio signal, pixel intensities on an image, etc. Some of these problems can be modelled to an overdetermined system of linear equations. When the solution of this system of equations should be constrained to nonnegative values, a new problem arises. This problem is called the Nonnegative Least Squares (NNLS) problem, and its solution has multiple applications in science and engineering, especially for solving optimization problems with nonnegative restrictions. Another important nonnegativity constrained decomposition is the Nonnegative Matrix Factorization (NMF). The NMF is a very popular tool in many fields such as document clustering, data mining, machine learning, image analysis, chemical analysis, and audio source separation. This factorization tries to approximate a nonnegative data matrix with the product of two smaller nonnegative matrices, usually creating parts based representations of the original data. The algorithms that are designed to compute the solution of these two nonnegative problems have a high computational cost. Due to this high cost, these decompositions can benefit from the extra performance obtained using High Performance Computing (HPC) techniques. Nowadays, there are very powerful computational systems that offer high performance and can be used to solve extremely complex problems in science and engineering. From modern multicore CPUs to the newest computational accelerators (Graphics Processing Units(GPU), Intel Many Integrated Core(MIC), etc.), the performance of these systems keeps increasing continuously. To make the most of the hardware capabilities of these HPC systems, developers should use software technologies such as parallel programming, vectorization, or high performance computing libraries. While there are several algorithms for computing the NMF and for solving the NNLS problem, not all of them have an efficient parallel implementation available. Furthermore, it is very interesting to group several algorithms with different properties into a single computational library. This thesis presents a high-performance computational library with efficient parallel implementations of the best algorithms to compute the NMF in the current state of the art. In addition, an experimental comparison between the different implementations is presented. This library is focused on the computation of the NMF supporting multiple architectures like multicore CPUs, GPUs and Intel MIC. The goal of the library is to offer a full suit of algorithms to help researchers, engineers or professionals that need to use the NMF. Another problem that is dealt with in this thesis is the updating of nonnegative decompositions. The updating problem has been studied for both the solution of the NNLS problem and the NMF. Sometimes there are nonnegative problems that are close to other nonnegative problems that have already been solved. The updating problem tries to take advantage of the solution of a problem A, that has already been solved in order to obtain a solution of a new problem B, which is closely related to problem A. With this approach, problem B can be solved faster than solving it from scratch and not taking advantage of the already known solution of problem A. In this thesis, an algorithmic scheme is proposed for both the updating of the solution of NNLS problems and the updating of the NMF. Empirical evaluations for both updating problems are also presented. The results show that the proposed algorithms are faster than solving the problems from scratch in all of the tested cases.San Juan Sebastián, P. (2018). HPC algorithms for nonnegative decompositions [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/11306

RiuNet

Recommended from our members

Building Rank-Revealing Factorizations with Randomization

Author: Heavner Nathan
Publication venue: University of Colorado Boulder
Publication date: 01/01/2019
Field of study

This thesis describes a set of randomized algorithms for computing rank revealing factorizations of matrices. These algorithms are designed specifically to minimize the amount of data movement required, which is essential to high practical performance on modern computing hardware. The work presented builds on existing randomized algorithms for computing low-rank approximations to matrices, but essentially ex- tends the range of applicability of these methods by allowing for the efficient decomposition of matrices of any numerical rank, including full rank matrices. In contrast, existing methods worked well only when the numerical rank was substantially smaller than the dimensions of the matrix.The thesis describes algorithms for computing two of the most popular rank-revealing matrix decom- positions: the column pivoted QR (CPQR) decomposition, and the so called UTV decomposition that factors a given matrix A as A = UTV∗, where U and V have orthonormal columns and T is triangular. For each algorithm, the thesis presents algorithms that are tailored for different computing environments, including multicore shared memory processors, GPUs, distributed memory machines, and matrices that are stored on hard drives (“out of core”).The first chapter of the thesis consists of an introduction that provides context, reviews previous work in the field, and summarizes the key contributions. Beside the introduction, the thesis contains six additional chapters:Chapter 2 introduces a fully blocked algorithm HQRRP for computing a QR factorization with col- umn pivoting. The key to the full blocking of the algorithm lies in using randomized projections to create a low dimensional sketch of the data, where multiple good pivot columns may be cheaply computed. Nu- merical experiments show that HQRRP is several times faster than the classical algorithm for computing a column pivoted QR on a multicore machine, and the acceleration factor increases with the number of cores.Chapter 3 introduces randUTV, a randomized algorithm for computing a rank-revealing factorizationof the form A = UTV∗, where U and V are orthogonal and T is upper triangular. RandUTV uses random- ized methods to efficiently build U and V as approximations of the column and row spaces of A. The result is an algorithm that reveals rank nearly as well as the SVD and costs at most as much as a column pivoted QR.Chapter 4 provides optimized implementations for shared and distributed memory architectures. For shared memory, we show that formulating randUTV as an algorithm-by-blocks increases its efficiency in parallel. The fifth chapter implements randUTV on the GPU and augments the algorithm with an over- sampling technique to further increase the low rank approximation properties of the resulting factorization. Chapter 6 implements both randUTV and HQRRP for use with matrices stored out of core. It is shown that reorganizing HQRRP as a left-looking algorithm to reduce the number of writes to the drive is in the tested cases necessary for the scalability of the algorithm when using spinning disk storage. Finally, chapter 7 discusses an alternative use for randUTV as a nuclear norm estimator and measures the acceleration gained from trimming down the algorithm when only singular value estimates are required

CU Scholar Institutional Repository

Control and Estimation Oriented Model Order Reduction for Linear and Nonlinear Systems

Author: Choroszucha Richard
Publication venue
Publication date: 01/01/2017
Field of study

Optimization based controls are advantageous in meeting stringent performance requirements and accommodating constraints. Although computers are becoming more powerful, solving optimization problems in real-time remains an obstacle because of associated computational complexity. Research efforts to address real-time optimization with limited computational power have intensified over the last decade, and one direction that has shown some success is model order reduction. This dissertation contains a collection of results relating to open- and closed-loop reduction techniques for large scale unconstrained linear descriptor systems, constrained linear systems, and nonlinear systems. For unconstrained linear descriptor systems, this dissertation develops novel gramian and Riccati solution approximation techniques. The gramian approximation is used for an open-loop reduction technique following that of balanced truncation proposed by (Moore, 1981) for ordinary linear systems and (Stykel, 2004) for linear descriptor systems. The Riccati solution is used to generalize the Linear Quadratic Gaussian balanced truncation (LQGBT) of (Verriest, 1981) and (Jonckheere and Silverman, 1983). These are applied to an electric machine model to reduce the number of states from

>

100000 to 8 while improving accuracy over the state-of-the-art modal truncation of (Zhou, 2015) for the purpose of condition monitoring. Furthermore, a link between unconstrained model predictive control (MPC) with a terminal penalty and LQG of a linear system is noted, suggesting an LQGBT reduced model as a natural model for reduced MPC design. The efficacy of such a reduced controller is demonstrated by the real-time control of a diesel airpath. Model reduction generally introduces modeling errors, and controlling a constrained plant subject to modeling errors falls squarely into robust control. A standard assumption of robust control is that inputs/states/outputs are constrained by convex sets, and these sets are ``tightened'' for robust constraint satisfaction. However, robust control is often overly conservative, and resulting control strategies cannot take advantage of the true admissible sets. A new reduction problem is proposed that considers the reduced order model accuracy and constraint conservativeness. A constant tube methodology for reduced order constrained MPC is presented, and the proposed reduced order model is found to decrease the constraint conservativeness of the reduced order MPC law compared to reduced order models obtained by gramian and LQG reductions. For nonlinear systems, a reformulation of the empirical gramians of (Lall et al., 1999) and (Hahn et al., 2003) into simpler, yet more general forms is provided. The modified definitions are used in the balanced truncation of a nonlinear diesel airpath model, and the reduced order model is used to design a reduced MPC law for tracking control. Further exploiting the link between the gramian and Riccati solution for linear systems, the new empirical gramian formulation is extended to obtain empirical Riccati covariance matrices used for closed-loop model order reduction of a nonlinear system. Balanced truncation using the empirical Riccati covariance matrices is demonstrated to result in a closer-to-optimal nonlinear compensator than the previous balanced truncation techniques discussed in the dissertation.PHDNaval Architecture & Marine EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/140839/1/riboch_1.pd

Deep Blue Documents at the University of Michigan

Implementation of QR Up- and Downdating on a Massively Parallel Computer

Author: Claus Bendtsen
Hans Bruun Nielsen
Hansen Kaj Madsen
Mustafa Pinar
Per Christian
Per Christian Hansen
Publication venue
Publication date
Field of study

We describe an implementation of QR up- and downdating on a massively parallel computer (the Connection Machine CM--200) and show that the algorithm maps well onto the computer. In particular, we show how the use of corrected semi-normal equations for downdating can be efficiently implemented. We also illustrate the use of our algorithms in a new LP algorithm. Key words. up- and downdating of QR factorization, corrected seminormal equations, CM--200. 1 Introduction In this paper we describe an efficient implementation of updating and downdating of a QR factorization on the Connection Machine CM--200, which is a massively parallel SIMD computer [11]. Many of our considerations are general for massively parallel computers. This project was sponsored by the Danish Center for Parallel Computer Research. M. Pinar was also sponsored by the Danish Natural Science Research Council, Grant No. 11-0505. y UNIfflC (Danish Computing Centre for Research and Education), Building 305, Technical..

CiteSeerX

Closed Likelihood Ratio Testing Procedures to Assess Similarity of Covariance Matrices

Author: Akaike H.
Anderson T. W.
Antonio Punzo
Bagnato L.
Bensmail H.
Biernacki C.
Boente G.
Bozdogan H.
Bozdogan H.
Bretz F.
Campbell N. A.
Cavanaugh J. E.
Celeux G.
Christensen R.
Emerson S.
Fisher R. A.
Flury B. N.
Flury B. N.
Flury B. N.
Flury B. N.
Francesca Greselin
Giancristofaro Arboretti R.
Greselin F.
Hallin M.
Hochberg Y.
Holm S.
Jolicoeur P.
Manly B. F. J.
Marcus R.
R Development Core Team
Rencher A. C.
Schmidt-Nielsen K.
Schwarz G.
Westfall P.
Westfall P. H.
Publication venue: 'Informa UK Limited'
Publication date
Field of study

Crossref

Control of noise-induced behavior in neural network

Author: Janson Natalia
Patidar Sandhya
Pototsky Andrey
Publication venue
Publication date: 01/07/2007
Field of study

Heriot Watt Pure

Control of noise-induced behavior in neural network

Author: Janson Natalia
Patidar Sandhya
Pototsky Andrey
Publication venue
Publication date: 01/07/2007
Field of study

Heriot Watt Pure

Implementation of QR up- and downdating on a massively parallel |computer

Author: Bendtsen Claus
Hansen Per Christian
Madsen Kaj
Nielsen Hans Bruun
Pinar M. C.
Publication venue
Publication date: 01/01/1995
Field of study

Online Research Database In Technology

Head-Driven Phrase Structure Grammar

Author
Publication venue: Language Science Press
Publication date: 27/01/2022
Field of study

Head-Driven Phrase Structure Grammar (HPSG) is a constraint-based or declarative approach to linguistic knowledge, which analyses all descriptive levels (phonology, morphology, syntax, semantics, pragmatics) with feature value pairs, structure sharing, and relational constraints. In syntax it assumes that expressions have a single relatively simple constituent structure. This volume provides a state-of-the-art introduction to the framework. Various chapters discuss basic assumptions and formal foundations, describe the evolution of the framework, and go into the details of the main syntactic phenomena. Further chapters are devoted to non-syntactic levels of description. The book also considers related fields and research areas (gesture, sign languages, computational linguistics) and includes chapters comparing HPSG with other frameworks (Lexical Functional Grammar, Categorial Grammar, Construction Grammar, Dependency Grammar, and Minimalism)

Directory of Open Access Books (DOAB)