4 research outputs found

    Accurate Energy and Performance Prediction for Frequency-Scaled GPU Kernels

    Get PDF
    Energy optimization is an increasingly important aspect of today’s high-performance computing applications. In particular, dynamic voltage and frequency scaling (DVFS) has become a widely adopted solution to balance performance and energy consumption, and hardware vendors provide management libraries that allow the programmer to change both memory and core frequencies manually to minimize energy consumption while maximizing performance. This article focuses on modeling the energy consumption and speedup of GPU applications while using different frequency configurations. The task is not straightforward, because of the large set of possible and uniformly distributed configurations and because of the multi-objective nature of the problem, which minimizes energy consumption and maximizes performance. This article proposes a machine learning-based method to predict the best core and memory frequency configurations on GPUs for an input OpenCL kernel. The method is based on two models for speedup and normalized energy predictions over the default frequency configuration. Those are later combined into a multi-objective approach that predicts a Pareto-set of frequency configurations. Results show that our approach is very accurate at predicting extema and the Pareto set, and finds frequency configurations that dominate the default configuration in either energy or performance.DFG, 360291326, CELERITY: Innovative Modellierung für Skalierbare Verteilte Laufzeitsystem

    Analysis of stochastic probing methods for estimating the trace of functions of sparse symmetric matrices

    Full text link
    We consider the problem of estimating the trace of a matrix function f(A)f(A). In certain situations, in particular if f(A)f(A) cannot be well approximated by a low-rank matrix, combining probing methods based on graph colorings with stochastic trace estimation techniques can yield accurate approximations at moderate cost. So far, such methods have not been thoroughly analyzed, though, but were rather used as efficient heuristics by practitioners. In this manuscript, we perform a detailed analysis of stochastic probing methods and, in particular, expose conditions under which the expected approximation error in the stochastic probing method scales more favorably with the dimension of the matrix than the error in non-stochastic probing. Extending results from [E. Aune, D. P. Simpson, J. Eidsvik, Parameter estimation in high dimensional Gaussian distributions, Stat. Comput., 24, pp. 247--263, 2014], we also characterize situations in which using just one stochastic vector is always -- not only in expectation -- better than the deterministic probing method. Several numerical experiments illustrate our theory and compare with existing methods

    FAST AND MEMORY EFFICIENT ALGORITHMS FOR STRUCTURED MATRIX SPECTRUM APPROXIMATION

    Get PDF
    Approximating the singular values or eigenvalues of a matrix, i.e. spectrum approximation, is a fundamental task in data science and machine learning applications. While approximation of the top singular values has received considerable attention in numerical linear algebra, provably efficient algorithms for other spectrum approximation tasks such as spectral-sum estimation and spectrum density estimation are starting to emerge only recently. Two crucial components that have enabled efficient algorithms for spectrum approximation are access to randomness and structure in the underlying matrix. In this thesis, we study how randomization and the underlying structure of the matrix can be exploited to design fast and memory efficient algorithms for spectral sum-estimation and spectrum density estimation. In particular, we look at two classes of structure: sparsity and graph structure. In the first part of this thesis, we show that sparsity can be exploited to give low-memory algorithms for spectral summarization tasks such as approximating some Schatten norms, the Estrada index and the logarithm of the determinant (log-det) of a sparse matrix. Surprisingly, we show that the space complexity of our algorithms are independent of the underlying dimension of the matrix. Complimenting our result for sparse matrices, we show that matrices that satisfy a certain smooth definition of sparsity, but potentially dense in the conventional sense, can be approximated in spectral-norm error by a truly sparse matrix. Our method is based on a simple sampling scheme that can be implemented in linear time. In the second part, we give the first truly sublinear time algorithm to approximate the spectral density of the (normalized) adjacency matrix of an undirected, unweighted graph in earth-mover distance. In addition to our sublinear time result, we give theoretical guarantees for a variant of the widely-used Kernel Polynomial Method and propose a new moment matching based method for spectrum density estimation of Hermitian matrices
    corecore