A Survey on Intelligent Iterative Methods for Solving Sparse Linear Algebraic Equations
Efficiently solving sparse linear algebraic equations is an important
research topic in numerical simulation. Commonly used approaches include direct
methods and iterative methods. Compared with direct methods, iterative
methods have lower computational complexity and memory consumption, and are
thus often used to solve large-scale sparse linear equations. However, there
are numerous iterative methods, parameters, and components that need to be
carefully chosen, and an inappropriate combination may lead to an
inefficient solution process in practice. With the development of deep
learning, intelligent iterative methods have become popular in recent years.
These methods can automatically select a sufficiently good combination and
optimize the parameters and components according to the properties of the
input matrix. This survey reviews these intelligent iterative methods. For
clarity, we divide the discussion into three aspects: a method aspect, a
component aspect, and a parameter aspect. Moreover, we summarize the existing
work and propose potential research directions that may deserve deeper
investigation.
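The sensitivity to parameter choice mentioned in this abstract can be illustrated with a toy experiment. The sketch below is not taken from the survey: it runs weighted (damped) Jacobi on a small 1D Poisson matrix and counts iterations for two damping factors. The helper names `poisson_1d` and `weighted_jacobi`, the dict-of-rows sparse format, and the tolerance are all assumptions made for illustration.

```python
def poisson_1d(n):
    """Rows of the n x n 1D Poisson matrix, each stored as a {column: value} dict."""
    rows = []
    for i in range(n):
        row = {i: 2.0}
        if i > 0:
            row[i - 1] = -1.0
        if i < n - 1:
            row[i + 1] = -1.0
        rows.append(row)
    return rows


def weighted_jacobi(rows, b, omega, tol=1e-8, max_iter=10_000):
    """Solve A x = b with damped Jacobi; return (x, iterations used)."""
    x = [0.0] * len(b)
    for k in range(max_iter):
        # Residual r = b - A x, computed row by row over the stored nonzeros.
        r = [b[i] - sum(v * x[j] for j, v in row.items())
             for i, row in enumerate(rows)]
        if max(abs(ri) for ri in r) < tol:
            return x, k
        # Jacobi update scaled by the damping factor omega.
        x = [xi + omega * ri / rows[i][i]
             for i, (xi, ri) in enumerate(zip(x, r))]
    return x, max_iter


A = poisson_1d(10)
b = [1.0] * 10
for omega in (1.0, 0.5):
    _, iterations = weighted_jacobi(A, b, omega)
    print(f"omega={omega}: {iterations} iterations")
```

On this matrix the smaller damping factor needs noticeably more iterations to reach the same tolerance, which is the kind of gap that automatic parameter selection aims to close.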
AutoAMG(θ): An Auto-tuned AMG Method Based on Deep Learning for Strong Threshold
Algebraic Multigrid (AMG) is one of the most used iterative algorithms for
solving large sparse linear equations. In AMG, the coarse grid is a key
component that affects the efficiency of the algorithm, the construction of
which relies on the strong threshold parameter θ. This parameter is
generally chosen empirically, with a default value in many current AMG solvers
of 0.25 for 2D problems and 0.5 for 3D problems. However, for many practical
problems, the quality of the coarse grid and the efficiency of the AMG
algorithm are sensitive to θ; the default value is rarely optimal, and
sometimes is far from it. Therefore, how to choose a better θ is an
important question. In this paper, we propose a deep learning based auto-tuning
method, AutoAMG(θ), for multiscale sparse linear equations, which are
widely used in practical problems. The method uses Graph Neural Networks (GNNs)
to extract matrix features, and a Multilayer Perceptron (MLP) to build the
mapping between matrix features and the optimal θ, which can adaptively
output θ values for different matrices. Numerical experiments show that
AutoAMG(θ) can achieve significant speedup compared to the default θ
value.
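To make the role of the strong threshold concrete, here is a minimal sketch of the classical strength-of-connection test as it is commonly stated for Ruge-Stüben style AMG: row i strongly depends on column j when -a_ij >= θ · max over k ≠ i of (-a_ik). This is not the paper's code and does not include the GNN or MLP; the function name `strong_connections` and the dict-of-rows matrix format are assumptions for illustration.

```python
def strong_connections(rows, theta):
    """For each row i, return the set of columns j != i that are strongly
    connected under the classical test -a_ij >= theta * max_{k != i}(-a_ik).

    rows[i] is a {column: value} dict holding the nonzeros of row i.
    """
    strong = []
    for i, row in enumerate(rows):
        # Negated off-diagonal entries; the classical test looks at these.
        off_diag = {j: -v for j, v in row.items() if j != i}
        largest = max(off_diag.values(), default=0.0)
        if largest <= 0.0:
            # No negative off-diagonal entries: nothing counts as strong.
            strong.append(set())
        else:
            strong.append({j for j, neg in off_diag.items()
                           if neg >= theta * largest})
    return strong


# One row of an anisotropic stencil: two large and two small couplings.
row0 = {0: 2.2, 1: -1.0, 2: -0.1, 3: -1.0, 4: -0.1}
print(sorted(strong_connections([row0], 0.25)[0]))  # [1, 3]
print(sorted(strong_connections([row0], 0.05)[0]))  # [1, 2, 3, 4]
```

Lowering the threshold admits the weak couplings into the strength graph, which changes the coarse grid that is built from it; this is why a fixed default can be far from optimal for a given matrix.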
Language and Compiler Support for Auto-Tuning Variable-Accuracy Algorithms
Approximating ideal program outputs is a common technique for solving computationally difficult problems, for adhering to processing or timing constraints, and for performance optimization in situations where perfect precision is not necessary. To this end, programmers often use approximation algorithms, iterative methods, data resampling, and other heuristics. However, programming such variable-accuracy algorithms presents difficult challenges since the optimal algorithms and parameters may change with different accuracy requirements and usage environments. This problem is further compounded when multiple variable-accuracy algorithms are nested together due to the complex way that accuracy requirements can propagate across algorithms and because of the size of the set of allowable compositions. As a result, programmers often deal with this issue in an ad-hoc manner that can sometimes violate sound programming practices such as maintaining library abstractions. In this paper, we propose language extensions that expose trade-offs between time and accuracy to the compiler. The compiler performs fully automatic compile-time and install-time autotuning and analyses in order to construct optimized algorithms to achieve any given target accuracy. We present novel compiler techniques and a structured genetic tuning algorithm to search the space of candidate algorithms and accuracies in the presence of recursion and sub-calls to other variable-accuracy code. These techniques benefit both the library writer, by providing an easy way to describe and search the parameter and algorithmic choice space, and the library user, by allowing high-level specification of accuracy requirements which are then met automatically without the need for the user to understand any algorithm-specific parameters. Additionally, we present a new suite of benchmarks, written in our language, to examine the efficacy of our techniques.
Our experimental results show that by relaxing accuracy requirements, we can easily obtain performance improvements ranging from 1.1× to orders of magnitude of speedup.
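The time-versus-accuracy trade-off described in this abstract can be sketched in a few lines. The example below is only an illustration and is not the paper's language, compiler, or genetic tuner: it uses a plain linear search to pick the smallest Newton iteration count that meets a target relative error on a set of training inputs. The names `newton_sqrt`, `worst_error`, and `tune` are hypothetical.

```python
def newton_sqrt(a, iterations):
    """Approximate sqrt(a) for a > 0 with a fixed number of Newton steps."""
    x = a if a > 1.0 else 1.0  # starting guess at or above the true root
    for _ in range(iterations):
        x = 0.5 * (x + a / x)
    return x


def worst_error(iterations, inputs):
    """Largest relative error over the training inputs."""
    return max(abs(newton_sqrt(a, iterations) - a ** 0.5) / a ** 0.5
               for a in inputs)


def tune(target_rel_error, inputs, max_iterations=50):
    """Return the cheapest iteration count that meets the accuracy target."""
    for n in range(max_iterations + 1):
        if worst_error(n, inputs) <= target_rel_error:
            return n
    raise ValueError("target accuracy not reachable within max_iterations")


training_inputs = [2.0, 10.0, 100.0, 1000.0]
for target in (1e-2, 1e-10):
    print(f"target {target}: {tune(target, training_inputs)} iterations")
```

A caller states only the accuracy it needs; the iteration count, an algorithm-specific parameter, is chosen for it, and a looser target yields a cheaper configuration.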
Statistical and Machine Learning Techniques Applied to Algorithm Selection for Solving Sparse Linear Systems
Many applications and problems in science and engineering require large-scale numerical simulations and computations. Choosing an appropriate method to solve these problems is a very common task, but it is not a trivial one, principally because the decision is often too hard for humans to make or requires a certain degree of expertise and knowledge in the particular discipline or in mathematics. Thus, a methodology that can facilitate or automate this process, and that helps in understanding the problem, would be of great interest and help. The proposal is to use various statistically based machine-learning and data-mining techniques to analyze and automate the process of choosing an appropriate numerical algorithm for solving a specific set of problems (sparse linear systems) based on their individual properties.
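Selecting an algorithm from the "individual properties" of a system presupposes that those properties are computed first. The sketch below shows what such a feature-extraction step might look like for a square sparse matrix. The specific features, the function name `matrix_features`, and the dict-of-rows format are assumptions for illustration; the work itself may use a different feature set, and the learning step is not shown.

```python
def matrix_features(rows):
    """Compute a few structural features of a square sparse matrix.

    rows[i] is a {column: value} dict holding the nonzeros of row i.
    """
    n = len(rows)
    nnz = sum(len(row) for row in rows)
    # Symmetric if every stored a_ij has a matching a_ji of equal value.
    symmetric = all(rows[j].get(i, 0.0) == v
                    for i, row in enumerate(rows)
                    for j, v in row.items())
    # Weak row diagonal dominance: |a_ii| >= sum of |a_ij| for j != i.
    diag_dominant = all(
        abs(row.get(i, 0.0)) >= sum(abs(v) for j, v in row.items() if j != i)
        for i, row in enumerate(rows)
    )
    return {
        "n": n,
        "nnz": nnz,
        "density": nnz / (n * n),
        "symmetric": symmetric,
        "diag_dominant": diag_dominant,
    }


tridiagonal = [{0: 4.0, 1: -1.0}, {0: -1.0, 1: 4.0, 2: -1.0}, {1: -1.0, 2: 4.0}]
print(matrix_features(tridiagonal))
```

A feature vector of this kind would then be the input to whichever classifier or statistical model maps matrices to solvers.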
Doctor of Philosophy
Emerging trends such as growing architectural diversity and increased emphasis on energy and power efficiency motivate the need for code that adapts to its execution context (input dataset and target architecture). Unfortunately, writing such code remains difficult, and is typically attempted only by a small group of motivated expert programmers who are highly knowledgeable about the relationship between software and its hardware mapping. In this dissertation, we introduce novel abstractions and techniques based on automatic performance tuning that enable both experts and nonexperts (application developers) to produce adaptive code. We present two new frameworks for adaptive programming: Nitro and Surge. Nitro enables expert programmers to specify code variants, or alternative implementations of the same computation, together with meta-information for selecting among them. It then utilizes supervised classification to select an optimal code variant at runtime based on characteristics of the execution context. Surge, on the other hand, provides a high-level nested data-parallel programming interface for application developers to specify computations. It then employs a two-level mechanism to automatically generate code variants and then tunes them using Nitro. The resulting code performs on par with or better than handcrafted reference implementations on both CPUs and GPUs. In addition to abstractions for expressing code variants, this dissertation also presents novel strategies for adaptively tuning them. First, we introduce a technique for dynamically selecting an optimal code variant at runtime based on characteristics of the input dataset. On five high-performance GPU applications, variants tuned using this strategy achieve over 93% of the performance of variants selected through exhaustive search.
Next, we present a novel approach based on multitask learning to develop a code variant selection model on a target architecture from training on different source architectures. We evaluate this approach on a set of six benchmark applications and a collection of six NVIDIA GPUs from three distinct architecture generations. Finally, we implement support for combined code variant and frequency selection based on multiple objectives, including power and energy efficiency. Using this strategy, we construct a GPU sorting implementation that provides improved energy and power efficiency with less than a proportional drop in sorting throughput.