16 research outputs found

    Planning as Optimization: Dynamically Discovering Optimal Configurations for Runtime Situations

    The large number of possible configurations of modern software-based systems, combined with the large number of possible environmental situations of such systems, prohibits enumerating all adaptation options at design time and necessitates planning at run time to dynamically identify an appropriate configuration for a situation. While numerous planning techniques exist, they typically assume a detailed state-based model of the system and that the situations that warrant adaptations are known. Both of these assumptions can be violated in complex, real-world systems. As a result, adaptation planning must rely on simple models that capture what can be changed in the system (input parameters) and what can be observed in the system and environment (output and context parameters). We therefore propose planning as optimization: the use of optimization strategies to discover optimal system configurations at runtime for each distinct situation, where the situations themselves are also identified dynamically at runtime. We apply our approach to CrowdNav, an open-source traffic routing system with the characteristics of a real-world system. We identify situations via clustering and conduct an empirical study that compares Bayesian optimization and two types of evolutionary optimization (NSGA-II and novelty search) in CrowdNav.
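    As an illustration of the idea, the following minimal Python sketch clusters observed context parameters into situations and then searches input-parameter configurations per situation. The system model, parameter ranges, and objective are hypothetical stand-ins and do not reproduce CrowdNav's interface, and plain random search stands in for the Bayesian and evolutionary optimizers compared in the study.

    # Minimal sketch of "planning as optimization": situations are discovered by
    # clustering observed context parameters, then a separate optimizer searches
    # input-parameter configurations per situation. The model, ranges, and
    # objective below are hypothetical stand-ins, not CrowdNav's API.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    # Observed context parameters (e.g. car count, traffic density) -- synthetic here.
    contexts = rng.normal(size=(200, 2))

    # Step 1: identify distinct runtime situations via clustering.
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(contexts)

    def measure_cost(config, situation):
        """Hypothetical stand-in for observing the running system's output
        parameters (e.g. average trip overhead) under a given configuration."""
        target = situation * 0.1  # pretend each situation has a different optimum
        return float(np.sum((config - target) ** 2) + rng.normal(scale=0.01))

    # Step 2: per situation, search the input-parameter space at runtime.
    # Random search keeps the sketch short; the paper compares Bayesian
    # optimization, NSGA-II, and novelty search instead.
    best = {}
    for situation in range(3):
        candidates = rng.uniform(0.0, 1.0, size=(50, 4))  # 4 tunable inputs
        costs = [measure_cost(c, situation) for c in candidates]
        best[situation] = candidates[int(np.argmin(costs))]

    # At runtime, a new context is mapped to its situation and the stored
    # configuration for that situation is applied.
    new_context = rng.normal(size=(1, 2))
    situation = int(kmeans.predict(new_context)[0])
    print("apply configuration:", best[situation])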

    Implementation of D-Spline-Based Incremental Performance Parameter Estimation Method with ppOpen-AT


    Evolutionary Auto-Tuning for Multicore Applications


    A Benchmark Set of Highly-efficient CUDA and OpenCL Kernels and its Dynamic Autotuning with Kernel Tuning Toolkit

    Autotuning of performance-relevant source-code parameters makes it possible to tune applications automatically, without hard-coding optimizations, and thus helps keep performance portable. In this paper, we introduce a benchmark set of ten autotunable kernels for important computational problems implemented in OpenCL or CUDA. Using our Kernel Tuning Toolkit, we show that with autotuning most of the kernels reach near-peak performance on various GPUs and outperform baseline implementations on CPUs and Xeon Phis. Our evaluation also demonstrates that autotuning is key to performance portability. In addition to offline tuning, we also introduce dynamic autotuning of code optimization parameters during application runtime. With dynamic tuning, the Kernel Tuning Toolkit enables applications to re-tune performance-critical kernels at runtime whenever needed, for example, when input data change. Although it is generally believed that autotuning spaces tend to be too large to be searched during application runtime, we show that this is not necessarily the case when tuning spaces are designed rationally. Many of our kernels reach near-peak performance with moderately sized tuning spaces that can be searched at runtime with acceptable overhead. Finally, we demonstrate how dynamic performance tuning can be integrated into a real-world application from the cryo-electron microscopy domain.
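    A hedged sketch of the dynamic-tuning idea follows: a small, rationally designed tuning space is searched exhaustively at runtime and re-searched when the input changes. The Python stand-in merely imitates timing a kernel; the Kernel Tuning Toolkit itself drives real CUDA/OpenCL kernels, and none of its API is reproduced here.

    # Illustrative sketch of dynamic autotuning: a deliberately small tuning
    # space is searched exhaustively at runtime, and the search is re-run
    # whenever the input characteristics change. Names and the timed "kernel"
    # are hypothetical.
    import itertools
    import time

    TUNING_SPACE = {            # rationally designed: only 3 * 3 = 9 points
        "block_size": [64, 128, 256],
        "unroll": [1, 2, 4],
    }

    def run_kernel(data_size, block_size, unroll):
        """Stand-in for launching and timing one kernel configuration."""
        start = time.perf_counter()
        # pretend work whose best configuration depends on the input size
        _ = sum(i % unroll for i in range(data_size // block_size))
        return time.perf_counter() - start

    def tune(data_size):
        configs = [dict(zip(TUNING_SPACE, vals))
                   for vals in itertools.product(*TUNING_SPACE.values())]
        return min(configs, key=lambda c: run_kernel(data_size, **c))

    best = tune(data_size=1 << 16)
    for step in range(100):
        data_size = 1 << 16 if step < 50 else 1 << 20
        if step == 50:          # input changed: re-tune at runtime
            best = tune(data_size)
        run_kernel(data_size, **best)

    Exhaustive search is only viable because the space is small; that is precisely the paper's point about designing tuning spaces rationally so runtime search overhead stays acceptable.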

    Compiler and Runtime Optimization Techniques for Implementing Scalable Parallel Applications

    A compiler can detect the data dependencies in an application and analyze specific sections of code for parallelization potential. However, these compiler techniques are usually applied at compile time, so they rely on static analysis, which is insufficient for achieving maximum parallelism and the desired application scalability. Such techniques should consider both the static information gathered at compile time and the dynamic analysis captured at runtime in order to generate a safe parallel application. On the other hand, runtime information is often speculative, and relying on it alone doesn't guarantee maximal parallel performance, so information collected at compile time can significantly improve the performance of runtime techniques. This research achieves that goal by introducing new techniques for both the compiler and the runtime system that enable them to cooperate and exploit both static and dynamic analysis to maximize application parallel performance. In the proposed framework, the compiler can employ dynamic runtime methods in its parallelization optimizations, and the runtime system can apply static information in its parallelization methods. The proposed techniques use high-level programming abstractions and machine learning to relieve the programmer of difficult and tedious decisions that can significantly affect program behavior and performance.
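    The cooperation between static and dynamic analysis can be illustrated with a small, hypothetical Python sketch: static analysis cannot prove a loop with indirect indexing parallel, so a lightweight runtime check of the actual indices decides between a parallel and a sequential execution path. All names below are illustrative and are not part of the proposed framework.

    # Sketch of combining static and dynamic analysis: the "compiler" marks a
    # loop as maybe-parallel (indirect indexing defeats static dependence
    # analysis), and a lightweight runtime check decides whether it is safe
    # to run the iterations in parallel.
    from concurrent.futures import ThreadPoolExecutor

    def runtime_independent(indices):
        """Dynamic check: iterations are independent if no index repeats."""
        return len(set(indices)) == len(indices)

    def body(out, idx, i):
        out[idx[i]] += i  # the statically unresolvable write out[idx[i]]

    def maybe_parallel_loop(out, idx):
        n = len(idx)
        if runtime_independent(idx):      # safe: execute in parallel
            with ThreadPoolExecutor() as pool:
                list(pool.map(lambda i: body(out, idx, i), range(n)))
        else:                             # fall back to the sequential loop
            for i in range(n):
                body(out, idx, i)

    out = [0] * 8
    maybe_parallel_loop(out, idx=[3, 1, 7, 0])   # disjoint -> parallel path
    maybe_parallel_loop(out, idx=[2, 2, 5])      # repeated index -> sequential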

    Optimization of Algorithms with the OPAL Framework

    The task of parameter tuning has been around for a long time, spans most domains, and there have been many attempts to address it. Research on this question often lacks generality and reusability: such projects typically target specific systems, some approaches do not concentrate on the fundamental questions of parameter tuning, and until now there has been no powerful tool able to overcome the difficulties in this domain. As a result, the number of projects continues to grow, while users are unable to apply previous achievements to their own problems. The present work approaches parameter tuning systematically, identifying the fundamental issues and the basic elements of a general system. This provides the basis for a general and flexible framework called OPAL, which stands for OPtimization of ALgorithms. The milestones in developing the framework, as well as the main achievements, are presented through three papers corresponding to chapters 4, 5, and 6 of this thesis. The first paper introduces the framework by describing its crucial basic elements through some very simple examples. To this end, the paper considers three questions in constructing an automated parameter tuning framework; by answering them, we propose OPAL, consisting of the indispensable components of such a framework. OPAL models the parameter tuning task as a blackbox optimization problem, solved by a mesh adaptive direct search algorithm, which reduces the effort required of users to launch a tuning session. The second paper shows one of the opportunities to extend the framework: to take advantage of situations where multiple processors are available, we study various ways of embedding parallelism and develop a feature called "interruption of unnecessary tasks" to improve the performance of the framework. The third paper is a full description of the framework and a release of its Python implementation. In addition to confirming the methodology and the main features presented in previous work, this release introduces integrability as a new feature, demonstrated through the cooperation of OPAL with a classification tool to solve a parameter optimization problem whose test problem set is too large and where a single assessment can take a day.
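    To make the blackbox formulation concrete, here is a minimal, hypothetical Python sketch in the spirit of OPAL: the tuner sees only parameter values going in and a performance measure coming out. A simple coordinate search with step refinement stands in for the mesh adaptive direct search algorithm OPAL actually uses, and the timed algorithm is invented for the example.

    # Minimal sketch of parameter tuning as blackbox optimization: the tuner
    # only observes (parameter value -> measured performance). The target
    # algorithm and the simple coordinate search are illustrative; OPAL itself
    # relies on mesh adaptive direct search (MADS).
    import time

    def measure(threshold):
        """Blackbox: run the algorithm with this parameter, return elapsed time."""
        start = time.perf_counter()
        # hypothetical algorithm whose cost depends on a chunking threshold
        data = list(range(5000))
        chunk = max(1, threshold)
        _ = [sum(data[i:i + chunk]) for i in range(0, len(data), chunk)]
        return time.perf_counter() - start

    def coordinate_search(x0, step, rounds=8):
        x, fx = x0, measure(x0)
        for _ in range(rounds):
            improved = False
            for cand in (x - step, x + step):
                if cand > 0 and (fc := measure(cand)) < fx:
                    x, fx, improved = cand, fc, True
            if not improved:
                step = max(1, step // 2)   # refine the mesh, as in direct search
        return x, fx

    best_threshold, best_time = coordinate_search(x0=16, step=8)
    print(f"best threshold {best_threshold}: {best_time:.6f}s")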