Scheduling and Tuning Kernels for High-performance on Heterogeneous Processor Systems

Author: Fang Ye
Publication venue: LSU Digital Commons
Publication date: 01/01/2016

Accelerated parallel computing with devices such as GPUs and Xeon Phis, used alongside CPUs, offers a promising way to extend the cutting edge of high-performance computer systems. A significant performance improvement can be achieved when suitable workloads are handled by the accelerator, while traditional CPUs handle the workloads that are not well suited to accelerators. A computer system that combines multiple types of processors is referred to as a heterogeneous system. This dissertation addresses tuning and scheduling issues in heterogeneous systems. The first section presents work on tuning scientific workloads on three different types of processors: a multi-core CPU, the Xeon Phi massively parallel processor, and an NVIDIA GPU; both common tuning methods and platform-specific tuning techniques are presented. An analysis then demonstrates the performance characteristics of the heterogeneous system on different input data. This section of the dissertation is part of the GeauxDock project, which prototyped a few state-of-the-art bioinformatics algorithms and delivered a fast molecular docking program. The second section studies the performance model of the GeauxDock computing kernel. Specifically, it extracts features from the input data set and the target systems, and then uses various regression models to calculate the predicted computation time. This helps explain why a certain processor is faster for certain sets of tasks, and it provides the essential information for scheduling on heterogeneous systems. In addition, this dissertation investigates a high-level task scheduling framework for heterogeneous processor systems in which the strengths and weaknesses of the different processors complement each other, so that higher performance can be achieved on heterogeneous computing systems.
A new scheduling algorithm with four innovations is presented: Ranked Opportunistic Balancing (ROB), Multi-subject Ranking (MR), Multi-subject Relative Ranking (MRR), and Automatic Small Tasks Rearranging (ASTR). The new algorithm consistently outperforms previously proposed algorithms with better scheduling results, lower computational complexity, and more consistent results over a range of performance prediction errors. Finally, this work extends the heterogeneous task scheduling algorithm to handle power capping. It demonstrates that a power-aware scheduler significantly improves power efficiency and reduces energy consumption. This suggests that, in addition to performance benefits, heterogeneous systems may have certain advantages in overall power efficiency.
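
To make the idea of scheduling from predicted run times concrete, here is a minimal sketch of a generic greedy earliest-finish-time heuristic. It is not the ROB, MR, MRR, or ASTR method named in the abstract, whose details are not given here. The function name `schedule_earliest_finish`, the processor labels, and the run times are all hypothetical and chosen only for illustration.

```python
def schedule_earliest_finish(predicted_times):
    """Assign each task to the processor on which it would finish earliest.

    predicted_times: list with one dict per task, mapping a processor name
    to that task's predicted run time (in seconds) on that processor.
    Returns (assignment, makespan), where assignment[i] is the processor
    chosen for task i. This is a generic greedy heuristic, not an optimal
    scheduler and not the algorithm from the dissertation.
    """
    processors = sorted({p for task in predicted_times for p in task})
    ready_at = {p: 0.0 for p in processors}  # time at which each processor is free
    assignment = [None] * len(predicted_times)

    # Place the longest tasks first, ranked by their best-case predicted time.
    order = sorted(
        range(len(predicted_times)),
        key=lambda i: min(predicted_times[i].values()),
        reverse=True,
    )
    for i in order:
        task = predicted_times[i]
        # Pick the processor that gives the earliest finish time for this task.
        best = min(task, key=lambda p: ready_at[p] + task[p])
        assignment[i] = best
        ready_at[best] += task[best]

    return assignment, max(ready_at.values())


# Hypothetical predicted run times for three tasks on two processor types.
tasks = [
    {"cpu": 4.0, "gpu": 1.0},
    {"cpu": 2.0, "gpu": 3.0},
    {"cpu": 1.0, "gpu": 5.0},
]
assignment, makespan = schedule_earliest_finish(tasks)
print(assignment, makespan)
```

A heuristic like this depends entirely on the quality of the predicted run times, which is why the abstract stresses consistent results over a range of performance prediction errors.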

Louisiana State University

[Activity of Institute for Computer Applications in Science and Engineering]

This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, fluid mechanics, and computer science.

NASA Technical Reports Server

PerfVis: Pervasive Visualization in Immersive Augmented Reality for Performance Awareness

Author: Bergel Alexandre
Hess Mario
Merino Leonel
Nierstrasz Oscar
Weiskopf Daniel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

Developers are usually unaware of the impact of code changes on the performance of software systems. Although developers can analyze the performance of a system by executing, for instance, a performance test to compare the performance of two consecutive versions of the system, changing from a programming task to a testing task would disrupt the development flow. In this paper, we propose the use of a city visualization that dynamically provides developers with a pervasive view of the continuous performance of a system. We use an immersive augmented reality device (Microsoft HoloLens) to display our visualization and extend the integrated development environment on a computer screen to use the physical space. We report on technical details of the design and implementation of our visualization tool, and discuss early feedback that we collected on its usability. Our investigation explores a new visual metaphor to support the exploration and analysis of possibly very large and multidimensional performance data. Our initial result indicates that the city metaphor can be adequate to analyze dynamic performance data on a large and non-trivial software system.
Comment: ICPE'19 vision, 4 pages, 2 figures, conference

arXiv.org e-Print Archive

Crossref

Bern Open Repository and Information System (BORIS)

MaterialVis: Material visualization tool using direct volume and surface rendering techniques

Author: Bulutay C.
Gudukbay U.
Heinig K. H.
Okuyan E.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

Visualization of materials is an indispensable part of their structural analysis. We developed a visualization tool for amorphous as well as crystalline structures, called MaterialVis. Unlike existing tools, MaterialVis represents material structures as a volume and a surface manifold, in addition to plain atomic coordinates. Both amorphous and crystalline structures exhibit topological features as well as various defects. MaterialVis provides a wide range of functionality to visualize such topological structures and crystal defects interactively. Direct volume rendering techniques are used to visualize the volumetric features of materials, such as crystal defects, which are responsible for the distinct fingerprints of a specific sample. In addition, the tool provides surface visualization to extract hidden topological features within the material. Together with the rich set of parameters and options to control the visualization, MaterialVis allows users to visualize very efficiently various aspects of materials as generated by modern analytical techniques such as Atom Probe Tomography. (C) 2014 Elsevier Inc. All rights reserved.

Bilkent University Institutional Repository

Shared-Memory Parallel Maximal Clique Enumeration

Author: Das Apurba
Sanei-Mehri Seyed-Vahid
Tirthapura Srikanta
Publication date: 01/01/2018

We present shared-memory parallel methods for Maximal Clique Enumeration (MCE) from a graph. MCE is a fundamental and well-studied graph analytics task, and is a widely used primitive for identifying dense structures in a graph. Due to its computationally intensive nature, parallel methods are imperative for dealing with large graphs. However, surprisingly, there do not yet exist scalable and parallel methods for MCE on a shared-memory parallel machine. In this work, we present efficient shared-memory parallel algorithms for MCE, with the following properties: (1) the parallel algorithms are provably work-efficient relative to a state-of-the-art sequential algorithm, (2) the algorithms have a provably small parallel depth, showing that they can scale to a large number of processors, and (3) our implementations on a multicore machine show good speedup and scaling behavior with an increasing number of cores, and are substantially faster than prior shared-memory parallel algorithms for MCE.
Comment: 10 pages, 3 figures, proceedings of the 25th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC), 2018
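
For readers unfamiliar with the task, the following is a minimal sequential sketch of maximal clique enumeration using the classical Bron-Kerbosch recursion with pivoting. It illustrates the problem only and is not the parallel algorithm from the paper. The function name `maximal_cliques` and the adjacency-dictionary input format are assumptions made for this example.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques of an undirected graph.

    adj: dict mapping each vertex to the set of its neighbours
    (no self-loops; edges must appear in both directions).
    Returns a list of frozensets, one per maximal clique.
    Sequential Bron-Kerbosch with pivoting, shown as a baseline sketch.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend r,
        # x: vertices already explored (used to avoid duplicates).
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Choose the pivot with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# A triangle 1-2-3 with a pendant edge 3-4.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
for clique in maximal_cliques(graph):
    print(sorted(clique))
```

The recursive calls inside the loop are independent of one another once their candidate and excluded sets are fixed, which is the kind of structure a parallel method can exploit.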

arXiv.org e-Print Archive

Digital Repository @ Iowa State University (ISU)

Crossref

Position-Dependent Arrays and Their Application for High Performance Code Generation

Author: Dubach Christophe
Pizzuti Federico
Steuwer Michel
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 18/08/2019

Edinburgh Research Explorer

Hofstadter butterflies of carbon nanotubes: Pseudofractality of the magnetoelectronic spectrum

Author: A. Bachtold
E. Thune
Gianaurelio Cuniberti
H. Ajiki
J. van der Hoeven
J. Yi
M. S. Dresselhaus
N. Bezroukov
Norbert Nemec
P. Lambin
R. Rammal
R. Saito
S. Datta
S. Reich
T. E. Oliphant
V. Moldoveanu
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2006

The electronic spectrum of a two-dimensional square lattice in a perpendicular magnetic field has become known as the Hofstadter butterfly [Hofstadter, Phys. Rev. B 14, 2239 (1976)]. We have calculated quasi-one-dimensional analogs of the Hofstadter butterfly for carbon nanotubes (CNTs). For the case of single-wall CNTs, it is straightforward to implement magnetic fields parallel to the tube axis by means of zone folding in the graphene reciprocal lattice. We have also studied perpendicular magnetic fields which, in contrast to the parallel case, lead to a much richer, pseudofractal spectrum. Moreover, we have investigated magnetic fields piercing double-wall CNTs and found strong signatures of interwall interaction in the resulting Hofstadter butterfly spectrum, which can be understood with the help of a minimal model. Common to all perpendicular magnetic field spectra is the presence of cusp catastrophes at specific values of energy and magnetic field. Resolving the density of states along the tube circumference allows recognition of the snake states already predicted for nonuniform magnetic fields in the two-dimensional electron gas. An analytic model of the magnetic spectrum of electrons on a cylindrical surface is used to explain some of the results.
Comment: 14 pages, 12 figures; updated to published version

arXiv.org e-Print Archive

University of Regensburg Publication Server

Crossref