Search CORE

69 research outputs found

Restructuring Expression Dags for Efficient Parallelization

Author: Wilhelm Martin
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 17th International Symposium on Experimental Algorithms (SEA 2018)
Publication date: 01/01/2018
Field of study

In the field of robust geometric computation it is often necessary to make exact decisions based on inexact floating-point arithmetic. One common approach is to store the computation history in an arithmetic expression dag and to re-evaluate the expression with increasing precision until an exact decision can be made. We show that exact-decisions number types based on expression dags can be evaluated faster in practice through parallelization on multiple cores. We compare the impact of several restructuring methods for the expression dag on its running time in a parallel environment

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Power and memory optimization techniques in embedded systems design

Author: Allam Atef Khalil
Publication venue: LSU Digital Commons
Publication date: 01/01/2005
Field of study

Embedded systems incur tight constraints on power consumption and memory (which impacts size) in addition to other constraints such as weight and cost. This dissertation addresses two key factors in embedded system design, namely minimization of power consumption and memory requirement. The first part of this dissertation considers the problem of optimizing power consumption (peak power as well as average power) in high-level synthesis (HLS). The second part deals with memory usage optimization mainly targeting a restricted class of computations expressed as loops accessing large data arrays that arises in scientific computing such as the coupled cluster and configuration interaction methods in quantum chemistry. First, a mixed-integer linear programming (MILP) formulation is presented for the scheduling problem in HLS using multiple supply-voltages in order to optimize peak power as well as average power and energy consumptions. For large designs, the MILP formulation may not be suitable; therefore, a two-phase iterative linear programming formulation and a power-resource-saving heuristic are presented to solve this problem. In addition, a new heuristic that uses an adaptation of the well-known force-directed scheduling heuristic is presented for the same problem. Next, this work considers the problem of module selection simultaneously with scheduling for minimizing peak and average power consumption. Then, the problem of power consumption (peak and average) in synchronous sequential designs is addressed. A solution integrating basic retiming and multiple-voltage scheduling (MVS) is proposed and evaluated. A two-stage algorithm namely power-oriented retiming followed by a MVS technique for peak and/or average power optimization is presented. Memory optimization is addressed next. Dynamic memory usage optimization during the evaluation of a special class of interdependent large data arrays is considered. Finally, this dissertation develops a novel integer-linear programming (ILP) formulation for static memory optimization using the well-known fusion technique by encoding of legality rules for loop fusion of a special class of loops using logical constraints over binary decision variables and a highly effective approximation of memory usage

Louisiana State University

Compiling array computations for the Fresh Breeze Parallel Processor

Author: Ginzburg Igor Arkadiy
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2007
Field of study

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007.Includes bibliographical references (p. 80).Fresh Breeze is a highly parallel architecture currently under development, which strives to provide high performance scientific computing with simple programmability. The architecture provides for multithreaded determinate execution with a write-once shared memory system. In particular, Fresh Breeze data structures must be constructed from directed acyclic graphs of immutable fixed-size chunks of memory, rather than laid out in a mutable linear memory. While this model is well suited for executing functional programs, the goal of this thesis is to see if conventional programs can be efficiently compiled for this novel memory system and parallelization model, focusing specifically on array-based linear algebra computations. We compile a subset of Java, targeting the Fresh Breeze instruction set. The compiler, using a static data-flow graph intermediate representation, performs analysis and transformations which reduce communication with the shared memory and identify opportunities for parallelization.by Igor Arkadiy Ginzburg.M.Eng

DSpace@MIT

Easier Parallel Programming with Provably-Efficient Runtime Schedulers

Author: Utterback Robert
Publication venue: Washington University Open Scholarship
Publication date: 15/08/2017
Field of study

Over the past decade processor manufacturers have pivoted from increasing uniprocessor performance to multicore architectures. However, utilizing this computational power has proved challenging for software developers. Many concurrency platforms and languages have emerged to address parallel programming challenges, yet writing correct and performant parallel code retains a reputation of being one of the hardest tasks a programmer can undertake. This dissertation will study how runtime scheduling systems can be used to make parallel programming easier. We address the difficulty in writing parallel data structures, automatically finding shared memory bugs, and reproducing non-deterministic synchronization bugs. Each of the systems presented depends on a novel runtime system which provides strong theoretical performance guarantees and performs well in practice

Washington University St. Louis: Open Scholarship

Scheduling Algorithms for Parallel Execution of Computer Programs

Author: Samadzadeh Farideh Ansari-jafari
Publication venue: 'Oklahoma State University Library'
Publication date: 01/07/1992
Field of study

Computer Scienc

SHAREOK repository

Dataflow Programming Paradigms for Computational Chemistry Methods

Author: Jagode Heike
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/05/2017
Field of study

The transition to multicore and heterogeneous architectures has shaped the High Performance Computing (HPC) landscape over the past decades. With the increase in scale, complexity, and heterogeneity of modern HPC platforms, one of the grim challenges for traditional programming models is to sustain the expected performance at scale. By contrast, dataflow programming models have been growing in popularity as a means to deliver a good balance between performance and portability in the post-petascale era. This work introduces dataflow programming models for computational chemistry methods, and compares different dataflow executions in terms of programmability, resource utilization, and scalability. This effort is driven by computational chemistry applications, considering that they comprise one of the driving forces of HPC. In particular, many-body methods, such as Coupled Cluster methods (CC), which are the gold standard to compute energies in quantum chemistry, are of particular interest for the applied chemistry community. On that account, the latest development for CC methods is used as the primary vehicle for this research, but our effort is not limited to CC and can be applied across other application domains. Two programming paradigms for expressing CC methods into a dataflow form, in order to make them capable of utilizing task scheduling systems, are presented. Explicit dataflow, is the programming model where the dataflow is explicitly specified by the developer, is contrasted with implicit dataflow, where a task scheduling runtime derives the dataflow. An abstract model is derived to explore the limits of the different dataflow programming paradigms

University of Tennessee, Knoxville: Trace