
1,102 research outputs found

Automated problem scheduling and reduction of synchronization delay effects

Author: Saltz Joel H.

It is anticipated that, to make effective use of many future high-performance architectures, programs will have to exhibit at least medium-grained parallelism. A framework is presented for partitioning very sparse triangular systems of linear equations that is designed to produce favorable performance results on a wide variety of parallel architectures. Efficient methods for solving these systems are of interest because (1) they provide a useful model problem for exploring heuristics for the aggregation, mapping, and scheduling of relatively fine-grained computations whose data dependencies are specified by directed acyclic graphs, and (2) such methods can find direct application in the development of parallel algorithms for scientific computation. Simple expressions are derived that describe how to schedule computational work with varying degrees of granularity. The Encore Multimax was used as a hardware simulator to investigate the performance effects of using the presented partitioning techniques in shared-memory architectures with varying relative synchronization costs.
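As a rough illustration of scheduling a computation whose data dependencies form a directed acyclic graph, the sketch below groups the rows of a sparse lower-triangular system into "wavefronts" that could be solved in parallel. This is a generic level-scheduling technique, not the partitioning framework of the report; the function name `level_schedule` and the `deps` input format are assumptions made for this example.

```python
from collections import defaultdict


def level_schedule(deps):
    """Group rows of a sparse lower-triangular system into wavefronts.

    deps[i] lists the rows j < i whose solution x[j] is needed to compute
    x[i] (the off-diagonal nonzeros of row i). Rows placed in the same
    wavefront do not depend on each other, so they can be solved in parallel.
    The input format is a hypothetical one chosen for this sketch.
    """
    level = [0] * len(deps)
    for i in range(len(deps)):  # rows are already in topological order
        level[i] = 1 + max((level[j] for j in deps[i]), default=-1)

    waves = defaultdict(list)
    for i, lv in enumerate(level):
        waves[lv].append(i)
    return [waves[lv] for lv in sorted(waves)]


# Rows 0 and 1 are independent; row 5 needs rows 3 and 4.
deps = [[], [], [0], [0, 1], [2], [3, 4]]
print(level_schedule(deps))  # [[0, 1], [2, 3], [4], [5]]
```

Each wavefront boundary is a synchronization point, so coarser groupings trade parallelism against synchronization cost, which is the kind of tradeoff the abstract describes.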

NASA Technical Reports Server

Effects of partitioning and scheduling sparse matrix factorization on communication and load balance

Author: Naik Vijay K.
Venugopal Sesh

A block-based, automatic partitioning and scheduling methodology is presented for sparse matrix factorization on distributed-memory systems. Using experimental results, this technique is analyzed for communication and load imbalance overhead. To study the performance effects, these overheads were compared with those obtained from a straightforward 'wrap-mapped' column assignment scheme. All experimental results were obtained using test sparse matrices from the Harwell-Boeing data set. The results show that there is a tradeoff between communication and load balance: the block-based method results in lower communication cost, whereas the wrap-mapped scheme gives better load balance.
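The load-balance side of this tradeoff can be seen in a small sketch. It compares a wrap (cyclic) column assignment with a contiguous block assignment, which is only a simplified stand-in for the paper's block-based method. The per-column work values and all function names are assumptions made for this example.

```python
def wrap_map(n_cols, n_procs):
    """Wrap (cyclic) mapping: column j goes to processor j mod P."""
    return [j % n_procs for j in range(n_cols)]


def block_map(n_cols, n_procs):
    """Contiguous mapping: consecutive columns share a processor."""
    size = -(-n_cols // n_procs)  # ceiling division
    return [j // size for j in range(n_cols)]


def load_per_proc(owner, work, n_procs):
    """Sum the work of the columns owned by each processor."""
    loads = [0] * n_procs
    for j, p in enumerate(owner):
        loads[p] += work[j]
    return loads


# Hypothetical work per column: earlier columns are more expensive,
# as in a dense lower-triangular factor.
work = [8, 7, 6, 5, 4, 3, 2, 1]
print(load_per_proc(wrap_map(8, 2), work, 2))   # [20, 16]
print(load_per_proc(block_map(8, 2), work, 2))  # [26, 10]
```

With this made-up workload the wrap mapping spreads work more evenly, while the contiguous mapping keeps neighboring columns on one processor, which tends to reduce communication.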

NASA Technical Reports Server

CASCH: a tool for computer-aided scheduling

Author: Ahmad I
Kwok YK
Shu W
Wu MY
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2000

A software tool called Computer-Aided Scheduling (CASCH) for parallel processing on distributed-memory multiprocessors in a complete parallel programming environment is presented. A compiler automatically converts sequential applications into parallel codes to perform program parallelization. The parallel code that executes on a target machine is optimized by CASCH through proper scheduling and mapping.

HKU Scholars Hub

Taming Numbers and Durations in the Model Checking Integrated Planning System

Author: Edelkamp S.
Publication venue: 'AI Access Foundation'
Publication date: 30/06/2011

The Model Checking Integrated Planning System (MIPS) is a temporal least commitment heuristic search planner based on a flexible object-oriented workbench architecture. Its design clearly separates explicit and symbolic directed exploration algorithms from the set of on-line and off-line computed estimates and associated data structures. MIPS has shown distinguished performance in the last two international planning competitions. In the last event the description language was extended from pure propositional planning to include numerical state variables, action durations, and plan quality objective functions. Plans were no longer sequences of actions but time-stamped schedules. As a participant of the fully automated track of the competition, MIPS has proven to be a general system; in each track and every benchmark domain it efficiently computed plans of remarkable quality. This article introduces and analyzes the most important algorithmic novelties that were necessary to tackle the new layers of expressiveness in the benchmark problems and to achieve a high level of performance. The extensions include critical path analysis of sequentially generated plans to generate corresponding optimal parallel plans. The linear time algorithm to compute the parallel plan bypasses known NP hardness results for partial ordering by scheduling plans with respect to the set of actions and the imposed precedence relations. The efficiency of this algorithm also allows us to improve the exploration guidance: for each encountered planning state the corresponding approximate sequential plan is scheduled. One major strength of MIPS is its static analysis phase that grounds and simplifies parameterized predicates, functions and operators, that infers knowledge to minimize the state description length, and that detects domain object symmetries. The latter aspect is analyzed in detail. 
MIPS has been developed to serve as a complete and optimal state space planner, with admissible estimates, exploration engines and branching cuts. In the competition version, however, certain performance compromises had to be made, including floating point arithmetic, weighted heuristic search exploration according to an inadmissible estimate and parameterized optimization.
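The idea of turning a sequential plan into a parallel, time-stamped schedule by critical path analysis can be sketched as follows. This is a generic earliest-start computation under assumed inputs, not the MIPS implementation; the name `schedule_plan` and the encoding of precedence relations as index pairs are hypothetical.

```python
def schedule_plan(durations, precedes):
    """Assign earliest start times to the actions of a sequential plan.

    durations[j] is the duration of the j-th action in the sequential plan.
    precedes is a set of pairs (i, j) with i < j, meaning action i must
    finish before action j starts. Both formats are assumptions of this
    sketch. The running time is linear in the number of actions plus the
    number of precedence relations.
    """
    preds = [[] for _ in durations]
    for i, j in precedes:
        preds[j].append(i)

    start = [0.0] * len(durations)
    for j in range(len(durations)):  # sequential order is a topological order
        start[j] = max((start[i] + durations[i] for i in preds[j]), default=0.0)

    makespan = max((s + d for s, d in zip(start, durations)), default=0.0)
    return start, makespan


# Actions 0 and 1 can overlap; action 2 waits for both; action 3 follows 2.
start, makespan = schedule_plan([3, 2, 4, 1], {(0, 2), (1, 2), (2, 3)})
print(start, makespan)  # [0.0, 0.0, 3.0, 7.0] 8.0
```

Executed one after another, the four actions would take 10 time units; the schedule above finishes in 8 because the first two actions run concurrently.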

arXiv.org e-Print Archive

Crossref

Dagstuhl Reports : Volume 1, Issue 2, February 2011

Author: Schloss Dagstuhl Leibniz-Zentrum für Informatik
Publication date: 09/09/2011

Online Privacy: Towards Informational Self-Determination on the Internet (Dagstuhl Perspectives Workshop 11061): Simone Fischer-Hübner, Chris Hoofnagle, Kai Rannenberg, Michael Waidner, Ioannis Krontiris and Michael Marhöfer
Self-Repairing Programs (Dagstuhl Seminar 11062): Mauro Pezzé, Martin C. Rinard, Westley Weimer and Andreas Zeller
Theory and Applications of Graph Searching Problems (Dagstuhl Seminar 11071): Fedor V. Fomin, Pierre Fraigniaud, Stephan Kreutzer and Dimitrios M. Thilikos
Combinatorial and Algorithmic Aspects of Sequence Processing (Dagstuhl Seminar 11081): Maxime Crochemore, Lila Kari, Mehryar Mohri and Dirk Nowotka
Packing and Scheduling Algorithms for Information and Communication Services (Dagstuhl Seminar 11091): Klaus Jansen, Claire Mathieu, Hadas Shachnai and Neal E. Youn

Hochschulschriftenserver - Universität Frankfurt am Main

Modeling and Mapping of Optimized Schedules for Embedded Signal Processing Systems

Author: Wu Hsiang-Huang
Publication date: 01/01/2013

The demand for Digital Signal Processing (DSP) in embedded systems has been increasing rapidly due to the proliferation of multimedia- and communication-intensive devices such as pervasive tablets and smart phones. Efficient implementation of embedded DSP systems requires integration of diverse hardware and software components, as well as dynamic workload distribution across heterogeneous computational resources. The former implies increased complexity of application modeling and analysis, but also brings enhanced potential for achieving improved energy consumption, cost or performance. The latter results from the increased use of dynamic behavior in embedded DSP applications. Furthermore, parallel programming is highly relevant in many embedded DSP areas due to the development and use of Multiprocessor System-On-Chip (MPSoC) technology. The need for efficient cooperation among different devices supporting diverse parallel embedded computations motivates high-level modeling that expresses dynamic signal processing behaviors and supports efficient task scheduling and hardware mapping. Starting with dynamic modeling, this thesis develops a systematic design methodology that supports functional simulation and hardware mapping of dynamic reconfiguration based on Parameterized Synchronous Dataflow (PSDF) graphs. By building on the DIF (Dataflow Interchange Format), which is a design language and associated software package for developing and experimenting with dataflow-based design techniques for signal processing systems, we have developed a novel tool for functional simulation of PSDF specifications. This simulation tool allows designers to model applications in PSDF and simulate their functionality, including use of the dynamic parameter reconfiguration capabilities offered by PSDF. With the help of this simulation tool, our design methodology helps to map PSDF specifications into efficient implementations on field programmable gate arrays (FPGAs). 
Furthermore, valid schedules can be derived from the PSDF models at runtime to adapt hardware configurations based on changing data characteristics or operational requirements. Under certain conditions, efficient quasi-static schedules can be applied to reduce overhead and enhance predictability in the scheduling process. Motivated by the fact that scheduling is critical to performance and to efficient use of dynamic reconfiguration, we have focused on a methodology for schedule design, which complements the emphasis on automated schedule construction in the existing literature on dataflow-based design and implementation. In particular, we have proposed a dataflow-based schedule design framework called the dataflow schedule graph (DSG), which provides a graphical framework for schedule construction based on dataflow semantics, and can also be used as an intermediate representation target for automated schedule generation. Our approach to applying the DSG in this thesis emphasizes schedule construction as a design process rather than an outcome of the synthesis process. Our approach employs dataflow graphs for representing both application models and schedules that are derived from them. By providing a dataflow-integrated framework for unambiguously representing, analyzing, manipulating, and interchanging schedules, the DSG facilitates effective codesign of dataflow-based application models and schedules for execution of these models. As multicore processors are deployed in an increasing variety of embedded image processing systems, effective utilization of resources such as multiprocessor system-on-chip (MPSoC) devices, and effective handling of implementation concerns such as memory management and I/O become critical to developing efficient embedded implementations. However, the diversity and complexity of applications and architectures in embedded image processing systems make the mapping of applications onto MPSoCs difficult. 
We help to address this challenge through a structured design methodology that is built upon the DSG modeling framework. We refer to this methodology as the DEIPS methodology (DSG-based design and implementation of Embedded Image Processing Systems). The DEIPS methodology provides a unified framework for joint consideration of DSG structures and the application graphs from which they are derived, which allows designers to integrate considerations of parallelization and resource constraints together with the application modeling process. We demonstrate the DEIPS methodology through case studies on practical embedded image processing systems.

CiteSeerX

Digital Repository at the University of Maryland

Parameterized Rural Postman Problem

Author: Gutin Gregory
Wahlstrom Magnus
Yeo Anders
Publication date: 31/03/2014

The Directed Rural Postman Problem (DRPP) can be formulated as follows: given a strongly connected directed multigraph $D=(V,A)$ with nonnegative integral weights on the arcs, a subset $R$ of $A$ and a nonnegative integer $\ell$, decide whether $D$ has a closed directed walk containing every arc of $R$ and of total weight at most $\ell$. Let $k$ be the number of weakly connected components in the subgraph of $D$ induced by $R$. Sorge et al. (2012) ask whether the DRPP is fixed-parameter tractable (FPT) when parameterized by $k$, i.e., whether there is an algorithm of running time $O^*(f(k))$ where $f$ is a function of $k$ only and the $O^*$ notation suppresses polynomial factors. Sorge et al. (2012) note that this question is of significant practical relevance and has been open for more than thirty years. Using an algebraic approach, we prove that DRPP has a randomized algorithm of running time $O^*(2^k)$ when $\ell$ is bounded by a polynomial in the number of vertices in $D$. We also show that the same result holds for the undirected version of DRPP, where $D$ is a connected undirected multigraph.
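To make the problem statement concrete, the sketch below checks whether a proposed walk is a valid certificate for a DRPP instance: it must be closed, contain every required arc, and have total weight at most the bound. It only verifies a given walk and does not implement the randomized algorithm of the paper; the data layout (arc ids mapped to tail, head, weight) and the name `is_drpp_certificate` are assumptions made for this example.

```python
def is_drpp_certificate(arcs, required, walk, ell):
    """Check a candidate solution of a DRPP instance.

    arcs:     dict mapping an arc id to (tail, head, weight); ids allow
              parallel arcs, as in a multigraph (hypothetical layout).
    required: set of arc ids, the subset R.
    walk:     list of arc ids in traversal order.
    ell:      the weight bound.
    """
    if not walk:
        return not required and ell >= 0

    # Consecutive arcs must chain head-to-tail, including last back to first.
    for a, b in zip(walk, walk[1:] + walk[:1]):
        if arcs[a][1] != arcs[b][0]:
            return False

    total = sum(arcs[a][2] for a in walk)
    return required <= set(walk) and total <= ell


arcs = {"a": (0, 1, 2), "b": (1, 2, 1), "c": (2, 0, 3), "d": (1, 0, 5)}
required = {"a", "b"}
print(is_drpp_certificate(arcs, required, ["a", "b", "c"], 6))  # True
print(is_drpp_certificate(arcs, required, ["a", "b", "c"], 5))  # False: weight 6 > 5
print(is_drpp_certificate(arcs, required, ["a", "d"], 10))      # False: arc b missing
```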

arXiv.org e-Print Archive

CiteSeerX

Recognizing finite repetitive scheduling patterns in manufacturing systems

Author: Hendriks M.
Nieuwelaar van den, N.J.M.
Vaandrager F.W.
Publication date: 01/01/2003

Optimization of timing behaviour of manufacturing systems can be regarded as a scheduling problem in which tasks model the various production processes. Typical for many manufacturing systems is that (collections of) tasks can be associated with manufacturing entities, which can be structured hierarchically. Execution of production processes for several instances of these entities results in nested finite repetitions, which blows up the size of the task graph that is needed for the specification of the scheduling problem, and, in an even worse way, the number of possible schedules. We present a subclass of UML activity diagrams which is generic for the number of repetitions, and therefore suitable for the compact specification of task graphs for these manufacturing systems. The approach to reduce the complexity of the scheduling problem exploits the repetitive patterns. It reduces the original problem to a problem containing the minimum amount of identical repetitions, and after scheduling of this much smaller problem the schedule is expanded to the original size. We demonstrate our technique on a real-life example from the semiconductor industry.

Repository TU/e

Pure OAI Repository