Multi-criteria scheduling of pipeline workflows
Mapping workflow applications onto parallel platforms is a challenging
problem, even for simple application patterns such as pipeline graphs. Several
antagonistic criteria should be optimized, such as throughput and latency (or a
combination). In this paper, we study the complexity of the bi-criteria mapping
problem for pipeline graphs on communication homogeneous platforms. In
particular, we assess the complexity of the well-known chains-to-chains problem
for different-speed processors, which turns out to be NP-hard. We provide
several efficient polynomial bi-criteria heuristics, and their relative
performance is evaluated through extensive simulations.
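
To make the chains-to-chains problem mentioned above concrete, here is a minimal sketch of its classical version with identical processors: split a chain of task weights into at most p contiguous intervals so that the heaviest interval is as light as possible. This is not the paper's method. The abstract concerns different-speed processors (shown there to be NP-hard) and bi-criteria heuristics, and communication costs are ignored here. The function name `chains_to_chains` and the dynamic-programming formulation are illustrative assumptions.

```python
from itertools import accumulate


def chains_to_chains(weights, p):
    """Split `weights` into at most `p` contiguous, non-empty intervals,
    minimizing the largest interval sum (identical processors assumed).

    Returns (bottleneck, intervals); each interval is (start, end), end exclusive.
    """
    n = len(weights)
    prefix = [0] + list(accumulate(weights))
    inf = float("inf")
    # best[k][i]: smallest bottleneck for the first i tasks using exactly k intervals
    best = [[inf] * (n + 1) for _ in range(p + 1)]
    cut = [[0] * (n + 1) for _ in range(p + 1)]
    best[0][0] = 0
    for k in range(1, p + 1):
        for i in range(1, n + 1):
            for j in range(k - 1, i):  # last interval covers tasks j .. i-1
                cand = max(best[k - 1][j], prefix[i] - prefix[j])
                if cand < best[k][i]:
                    best[k][i] = cand
                    cut[k][i] = j
    # Using fewer intervals is allowed, so keep the best interval count.
    k = min(range(1, p + 1), key=lambda q: best[q][n])
    bottleneck = best[k][n]
    intervals = []
    i = n
    while k > 0:  # walk the recorded cut points backwards
        j = cut[k][i]
        intervals.append((j, i))
        i, k = j, k - 1
    intervals.reverse()
    return bottleneck, intervals


print(chains_to_chains([2, 3, 4, 5, 6], 2))
```

Under these simplifying assumptions the returned bottleneck plays the role of the pipeline period (the inverse of the throughput). The cubic-time table is chosen for readability rather than speed.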
OBDD-Based Representation of Interval Graphs
A graph can be described by the characteristic function of the
edge set which maps a pair of binary encoded nodes to 1 iff the nodes
are adjacent. Using Ordered Binary Decision Diagrams (OBDDs) to store
this function can lead to a (hopefully) compact representation. Given the OBDD as an
input, symbolic/implicit OBDD-based graph algorithms can solve optimization
problems by mainly using functional operations, e.g. quantification or binary
synthesis. While the OBDD representation size cannot be small in general, it
can be provably small for special graph classes and then also lead to fast
algorithms. In this paper, we show that the OBDD size of unit interval graphs
is […] and the OBDD size of interval graphs is $O(|V| \log |V|)$ […]
$\Omega(|V| \log |V|)$ […] $O(\log |V|)$ […] $O(\log^2 |V|)$ operations and
evaluate the algorithms empirically.
Comment: 29 pages, accepted for 39th International Workshop on Graph-Theoretic
Concepts 201
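
The characteristic function of the edge set described in this abstract can be illustrated with a small sketch. It builds an interval graph from closed intervals and returns a function that maps two binary-encoded nodes to 1 iff they are adjacent. This is only an explicit truth-table stand-in, not an OBDD and not the paper's variable order or node labeling. All names (`interval_graph_edges`, `to_bits`, `characteristic_function`) are hypothetical.

```python
def interval_graph_edges(intervals):
    """Edges of the interval graph: i and j are adjacent iff their closed intervals overlap."""
    edges = set()
    for i, (a, b) in enumerate(intervals):
        for j, (c, d) in enumerate(intervals):
            if i < j and a <= d and c <= b:
                edges.add((i, j))
    return edges


def to_bits(node, width):
    """Binary encoding of a node index, most significant bit first."""
    return tuple((node >> k) & 1 for k in reversed(range(width)))


def characteristic_function(intervals):
    """Return (chi, width), where chi(x_bits, y_bits) is 1 iff the encoded nodes are adjacent."""
    n = len(intervals)
    width = max(1, (n - 1).bit_length())  # bits needed to encode node indices 0 .. n-1
    accepted = set()
    for u, v in interval_graph_edges(intervals):
        # The graph is undirected, so accept both argument orders.
        accepted.add(to_bits(u, width) + to_bits(v, width))
        accepted.add(to_bits(v, width) + to_bits(u, width))

    def chi(x_bits, y_bits):
        return 1 if tuple(x_bits) + tuple(y_bits) in accepted else 0

    return chi, width


chi, width = characteristic_function([(0, 2), (1, 3), (4, 5)])
print(width, chi((0, 0), (0, 1)), chi((0, 0), (1, 0)))
```

An OBDD would represent the same Boolean function as a shared decision diagram over the 2·width input bits, which is where the size bounds discussed in the abstract apply.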
Efficient Utilization of Fine-Grained Parallelism using a microHeterogeneous Environment
The goal of this thesis is to propose a new computing paradigm, called microHeterogeneous computing or mHC, which incorporates PCI (or other high-speed local system bus) based processing elements (vector processors, digital signal processors, etc.) into a general-purpose machine. In this manner the benefits of heterogeneous computing on scientific applications can be achieved while avoiding some of the limitations. Overall performance is increased by exploiting fine-grained parallelism on the most efficient architecture available, while reducing the high communication overhead and costs of traditional heterogeneous environments. Furthermore, mHC-based machines can be combined into a cluster, allowing both the coarse-grained and fine-grained parallelism to be fully exploited in order to achieve even greater levels of performance. An existing high-performance computing API (GSL) was chosen as the interface to the system to allow for easy integration with applications that were previously developed using this API. The ensuing chapters will provide the motivation for this work, an overview of heterogeneous computing, and the details pertaining to microHeterogeneous computing. The framework implemented to demonstrate a microHeterogeneous computing environment will be examined as well as the results. Finally, the future of microHeterogeneous computing will be discussed.
Determining the optimal redistribution
The classical redistribution problem aims at optimally scheduling communications when moving from an initial data distribution $D_{ini}$ to a target distribution $D_{tar}$ where each processor will host a subset of data items. However, modern computing platforms are equipped with a powerful interconnection switch, and the cost of a given communication is (almost) independent of the location of its sender and receiver. This leads to generalizing the redistribution problem as follows: find the optimal permutation of processors such that […] will host the set […], and for which the cost of the redistribution is minimal. This report studies the complexity of this generalized problem. We provide optimal algorithms and evaluate their gain over classical redistribution through simulations. We also show the NP-hardness of the problem of finding the optimal data partition and processor permutation (defined by new subsets […]) that minimize the cost of redistribution followed by a simple computation kernel.
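
The generalized redistribution problem can be sketched on a toy instance: instead of fixing which processor receives which target subset, try every assignment and keep the cheapest. The cost model below (total number of data items that have to move) is an assumption made for illustration, since the abstract does not spell out its cost functions. The exhaustive search is not one of the report's optimal algorithms and only works for a handful of processors. The names `redistribution_cost` and `best_permutation` are hypothetical.

```python
from itertools import permutations


def redistribution_cost(initial, target, sigma):
    """Assumed cost: number of items processor i must receive when it hosts target[sigma[i]]."""
    return sum(len(target[sigma[i]] - initial[i]) for i in range(len(initial)))


def best_permutation(initial, target):
    """Exhaustively search the assignments of target subsets to processors."""
    best_cost, best_sigma = None, None
    for sigma in permutations(range(len(initial))):
        cost = redistribution_cost(initial, target, sigma)
        if best_cost is None or cost < best_cost:
            best_cost, best_sigma = cost, sigma
    return best_cost, best_sigma


initial = [{1, 2}, {3, 4}, {5, 6}]   # items currently held by processors 0, 1, 2
target = [{3, 4}, {5, 6}, {1, 2}]    # target subsets, not yet tied to a processor

print(redistribution_cost(initial, target, (0, 1, 2)))  # classical: subset i goes to processor i
print(best_permutation(initial, target))
```

In this instance the classical fixed assignment moves every item, while the best permutation moves none, which is the kind of gain the generalized formulation is after.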