Performance evaluation of an open distributed platform for realistic traffic generation
Network researchers have dedicated a notable part of their efforts
to the area of modeling traffic and to the implementation of efficient traffic
generators. We feel that there is a strong demand for traffic generators
capable of reproducing realistic traffic patterns according to theoretical
models while sustaining high performance. This work presents an open
distributed platform for traffic generation, which we call the Distributed
Internet Traffic Generator (D-ITG), capable of producing traffic (network,
transport and application layer) at packet level and of accurately replicating
appropriate stochastic processes for both the inter-departure time (IDT) and
packet size (PS) random variables. We implemented two different versions of
our distributed generator. In the first one, a log server is in charge of
recording the information transmitted by senders and receivers and these
communications are based either on TCP or UDP. In the other one, senders and
receivers make use of the MPI library. In this work a complete performance
comparison among the centralized version and the two distributed versions of
D-ITG is presented
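As a rough illustration of what packet-level generation driven by IDT and PS random variables means, the sketch below draws a departure schedule from two distributions. It is not D-ITG code: the function name `generate_flow`, the exponential IDT, and the uniform PS are assumptions chosen only for the example.

```python
import random

def generate_flow(n_packets, mean_idt_s, ps_min, ps_max, seed=0):
    """Return a list of (departure_time_s, packet_size_bytes) tuples.

    Assumptions for illustration only: IDT is exponential (Poisson
    arrivals) and PS is uniform on [ps_min, ps_max].
    """
    rng = random.Random(seed)
    t = 0.0
    schedule = []
    for _ in range(n_packets):
        # Inter-departure time: gap between consecutive packets.
        t += rng.expovariate(1.0 / mean_idt_s)
        # Packet size: payload length in bytes.
        size = rng.randint(ps_min, ps_max)
        schedule.append((t, size))
    return schedule

# Hypothetical usage: 5 packets, 10 ms mean gap, 64 to 1500 bytes.
for departure, size in generate_flow(5, 0.010, 64, 1500):
    print(f"{departure:.4f} s  {size} bytes")
```

A real generator would then hand each entry to a sender that waits until the departure time and transmits a packet of the given size.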
Practical experience using a computational model for the design of heterogeneous distributed software
Heterogeneous cluster environments are becoming an increasingly popular platform for executing parallel applications. Efficient heterogeneous programs must account for the differences inherent in such an environment. We propose the HBSP(1) model of computation as a framework for developing applications for heterogeneous clusters of workstations. The utility of the model is demonstrated through the design and analysis of the scatter and one-to-all broadcast algorithms. Extensive experimentation illustrates the benefits of using the model for heterogeneous program development. By hiding the non-uniformity of the underlying system, the HBSP(1) model provides a framework that embraces the heterogeneity of the underlying system
Efficient Multicast in Heterogeneous Networks of Workstations
This paper studies the problem of efficient multicast in heterogeneous networks of workstations (HNOWs) using a parameterized communication model [3]. This model associates a sending overhead and a receiving overhead with each node, as well as a network latency parameter. The problem of finding optimal multicasts in this model is known to be NP-complete in the strong sense. Nevertheless, we show that for two different properties that arise in typical HNOWs, provably near-optimal and optimal solutions, respectively, can be found in polynomial time. Specifically, we show the following two results. First, when the ratios of receiving overhead to sending overhead among the nodes are bounded by constants, solutions within a bounded ratio of optimal can be found in time O(n log n). Second, if the number of distinct types of workstations is fixed, then optimal solutions can be found in polynomial time. These results provide a practical means of finding optimal and provably near-optimal multicast schedules in a large class of frequently occurring heterogeneous networks of workstations
Analytical Modeling of High Performance Reconfigurable Computers: Prediction and Analysis of System Performance.
The use of a network of shared, heterogeneous workstations, each harboring a Reconfigurable Computing (RC) system, offers high performance users an inexpensive platform for a wide range of computationally demanding problems. However, effectively using the full potential of these systems can be challenging without knowledge of the system’s performance characteristics. While some performance models exist for shared, heterogeneous workstations, none thus far account for the addition of Reconfigurable Computing systems. This dissertation develops and validates an analytic performance modeling methodology for a class of fork-join algorithms executing on a High Performance Reconfigurable Computing (HPRC) platform. The model includes the effects of the reconfigurable device, application load imbalance, background user load, basic message passing communication, and processor heterogeneity. Three applications of the fork-join class, a Boolean Satisfiability Solver, a Matrix-Vector Multiplication algorithm, and an Advanced Encryption Standard algorithm, are used to validate the model with homogeneous and simulated heterogeneous workstations. A synthetic load is used to validate the model under various loading conditions, including simulating heterogeneity by using background loading to make some workstations appear slower than others. The performance modeling methodology proves to be accurate in characterizing the effects of reconfigurable devices, application load imbalance, background user load and heterogeneity for applications running on shared, homogeneous and heterogeneous HPRC resources. The model error in all cases was found to be less than five percent for application runtimes greater than thirty seconds and less than fifteen percent for runtimes less than thirty seconds. The performance modeling methodology enables us to characterize applications running on shared HPRC resources.
Cost functions are used to impose system usage policies and the results of the modeling methodology are utilized to find the optimal (or near-optimal) set of workstations to use for a given application. The usage policies investigated include determining the computational costs for the workstations and balancing the priority of the background user load with the parallel application. The applications studied fall within the Master-Worker paradigm and are well suited for a grid computing approach. A method for using NetSolve, a grid middleware, with the model and cost functions is introduced whereby users can produce optimal workstation sets and schedules for Master-Worker applications running on shared HPRC resources
Non-Cooperative Scheduling of Multiple Bag-of-Task Applications
Multiple applications that execute concurrently on heterogeneous platforms
compete for CPU and network resources. In this paper we analyze the behavior of
non-cooperative schedulers using the optimal strategy that maximizes their
efficiency while fairness is ensured at a system level ignoring application
characteristics. We limit our study to simple single-level master-worker
platforms and to the case where each scheduler is in charge of a single
application consisting of a large number of independent tasks. The tasks of a
given application all have the same computation and communication requirements,
but these requirements can vary from one application to another. In this
context, we assume that each scheduler aims at maximizing its throughput. We
give a closed-form formula for the equilibrium reached by such a system and study
its performance. We characterize the situations where this Nash equilibrium is
optimal (in the Pareto sense) and show that even though no catastrophic
situation (Braess-like paradox) can occur, such an equilibrium can be
arbitrarily bad for any classical performance measure
Meeting the challenges of decentralized embedded applications using multi-agent systems
Today, embedded applications are becoming large-scale and strongly constrained. They require decentralized embedded intelligence, which creates challenges for embedded systems. A multi-agent approach is well suited to modeling and designing decentralized embedded applications. It is naturally able to take up some of these challenges. But some specific points have to be introduced, enforced or improved in multi-agent approaches to reach all features and all requirements. In this article, we present a study of specific activities that can complement the multi-agent paradigm in the "embedded" context. We use our experience with the DIAMOND method to introduce and illustrate these features and activities
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use
machine-learning-based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grain classification among different
approaches and finally, the influential papers of the field.
Comment: version 5.0 (updated on September 2018). Preprint version of our
accepted journal article at ACM CSUR 2018 (42 pages). This survey will be updated
quarterly here (send me your newly published papers to be added in the
subsequent version). History: Received November 2016; Revised August 2017;
Revised February 2018; Accepted March 2018