2,986 research outputs found
Executing Bag of Distributed Tasks on the Cloud: Investigating the Trade-offs Between Performance and Cost
Bag of Distributed Tasks (BoDT) can benefit from decentralised execution on
the Cloud. However, there is a trade-off between the performance that can be
achieved by employing a large number of Cloud VMs for the tasks and the
monetary constraints that are often placed by a user. The research reported in
this paper is motivated towards investigating this trade-off so that an optimal
plan for deploying BoDT applications on the cloud can be generated. A heuristic
algorithm, which considers the user's preference of performance and cost is
proposed and implemented. The feasibility of the algorithm is demonstrated by
generating execution plans for a sample application. The key result is that the
algorithm generates optimal execution plans for the application over 91\% of
the time
Energy-aware scheduling of bag-of-tasks applications on master-worker platforms
International audienceWe consider the problem of scheduling an application composed of independent tasks on a fully heterogeneous master-worker platform with communication costs. We introduce a bi-criteria approach aiming at maximizing the throughput of the application while minimizing the energy consumed by participating resources. Assuming arbitrary super-linear power consumption laws, we investigate different models, with energy overheads and memory constraints. Building upon closed-form expressions for the uni-processor case, we derive asymptotically optimal solutions for all models
Energy-aware scheduling in heterogeneous computing systems
In the last decade, the grid computing systems emerged as useful provider of the computing power required for solving complex problems.
The classic formulation of the scheduling problem in heterogeneous computing systems is NP-hard, thus approximation techniques are required for solving real-world scenarios of this problem. This thesis tackles the
problem of scheduling tasks in a heterogeneous computing environment in reduced execution times, considering the schedule length and the total energy consumption as the optimization objectives. An efficient multithreading local search algorithm for solving the multi-objective scheduling problem in heterogeneous computing systems, named MEMLS, is presented. The proposed method follows a fully multi-objective approach, applying a Pareto-based dominance search that is executed in parallel by using several threads. The experimental analysis demonstrates that the new multithreading algorithm outperforms a set of fast and accurate two-phase deterministic heuristics based on the traditional MinMin. The new ME-MLS method is able to achieve significant improvements in both makespan and energy consumption objectives in reduced execution times for a large set of testbed instances, while exhibiting very good scalability. The ME-MLS was evaluated solving instances
comprised of up to 2048 tasks and 64 machines. In order to scale the dimension of the problem instances even further and tackle large-sized problem instances, the Graphical Processing Unit (GPU) architecture is considered. This line of future work has been initially tackled with the gPALS: a hybrid CPU/GPU local search algorithm for
efficiently tackling a single-objective heterogeneous computing scheduling problem. The gPALS shows very promising results, being able to tackle instances of up to 32768 tasks and 1024 machines in reasonable
execution times.En la última década, los sistemas de computación grid se han convertido en útiles proveedores de la capacidad de cálculo necesaria para la resolución de problemas complejos. En su formulación clásica, el problema de
la planificación de tareas en sistemas heterogéneos es un problema NP difícil, por lo que se requieren técnicas de resolución aproximadas para atacar instancias de tamaño realista de este problema. Esta tesis aborda
el problema de la planificación de tareas en sistemas heterogéneos, considerando el largo de la planificación y el consumo energético como objetivos a optimizar. Para la resolución de este problema se propone un algoritmo de búsqueda local eficiente y multihilo. El método propuesto se trata de un enfoque plenamente multiobjetivo que consiste en la aplicación de una búsqueda basada en dominancia de Pareto que se ejecuta en paralelo mediante el uso de varios hilos de ejecución. El análisis experimental demuestra que el algoritmo multithilado propuesto supera a un conjunto de heurísticas deterministas rápidas y e caces basadas en el algoritmo MinMin tradicional. El nuevo método, ME-MLS, es capaz de lograr mejoras significativas tanto en el largo de la planificación y
como en consumo energético, en tiempos de ejecución reducidos para un gran número de casos de prueba, mientras que exhibe una escalabilidad muy promisoria. El ME-MLS fue evaluado abordando instancias de
hasta 2048 tareas y 64 máquinas. Con el n de aumentar la dimensión de las instancias abordadas y hacer frente a instancias de gran tamaño, se consideró la utilización de la arquitectura provista por las unidades de procesamiento gráfico (GPU). Esta línea de trabajo futuro ha sido abordada inicialmente con el algoritmo gPALS: un algoritmo híbrido CPU/GPU de búsqueda local para la planificación de tareas en en sistemas
heterogéneos considerando el largo de la planificación como único objetivo. La evaluación del algoritmo gPALS ha mostrado resultados muy prometedores, siendo capaz de abordar instancias de hasta 32768
tareas y 1024 máquinas en tiempos de ejecución razonables
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also helps to provide an easy way for new practitioners to understand
this complex area of research.Comment: 46 pages, 16 figures, Technical Repor
Approximation Algorithms for Energy Minimization in Cloud Service Allocation under Reliability Constraints
We consider allocation problems that arise in the context of service
allocation in Clouds. More specifically, we assume on the one part that each
computing resource is associated to a capacity constraint, that can be chosen
using Dynamic Voltage and Frequency Scaling (DVFS) method, and to a probability
of failure. On the other hand, we assume that the service runs as a set of
independent instances of identical Virtual Machines. Moreover, there exists a
Service Level Agreement (SLA) between the Cloud provider and the client that
can be expressed as follows: the client comes with a minimal number of service
instances which must be alive at the end of the day, and the Cloud provider
offers a list of pairs (price,compensation), this compensation being paid by
the Cloud provider if it fails to keep alive the required number of services.
On the Cloud provider side, each pair corresponds actually to a guaranteed
success probability of fulfilling the constraint on the minimal number of
instances. In this context, given a minimal number of instances and a
probability of success, the question for the Cloud provider is to find the
number of necessary resources, their clock frequency and an allocation of the
instances (possibly using replication) onto machines. This solution should
satisfy all types of constraints during a given time period while minimizing
the energy consumption of used resources. We consider two energy consumption
models based on DVFS techniques, where the clock frequency of physical
resources can be changed. For each allocation problem and each energy model, we
prove deterministic approximation ratios on the consumed energy for algorithms
that provide guaranteed probability failures, as well as an efficient
heuristic, whose energy ratio is not guaranteed
- …