    STaRS: A scalable task routing approach to distributed scheduling

    Scheduling many tasks in environments with millions of unreliable nodes is a major challenge. The best-known computing platforms usually rely on a centralized element that manages the entire state of both the nodes and the applications. This limits their scalability and their ability to tolerate failures. A decentralized model can overcome these problems but, to the best of our knowledge, no solution proposed so far offers satisfactory results. In this thesis, we present a decentralized scheduling model with three objectives: to scale to millions of nodes without a disabling loss of performance; to tolerate high failure rates; and to allow the implementation of several scheduling policies for different situations. Our proposal consists of three main elements: a generic data model to represent the availability of execution nodes; an aggregation scheme that propagates this information through a hierarchical network layer; and a forwarding algorithm that, using the aggregated information, routes tasks to the most appropriate execution nodes. These three elements are easily extensible to provide diverse scheduling policies. Specifically, we have implemented five: a policy that simply assigns tasks to idle nodes; a policy that minimizes the overall job completion time; a policy that meets the deadline requirements of "bag-of-tasks" applications; a policy that meets the deadline requirements of "workflow" applications; and a policy that grants each application an equal share of the platform. 
Scalability is achieved through the aggregation scheme, which provides enough availability information to the higher levels of the hierarchy without flooding them, and the forwarding algorithm, which searches for execution nodes in several branches of the hierarchy concurrently. As a consequence, communication costs are bounded and allocation costs show nearly logarithmic behavior with respect to system size. One thousand tasks are allocated in a network of 100,000 nodes in less than 3.5 seconds, so we can consider using our model even with tasks that last only a few minutes. To the best of our knowledge, no similar work has been tested with more than 10,000 nodes. Failures are handled with a best-effort strategy. When the failure of a node is detected, the tasks it was executing are resent by their owners and the availability information it managed is rebuilt by its neighbors. In this way, our model degrades its performance in proportion to the number of failed nodes and then recovers its full functionality. To demonstrate this, we ran tests with average failure rates and with catastrophic failures. Even with nodes failing with a median period of only 5 minutes, our scheduler is able to keep providing service. At the same time, it is able to recover from the failure of a significant fraction of the nodes, as long as the hierarchical network layer that supports the system can withstand it. After verifying that it is feasible to implement policies with very different objectives using our scheduling model, we also tested their performance. We compared each policy with a centralized version that has full knowledge of the state of every execution node. The result is that they perform close to a centralized implementation, even in large-scale environments and with high failure rates.
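
The interplay between the aggregation scheme and the forwarding algorithm can be illustrated with a minimal sketch. It is not the thesis implementation: the `Node` class, the `idle_count` summary and the `route` function are hypothetical names, the availability model is reduced to "idle or busy", and branches are visited one after another, whereas the abstract describes a concurrent search. It only shows the simplest of the five policies, assigning tasks to idle nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A node of the hierarchical layer (hypothetical structure)."""
    name: str
    children: List["Node"] = field(default_factory=list)
    idle: bool = True      # only meaningful for leaves (execution nodes)
    idle_count: int = 0    # aggregated summary: idle leaves in this subtree

    def aggregate(self) -> int:
        """Recompute summaries bottom-up; a parent sees one number per child."""
        if not self.children:
            self.idle_count = 1 if self.idle else 0
        else:
            self.idle_count = sum(child.aggregate() for child in self.children)
        return self.idle_count


def route(node: Node, tasks: List[str], placement: Dict[str, str]) -> List[str]:
    """Forward tasks down the tree, splitting them by each child's summary.

    Fills `placement` (task -> leaf name) and returns the tasks that could
    not be placed in this subtree.
    """
    if not node.children:
        if node.idle and tasks:
            placement[tasks[0]] = node.name
            node.idle = False
            return tasks[1:]
        return tasks
    remaining = tasks
    for child in node.children:
        if not remaining:
            break
        # Send each branch no more tasks than its summary says it can absorb.
        share = remaining[:child.idle_count]
        rest = remaining[child.idle_count:]
        remaining = route(child, share, placement) + rest
    return remaining


# Usage: four execution nodes in two groups, one of them busy.
leaves = [Node(f"n{i}") for i in range(4)]
leaves[1].idle = False
root = Node("root", [Node("g0", leaves[:2]), Node("g1", leaves[2:])])
root.aggregate()
placement: Dict[str, str] = {}
unplaced = route(root, ["t0", "t1", "t2", "t3"], placement)
root.aggregate()  # summaries are stale after routing, so refresh them
```

The point of the sketch is that the root never inspects individual execution nodes; it decides how to split the tasks from one aggregated number per branch, which is what keeps the upper levels from being flooded.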

    Agent-based hierarchical approach for executing bag-of-tasks in clouds

    Numerous unrelated, independent (no inter-task communication) tasks, called “bag-of-tasks” (BoT), can, in contrast to message-passing applications, be highly parallelised and executed in any acceptable order. A common practice when executing bag-of-tasks (BoT) applications is to exploit the master-slave topology. Cloud environments offer some features that facilitate executing BoT applications. One of the approaches to controlling cloud resources is to use agents that can act flexibly in a dynamic environment. Given these assumptions, we designed a combination of these approaches, which can be classified as a distributed, hierarchical solution to the issue of scalable execution of bag-of-tasks. The concept of our system relates to a project that is focused on processing huge quantities of data arriving from a network of sensors over the Internet. Our aim is to create a mechanism for processing such data as a system which executes jobs while exploiting load balancing for cloud resources using, e.g., Eucalyptus. The idea is to create a hybrid architecture which takes advantage of some centralized parts of the system and full distribution in other parts. On the other hand, we balance dependencies between the system components using a hierarchic master-slave structure.
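
A hierarchic master-slave structure for a bag of independent tasks can be sketched as follows. This is only an illustration under stated assumptions, not the system described in the abstract: local thread pools stand in for agents and cloud resources, and the names `run_master` and `run_sub_master`, as well as the round-robin split, are hypothetical choices.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def run_sub_master(task: Callable[[int], int],
                   items: Sequence[int],
                   workers: int) -> List[int]:
    """A sub-master hands its share of the bag to its own pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, items))


def run_master(task: Callable[[int], int],
               bag: Sequence[int],
               sub_masters: int = 2,
               workers_per_sub_master: int = 2) -> List[int]:
    """Top-level master: deal the bag round-robin to sub-masters, then merge.

    The tasks are independent, so the order of the merged results carries
    no meaning; callers should not rely on it.
    """
    shares = [bag[i::sub_masters] for i in range(sub_masters)]
    results: List[int] = []
    with ThreadPoolExecutor(max_workers=sub_masters) as pool:
        futures = [
            pool.submit(run_sub_master, task, share, workers_per_sub_master)
            for share in shares
        ]
        for future in futures:
            results.extend(future.result())
    return results


def square(x: int) -> int:
    """Stand-in for one independent task of the bag."""
    return x * x


squares = run_master(square, list(range(10)))
```

Because no task communicates with any other, the only coordination needed is the split at the top and the merge at the end, which is why the bag can be dealt out in any order.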

    PiCo: A Domain-Specific Language for Data Analytics Pipelines

    In the world of Big Data analytics, there is a series of tools that aim to simplify programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models, for which only informal (and often confusing) semantics is generally provided, all share a common underlying model, namely the Dataflow model. Using this model as a starting point, it is possible to categorize and analyze almost all aspects of Big Data analytics tools from a high-level perspective. This analysis can be considered a first step toward a formal model to be exploited in the design of a (new) framework for Big Data analytics. By putting clear separations between all levels of abstraction (i.e., from the runtime to the user API), it is easier for a programmer or software designer to avoid mixing low-level with high-level aspects, as is often seen in state-of-the-art Big Data analytics frameworks. From the user-level perspective, we think that a clearer and simpler semantics is preferable, together with a strong separation of concerns. For this reason, we use the Dataflow model as a starting point to build a programming environment with a simplified programming model implemented as a Domain-Specific Language, which sits on top of a stack of layers that build a prototypical framework for Big Data analytics. The contribution of this thesis is twofold: first, we show that the proposed model is (at least) as general as existing batch and streaming frameworks (e.g., Spark, Flink, Storm, Google Dataflow), thus making it easier to understand high-level data-processing applications written in such frameworks. As a result of this analysis, we provide a layered model that can represent tools and applications following the Dataflow paradigm, and we show how the analyzed tools fit in each level. 
Second, we propose a programming environment based on this layered model in the form of a Domain-Specific Language (DSL) for processing data collections, called PiCo (Pipeline Composition). The main entity of this programming model is the Pipeline, basically a DAG-composition of processing elements. This model is intended to give the user a unique interface for both stream and batch processing, completely hiding data management and focusing only on operations, which are represented by Pipeline stages. Our DSL will be built on top of the FastFlow library, exploiting both shared and distributed parallelism, and implemented in C++11/14 with the aim of porting C++ into the Big Data world.
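
The idea of composing stages into a Pipeline that treats batch and stream input alike can be sketched in a few lines. This is not PiCo's API (PiCo is a C++ DSL); the `Pipeline` class, its `to` and `run` methods, and the `map_stage` and `filter_stage` helpers are hypothetical names, written in Python for brevity, and the sketch covers only linear chains rather than a general DAG.

```python
from itertools import count, islice
from typing import Callable, Iterable, Iterator, List

# A stage consumes an iterable of items and yields transformed items lazily.
Stage = Callable[[Iterable], Iterator]


def map_stage(fn: Callable) -> Stage:
    """Stage that applies `fn` to every item."""
    def stage(items: Iterable) -> Iterator:
        return (fn(x) for x in items)
    return stage


def filter_stage(pred: Callable) -> Stage:
    """Stage that keeps only the items for which `pred` is true."""
    def stage(items: Iterable) -> Iterator:
        return (x for x in items if pred(x))
    return stage


class Pipeline:
    """An immutable chain of stages; composing returns a new Pipeline."""

    def __init__(self, stages: Iterable[Stage] = ()) -> None:
        self.stages: List[Stage] = list(stages)

    def to(self, other) -> "Pipeline":
        """Append a single stage or all stages of another Pipeline."""
        extra = other.stages if isinstance(other, Pipeline) else [other]
        return Pipeline(self.stages + extra)

    def run(self, source: Iterable) -> Iterator:
        """Apply the stages lazily, so finite and unbounded sources both work."""
        items = iter(source)
        for stage in self.stages:
            items = stage(items)
        return items


pipeline = Pipeline().to(map_stage(lambda x: x + 1)).to(
    filter_stage(lambda x: x % 2 == 0))

batch_result = list(pipeline.run([1, 2, 3, 4]))            # finite collection
stream_result = list(islice(pipeline.run(count(1)), 3))    # unbounded stream
```

The same `pipeline` object is run on a list and on an endless generator, which mirrors the abstract's goal of one interface for batch and stream processing with the data handling hidden behind the stages.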

    Characterizing Power and Energy Efficiency of Legion Data-Centric Runtime and Applications on Heterogeneous High-Performance Computing Systems

    Traditional parallel programming models require programmers to explicitly specify parallelism and data movement for the underlying parallel mechanisms. In contrast to traditional computation-centric programming, Legion provides a data-centric programming model for extracting parallelism and data movement. In this chapter, we aim to characterize the power and energy consumption of running HPC applications on Legion. We run benchmark applications on compute nodes equipped with both CPU and GPU, and measure the execution time, power consumption and CPU/GPU utilization. Additionally, we test the message passing interface (MPI) version of these applications and compare the performance and power consumption of high-performance computing (HPC) applications using the computation-centric and data-centric programming models. Experimental results indicate that Legion applications outperform MPI applications in both performance and energy efficiency, i.e., Legion applications can be 9.17 times as fast as MPI applications and use only 9.2% of the energy. Legion effectively exploits the heterogeneous architecture and runs application tasks on the GPU. As far as we know, this is the first study to examine the power and energy consumption of the Legion programming and runtime infrastructure. Our findings will enable HPC system designers and operators to develop and tune the performance of data-centric HPC applications with constraints on power and energy consumption.
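
Speedup and energy figures are linked through average power, since energy is average power multiplied by execution time. The small helper below makes that relation explicit; `average_power_ratio` is a hypothetical name. Applying it to the two headline figures assumes that both were measured on the same run, which the abstract does not state (they may be best cases from different benchmarks), so the resulting value is only an illustration of the arithmetic.

```python
def average_power_ratio(speedup: float, energy_ratio: float) -> float:
    """Return P_new / P_baseline, using E = P_avg * t.

    speedup      = t_baseline / t_new   (e.g. 9.17)
    energy_ratio = E_new / E_baseline   (e.g. 0.092)

    P_new / P_baseline = (E_new / t_new) / (E_baseline / t_baseline)
                       = energy_ratio * speedup
    """
    if speedup <= 0 or energy_ratio < 0:
        raise ValueError("speedup must be positive and energy_ratio non-negative")
    return energy_ratio * speedup


# Illustration only: assumes both figures describe the same run.
ratio = average_power_ratio(speedup=9.17, energy_ratio=0.092)
```

Under that assumption the ratio comes out a little below 1, meaning the energy saving would stem almost entirely from the shorter run time rather than from a lower average power draw.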

    Moving Multimedia Simulations into the Cloud: a Cost-Effective Solution

    Researchers often demand bursts of computing power to quickly obtain the results of certain simulation activities. Multimedia communication simulations usually belong to this category. Depending on the complexity of the scenario, they may require several days on a generic PC to test a comprehensive set of conditions. This paper proposes using a cloud computing framework to accelerate these simulations and, consequently, research activities, while at the same time reducing the overall costs. A practical simulation example is shown, representative of a typical simulation of H.264/AVC video communications over a wireless channel. This work shows that, by means of a commercial cloud computing provider, the gains of the proposed technique compared to more traditional solutions using dedicated computers can be significant in terms of speed and cost reduction.