32 research outputs found

    The maximal utilization of processor co-allocation in multicluster systems

    In systems consisting of multiple clusters of processors which employ space sharing for scheduling jobs, such as our Distributed ASCI Supercomputer (DAS), co-allocation, i.e., the simultaneous allocation of processors to single jobs in multiple clusters, may be required. In studies of scheduling in single clusters it has been shown that the achievable (maximal) utilization may be much less than 100%, a problem that may be aggravated in multicluster systems. In this paper we study the maximal utilization when co-allocating jobs in multicluster systems, both with analytic means (we derive exact and approximate formulas when the service-time distribution is exponential), and with simulations with synthetic workloads and with workloads derived from the logs of actual systems.

    Communication-aware job placement policies for the KOALA grid scheduler

    In multicluster systems, and more generally, in grids, parallel applications may require co-allocation, i.e., the simultaneous allocation of resources such as processors in multiple clusters. Although co-allocation enables the allocation of more processors than available on a single cluster, depending on the applications' communication characteristics, it has the potential disadvantage of increased execution times due to relatively slow wide-area communication. In this paper, we present two job placement policies, the Cluster Minimization and the Flexible Cluster Minimization policies, which take into account the wide-area communication overhead when co-allocating applications across the clusters. We have implemented these policies in our grid scheduler called KOALA in order to serve different job request types. To assess the performance of the policies, we perform experiments in a real multicluster testbed using communication-intensive parallel applications.

    Context-Aware Kubernetes Scheduler for Edge-native Applications on 5G

    This paper is an extension of work originally presented in SoftCOM 2019 [1]. The novelty of this work resides in its focused improvement of our scheduling algorithm towards its use on a real 5G infrastructure. Industrial IoT (IIoT) applications are often designed to run in a distributed way on devices and controller computers, with strict service requirements for the nodes and the links between them. 5G, especially in combination with Edge Computing, will provide the desired level of connectivity for these setups and will make it possible to host application run-time components in edge clouds. However, the allocation of edge cloud resources for IIoT applications is still commonly solved by rudimentary scheduling techniques (i.e., simple strategies based on CPU usage and device readiness that use very little dynamic information). Orchestrators inherited from cloud computing, like Kubernetes, do not satisfy the requirements of these applications and are not optimized for the diversity of devices, which are often also limited in capacity. This design is especially slow in reacting to environmental changes. In such circumstances, in order to provide a proper solution using these tools, we propose to take the physical, operational and network parameters (thus the full context of the IIoT application) into consideration, along with the software states, and to orchestrate the applications dynamically.

    Computer Science and Technology Series : XV Argentine Congress of Computer Science. Selected papers

    CACIC'09 was the fifteenth Congress in the CACIC series. It was organized by the School of Engineering of the National University of Jujuy. The Congress included 9 Workshops with 130 accepted papers, 1 main Conference, 4 invited tutorials, different meetings related to Computer Science Education (Professors, PhD students, Curricula) and an International School with 5 courses. CACIC 2009 was organized following the traditional Congress format, with 9 Workshops covering a diversity of dimensions of Computer Science Research. Each topic was supervised by a committee of three chairs from different universities. The call for papers attracted a total of 267 submissions. An average of 2.7 review reports were collected for each paper, for a grand total of 720 review reports that involved about 300 different reviewers. A total of 130 full papers were accepted and 20 of them were selected for this book. Red de Universidades con Carreras en Informática (RedUNCI)

    Streamroller : A Unified Compilation and Synthesis System for Streaming Applications.

    The growing complexity of applications has increased the need for higher processing power. In the embedded domain, the convergence of audio, video, and networking on a handheld device has prompted the need for low cost, low power, and high performance implementations of these applications in the form of custom hardware. In a more mainstream domain like gaming consoles, the move towards more realism in physics simulations and graphics has forced the industry towards multicore systems. Many of the applications in these domains are streaming in nature. The key challenge is to get efficient implementations of custom hardware from these applications and map these applications efficiently onto multicore architectures. This dissertation presents a unified methodology, referred to as Streamroller, that can be applied for the problem of scheduling stream programs to multicore architectures and to the problem of automatic synthesis of custom hardware for stream applications. Firstly, a method called stream-graph modulo scheduling is presented, which maps stream programs effectively onto a multicore architecture. Many aspects of a real system, like limited memory and explicit DMAs are modeled in the scheduler. The scheduler is evaluated for a set of stream programs on IBM's Cell processor. Secondly, an automated high-level synthesis system for creating custom hardware for stream applications is presented. The template for the custom hardware is a pipeline of accelerators. The synthesis involves designing loop accelerators for individual kernels, instantiating buffers to store data passed between kernels, and linking these building blocks to form a pipeline. A unique aspect of this system is the use of multifunction accelerators, which improves cost by efficiently sharing hardware between multiple kernels. Finally, a method to improve the integer linear program formulations used in the schedulers that exploits symmetry in the solution space is presented. Symmetry-breaking constraints are added to the formulation, and the performance of the solver is evaluated.
    Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/61662/1/kvman_1.pd

    Enabling 5G Edge Native Applications

    On Improving The Performance And Resource Utilization of Consolidated Virtual Machines: Measurement, Modeling, Analysis, and Prediction

    This dissertation addresses the performance-related issues of consolidated Virtual Machines (VMs). Virtualization is an important technology for the Cloud and data centers. Essential features of a data center, such as fault tolerance, high availability, and the pay-as-you-go model of services, are implemented with the help of VMs. The Cloud has become one of the significant innovations of the past decade. Research is ongoing into deploying a newer and more diverse set of applications on the Cloud, such as High-Performance Computing (HPC) and parallel applications. The primary method of increasing server resource utilization is VM consolidation: running as many VMs as possible on a server is the key to improving resource utilization. On the other hand, consolidating too many VMs on a server can degrade the performance of all VMs. Therefore, it is necessary to measure, analyze, and find ways to predict the performance variation of consolidated VMs. This dissertation investigates the causes of performance variation of consolidated VMs, the relationship between resource contention and consolidation performance, and ways to predict the performance variation. Experiments have been conducted on real virtualized servers without using any simulation; all the results presented here are real system data. A methodology for running experiments with a large number of tasks and VMs is introduced, called the Incremental Consolidation Benchmarking Method (ICBM). The experiments have been done with different types of resource-intensive tasks, parallel workflows, and VMs. Furthermore, to experiment with a large number of VMs and collect the data, a scheduling framework is also designed and implemented. Experimental results are presented to demonstrate the efficiency of the ICBM and the framework.

    Analyse et optimisation des réseaux avioniques hétérogènes (Analysis and optimization of heterogeneous avionics networks)

    The complexity of avionics communication architectures keeps growing as the number of interconnected end systems increases and the volume of exchanged data expands. To meet emerging needs in terms of bandwidth, latency, and modularity, the current avionics communication architecture uses the AFDX (Avionics Full DupleX Switched Ethernet) network to connect the computers, and input/output buses (for example the CAN (Controller Area Network) bus) to connect the sensors and actuators. The resulting networks are connected through specific interconnection devices called RDCs (Remote Data Concentrators), standardized under ARINC655. RDCs are modular communication gateways distributed throughout the aircraft to handle the heterogeneity between the AFDX backbone network and the input/output buses. RDCs do improve the modularity of the avionics system and reduce its maintenance cost; however, these devices have become one of the major challenges in designing the avionics architecture so as to guarantee the required system performance. Existing RDC implementations often perform a direct frame translation and implement no resource management mechanism. Yet efficient use of resources is an important need in the avionics context, to ease system evolution and the addition of new functions. The objective of this thesis is therefore the design and validation of an optimized RDC implementing resource management mechanisms, in order to improve the performance of the avionics communication architecture while respecting the system's timing constraints.
To achieve this objective, an RDC for CAN-AFDX network architectures is designed, integrating the following functions: (i) frame packing applied to upstream flows, i.e., flows generated by sensors and destined for the AFDX, to minimize the communication cost on the AFDX; (ii) regulation of downstream flows, i.e., flows generated by AFDX end systems and destined for actuators, to reduce contention on the CAN bus. In addition, our RDC can connect several CAN buses at once while guaranteeing isolation between flows. Next, to analyze the impact of this new RDC on the performance of the avionics system, we model the CAN-AFDX architecture, and in particular the RDC and its new functions. We then introduce a timing analysis method to compute upper bounds on end-to-end delays and verify that the real-time constraints are met. Several RDC configurations can satisfy the requirements of the avionics system while saving resources. We therefore tune the RDC parameters to minimize bandwidth consumption on the AFDX while respecting the timing constraints. This optimization problem is considered NP-complete, and introducing suitable heuristics proved necessary to find the best possible RDC configuration. Finally, the performance of this new RDC is validated on a realistic CAN-AFDX architecture with several CAN buses and hundreds of exchanged flows. Different CAN bus utilization levels were considered, and the results showed the effectiveness of our RDC in improving resource management in the avionics system while respecting the communication timing constraints.
In particular, our RDC reduces AFDX bandwidth by up to 40% compared with the RDC currently in use. ABSTRACT: The aim of my thesis is to provide a resource-efficient gateway to connect Input/Output (I/O) CAN buses to a backbone network based on AFDX technology, in modern avionics communication architectures. Currently, the Remote Data Concentrator (RDC) is the main standard for gateways in avionics, and the existing implementations do not integrate any resource management mechanism. To handle these limitations, we design an enhanced CAN-AFDX RDC integrating new functions: (i) Frame Packing (FP), which reduces communication overheads relative to the currently used "1 to 1" frame conversion strategy; (ii) Hierarchical Traffic Shaping (HTS) to reduce contention on the CAN bus. Furthermore, our proposed RDC allows the connection of multiple I/O CAN buses to AFDX while guaranteeing isolation between different criticality levels, using a software partitioning mechanism. To analyze the performance guarantees offered by our proposed RDC, we considered two metrics: the end-to-end latency and the induced AFDX bandwidth consumption. Furthermore, an optimization process was proposed to achieve an optimal configuration of our proposed RDC, i.e., minimizing the bandwidth utilization while meeting the real-time constraints of communication. Finally, the capacity of our proposed RDC to meet the emerging avionics requirements has been validated through a realistic avionics case study.