
    Topology-aware GPU scheduling for learning workloads in cloud environments

    Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and the Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments. This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems. The placement strategy is evaluated with a prototype on a Power8 machine with Tesla P100 cards, showing speedups of up to ≈1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this result by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems. This project is supported by the IBM/BSC Technology Center for Supercomputing collaboration agreement. It has also received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 639595). It is also partially supported by the Ministry of Economy of Spain under contract TIN2015-65316-P and Generalitat de Catalunya under contract 2014SGR1051, by the ICREA Academia program, and by the BSC-CNS Severo Ochoa program (SEV-2015-0493). We thank our IBM Research colleagues Alaa Youssef and Asser Tantawi for the valuable discussions. We also thank SC17 committee member Blair Bethwaite of Monash University for his constructive feedback on earlier drafts of this paper.
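
    The topology-aware placement idea can be sketched as follows. This is an illustrative toy, not the paper's algorithm: the GPU link bandwidths and the bottleneck-bandwidth scoring rule are assumptions made for the example.

```python
# Illustrative sketch (not the paper's algorithm): score candidate GPU sets
# by the weakest interconnect link between any pair, then pick the set with
# the highest bottleneck bandwidth for a multi-GPU job.
from itertools import combinations

# Hypothetical pairwise link bandwidths (GB/s): NVLink pairs vs. slower hops.
BANDWIDTH = {
    (0, 1): 80, (2, 3): 80,                          # NVLink-connected pairs
    (0, 2): 16, (0, 3): 16, (1, 2): 16, (1, 3): 16,  # cross-socket links
}

def link_bw(a, b):
    return BANDWIDTH[tuple(sorted((a, b)))]

def place_job(free_gpus, num_gpus):
    """Pick the GPU set whose slowest internal link is fastest."""
    best = max(
        combinations(sorted(free_gpus), num_gpus),
        key=lambda gpus: min(link_bw(a, b) for a, b in combinations(gpus, 2)),
    )
    return list(best)

print(place_job({0, 1, 2, 3}, 2))  # -> [0, 1] (an NVLink pair beats a slow hop)
```

    A real scheduler would additionally weigh CPU affinity and interference from co-located jobs, as the abstract notes.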

    Tree-Encoded Bitmaps

    We propose a novel method to represent compressed bitmaps. Similarly to existing bitmap compression schemes, we exploit the compression potential of bitmaps populated with consecutive identical bits, i.e., 0-runs and 1-runs. But in contrast to prior work, our approach employs a binary tree structure to represent runs of various lengths. Leaf nodes in the upper tree levels thereby represent longer runs, and vice versa. The tree-based representation results in high compression ratios and enables efficient random access, which in turn allows for the fast intersection of bitmaps. Our experimental analysis with randomly generated bitmaps shows that our approach significantly improves over state-of-the-art compression techniques when bitmaps are dense and/or only barely clustered. Further, we evaluate our approach with real-world data sets, showing that our tree-encoded bitmaps can save up to one third of the space over existing techniques.
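
    A minimal sketch of the tree idea, not the paper's exact encoding: the bitmap is split recursively, and any range that is a single run (all 0s or all 1s) becomes a leaf, so longer runs end up as leaves higher in the tree and random access is a simple descent.

```python
# Toy tree-encoded bitmap: uniform ranges collapse to a leaf (their bit value),
# mixed ranges become an inner node with two children.

def build(bits, lo, hi):
    chunk = bits[lo:hi]
    if all(b == chunk[0] for b in chunk):        # uniform run -> leaf
        return chunk[0]
    mid = (lo + hi) // 2
    return (build(bits, lo, mid), build(bits, mid, hi))  # inner node

def access(node, lo, hi, i):
    """Random access to bit i by descending the tree."""
    while isinstance(node, tuple):
        mid = (lo + hi) // 2
        if i < mid:
            node, hi = node[0], mid
        else:
            node, lo = node[1], mid
    return node

bits = [1, 1, 1, 1, 0, 0, 1, 0]
tree = build(bits, 0, len(bits))
print([access(tree, 0, len(bits), i) for i in range(8)])  # round-trips the bitmap
```

    In this toy, the 4-bit run `1111` is a single leaf at depth one, while the isolated bits near the end sit deeper; the paper's actual encoding serializes such a tree compactly rather than using Python tuples.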

    Transactional and analytical data management on persistent memory

    The increasing number of smart devices and sensors, but also social media, are causing the volume of data, and thus the demanded processing speed, to grow steadily. At the same time, many applications need to store data persistently or even comply with strict transactional guarantees. The novel storage technology Persistent Memory (PMem), with its unique properties, seems to be a natural candidate to meet these requirements efficiently. Compared to DRAM, it is more scalable, less expensive, and durable. In contrast to disks, it is significantly faster and directly addressable. Therefore, this dissertation investigates the deliberate employment of PMem to fit the needs of modern applications. After presenting the fundamentals of working with PMem, we focus primarily on three aspects of data management. First, we disassemble several persistent data and index structures into their underlying design primitives to reveal the trade-offs for various access patterns. This allows us to identify their best use cases and vulnerabilities, but also to gain general insights into the design of PMem-based data structures. Second, we propose two storage layouts that target analytical workloads and enable efficient query execution on arbitrary attributes. While the first approach employs a linked list of multi-dimensional clustered blocks that potentially span several storage layers, the second approach is a multi-dimensional index that caches nodes in DRAM. Third, we show how to improve stream and event processing systems involving transactional state management using the preceding data structures and insights. In this context, we propose a novel Transactional Stream Processing (TSP) model with appropriate consistency and concurrency protocols adapted to PMem. Together, the discussed aspects are intended to provide a foundation for developing even more sophisticated PMem-enabled systems. At the same time, they show how data management tasks can take advantage of PMem by opening up new application domains, improving performance, scalability, and recovery guarantees, simplifying code complexity, and reducing economic and environmental costs.
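
    The first storage layout's core idea, a linked list of multi-dimensional clustered blocks, can be sketched like this. The block size, the per-attribute min/max bounds, and the pruning rule are illustrative assumptions, not the dissertation's actual design.

```python
# Toy linked list of clustered blocks: each block keeps min/max bounds per
# attribute, so a range scan on any attribute can skip non-matching blocks.

class Block:
    def __init__(self, rows):
        self.rows = rows                              # tuples of attribute values
        self.bounds = [(min(c), max(c)) for c in zip(*rows)]
        self.next = None

def build_chain(rows, block_size=2):
    head = prev = None
    for i in range(0, len(rows), block_size):
        blk = Block(rows[i:i + block_size])
        if prev is None:
            head = blk
        else:
            prev.next = blk
        prev = blk
    return head

def scan(head, attr, lo, hi):
    """Range query on any attribute: skip blocks whose bounds cannot match."""
    out, blk = [], head
    while blk is not None:
        bmin, bmax = blk.bounds[attr]
        if not (bmax < lo or bmin > hi):              # block may contain matches
            out += [r for r in blk.rows if lo <= r[attr] <= hi]
        blk = blk.next
    return out

head = build_chain([(1, 10), (2, 20), (8, 30), (9, 40)])
print(scan(head, 0, 8, 9))  # -> [(8, 30), (9, 40)]
```

    On PMem, the point of such a layout is that block bounds can be checked without dereferencing the block's rows, limiting the comparatively expensive persistent-memory reads to blocks that can actually match.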

    Parallel and Distributed Processing in the Context of Fog Computing: High Throughput Pattern Matching and Distributed Monitoring

    With the introduction of the Internet of Things (IoT), physical objects now have cyber counterparts that create and communicate data. Extracting valuable information from that data requires timely and accurate processing, which calls for more efficient, distributed approaches. In order to address this challenge, the fog computing approach has been suggested as an extension to cloud processing. Fog builds on the opportunity to distribute computation to a wider range of possible platforms: data processing can happen at high-end servers in the cloud, at intermediate nodes where the data is aggregated, as well as at the resource-constrained devices that produce the data in the first place. In this work, we focus on efficient utilization of the diverse hardware resources found in the fog and identify and address challenges in computation and communication. To this end, we target two applications that are representative examples of the processing involved across a wide spectrum of computing platforms. First, we address the need for high-throughput processing of the increasing network traffic produced by IoT networks. Specifically, we target the processing involved in security applications and develop a new, data-parallel algorithm for pattern matching at high rates. We target the vectorization capabilities found in modern, high-end architectures and show how cache locality and data parallelism can achieve up to three times higher processing throughput than the state of the art. Second, we focus on the processing involved close to the sources of data. We target the problem of continuously monitoring sensor streams, a basic building block for many IoT applications. We show how distributed and communication-efficient monitoring algorithms can fit on real IoT devices and give insights into their behavior in conjunction with the underlying network stack.
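
    The thesis develops a SIMD-vectorized matcher; as a simpler stand-in for the data-parallel idea, here is the classic bit-parallel Shift-And algorithm, which advances all partial matches of a pattern in one machine-word operation per input byte. It is not the thesis's algorithm, only an illustration of bit-level data parallelism.

```python
# Shift-And: bit i of `state` means "the first i+1 pattern chars match ending
# here"; one shift+AND per input character updates every prefix in parallel.

def shift_and(text, pattern):
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    accept = 1 << (len(pattern) - 1)
    state, hits = 0, []
    for pos, ch in enumerate(text):
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(pos - len(pattern) + 1)      # start index of a match
    return hits

print(shift_and("abracadabra", "abra"))  # -> [0, 7]
```

    The same principle, carried from a 64-bit word to 256- or 512-bit SIMD registers, is what makes vectorized matchers attractive for high-rate network traffic.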

    Topics in Power Usage in Network Services

    The rapid advance of computing technology has created a world powered by millions of computers. Often these computers are idly consuming energy unnecessarily in spite of all the efforts of hardware manufacturers. This thesis examines proposals to determine when to power down computers without negatively impacting the service they are used to deliver, compares and contrasts the efficiency of virtualisation with containerisation, and investigates the energy efficiency of the popular cryptocurrency Bitcoin. We begin by examining the current corpus of literature and defining the key terms we need to proceed. Then we propose a technique for improving the energy consumption of servers by moving them into a sleep state and employing a low-powered device to act as a proxy in their place. After this we move on to investigate the energy efficiency of virtualisation and compare the energy efficiency of two of the most common means used to achieve it. Moving on from this, we look at the cryptocurrency Bitcoin. We consider the energy consumption of bitcoin mining and whether, compared with the value of bitcoin, mining is profitable. Finally, we conclude by summarising the results and findings of this thesis. This work increases our understanding of some of the challenges of energy-efficient computation as well as proposing novel mechanisms to save energy.
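
    The mining-profitability question reduces to simple arithmetic: revenue from expected block rewards versus the cost of the electricity consumed. All figures below (hash rate, power draw, reward rate, prices) are illustrative assumptions, not measurements from the thesis.

```python
# Hedged sketch: daily mining profit = reward revenue minus electricity cost.

def daily_profit(hashrate_ths, power_w, btc_per_th_day, btc_price_usd,
                 electricity_usd_per_kwh):
    revenue = hashrate_ths * btc_per_th_day * btc_price_usd
    energy_kwh = power_w / 1000 * 24              # watts -> kWh per day
    cost = energy_kwh * electricity_usd_per_kwh
    return revenue - cost

# A hypothetical ASIC: 100 TH/s drawing 3250 W, at assumed market prices.
profit = daily_profit(100, 3250, 2e-6, 30000, 0.15)
print(round(profit, 2))  # negative under these assumptions: mining at a loss
```

    The sign of the result flips with the electricity price and the bitcoin price, which is exactly the sensitivity the thesis's profitability analysis has to examine.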

    Security Management for Cyberspace - From Smart Monitoring to Automatic Configuration

    The Internet has become a great integration platform capable of efficiently interconnecting billions of entities, from simple sensors to large data centers. This platform provides access to multiple hardware and virtualized resources (servers, networking, storage, applications, connected objects) ranging from cloud computing to Internet-of-Things infrastructures. From these resources, which may be hosted and distributed amongst different providers and tenants, the building and operation of complex and value-added networked systems is enabled. These systems are, however, exposed to a large variety of security attacks that are also gaining in sophistication and coordination. In that context, the objective of my research work is to support security management for the cyberspace with the elaboration of new monitoring and configuration solutions for these systems. A first axis of this work has focused on the investigation of smart monitoring methods capable of coping with low-resource networks. In particular, we have proposed a lightweight monitoring architecture for detecting security attacks in low-power and lossy networks, by exploiting different features provided by a routing protocol specifically developed for them. A second axis has concerned the assessment and remediation of vulnerabilities that may occur when changes are operated on system configurations. Using standardized vulnerability descriptions, we have designed and implemented dedicated strategies for improving the coverage and efficiency of vulnerability assessment activities based on versioning and probabilistic techniques, and for preventing the occurrence of new configuration vulnerabilities during remediation operations. A third axis has been dedicated to the automated configuration of virtualized resources to support security management. In particular, we have introduced a software-defined security approach for configuring cloud infrastructures, and have analyzed to what extent programmability facilities can contribute to their protection at the earliest stage, through the dynamic generation of specialized system images that are characterized by low attack surfaces. Complementarily, we have worked on building and verification techniques for supporting the orchestration of security chains, which are composed of virtualized network functions such as firewalls or intrusion detection systems. Finally, several research perspectives on security automation are pointed out with respect to ensemble methods, composite services, and verified artificial intelligence.

    Environmental sustainability in basic research:a perspective from HECAP+

    The climate crisis and the degradation of the world's ecosystems require humanity to take immediate action. The international scientific community has a responsibility to limit the negative environmental impacts of basic research. The HECAP+ communities (High Energy Physics, Cosmology, Astroparticle Physics, and Hadron and Nuclear Physics) make use of common and similar experimental infrastructure, such as accelerators and observatories, and rely similarly on the processing of big data. Our communities therefore face similar challenges in improving the sustainability of our research. This document aims to reflect on the environmental impacts of our work practices and research infrastructure, to highlight best practice, to make recommendations for positive changes, and to identify the opportunities and challenges that such changes present for wider aspects of social responsibility.

    High performance cloud computing on multicore computers

    The cloud has become a major computing platform, with virtualization being a key to allow applications to run and share the resources in the cloud. A wide spectrum of applications need to process large amounts of data at high speeds in the cloud, e.g., analyzing customer data to find out purchase behavior, processing location data to determine geographical trends, or mining social media data to assess brand sentiment. To achieve high performance, these applications create and use multiple threads running on multicore processors. However, existing virtualization technology cannot support the efficient execution of such applications on virtual machines, making them suffer poor and unstable performance in the cloud. Targeting multi-threaded applications, the dissertation analyzes and diagnoses their performance issues on virtual machines, and designs practical solutions to improve their performance. The dissertation makes the following contributions. First, the dissertation conducts extensive experiments with standard multicore applications in order to evaluate the performance overhead on virtualization systems and diagnose the contributing factors. Second, focusing on one main source of the performance overhead, excessive spinning, the dissertation designs and evaluates a holistic solution to effectively utilize the hardware virtualization support in processors to reduce excessive spinning at low cost. Third, focusing on application scalability, which is the most important performance feature for multi-threaded applications, the dissertation models application scalability in virtual machines and analyzes how application scalability changes with virtualization and resource sharing. Based on the modeling and analysis, the dissertation identifies key application features and system factors that have impacts on application scalability, and reveals possible approaches for improving scalability. Fourth, the dissertation explores one approach to improving application scalability by fully utilizing the virtual resources of each virtual machine. The general idea is to match the workload distribution among the virtual CPUs in a virtual machine to the virtual CPU resources allocated by the virtual machine manager.
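
    The effect the scalability modeling captures can be sketched with a plain Amdahl's-law speedup plus an assumed per-vCPU contention penalty; this formula is an illustrative stand-in, not the dissertation's actual model.

```python
# Toy scalability model: each vCPU only receives (1 - steal_frac) of a physical
# core when co-located VMs compete, which caps the parallel speedup.

def speedup(n_vcpus, serial_frac, steal_frac=0.0):
    """Amdahl speedup with a uniform CPU-time 'steal' penalty per vCPU."""
    effective = n_vcpus * (1.0 - steal_frac)
    return 1.0 / (serial_frac + (1.0 - serial_frac) / effective)

# With 10% serial work, CPU time stolen by co-located VMs caps scalability:
for steal in (0.0, 0.25, 0.5):
    print(steal, round(speedup(8, 0.1, steal), 2))
```

    Even this toy model shows why matching the workload distribution to the resources each vCPU actually receives matters: adding vCPUs helps little once contention shrinks their effective share.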
