
    Topology-aware GPU scheduling for learning workloads in cloud environments

    Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and the Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments. This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems. The placement strategy is evaluated with a prototype on a Power8 machine with Tesla P100 cards, showing speedups of up to ≈1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this result by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems. This project is supported by the IBM/BSC Technology Center for Supercomputing collaboration agreement. It has also received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 639595). It is also partially supported by the Ministry of Economy of Spain under contract TIN2015-65316-P and Generalitat de Catalunya under contract 2014SGR1051, by the ICREA Academia program, and by the BSC-CNS Severo Ochoa program (SEV-2015-0493). We thank our IBM Research colleagues Alaa Youssef and Asser Tantawi for the valuable discussions. We also thank SC17 committee member Blair Bethwaite of Monash University for his constructive feedback on earlier drafts of this paper.
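
    The topology-aware placement idea can be sketched as follows. This is an illustrative toy, not the paper's algorithm: the GPU link bandwidths and the bottleneck-bandwidth scoring rule are assumptions made for the example.

```python
# Illustrative sketch (not the paper's algorithm): score candidate GPU sets
# by the weakest interconnect link between any pair, then pick the set with
# the highest bottleneck bandwidth for a multi-GPU job.
from itertools import combinations

# Hypothetical pairwise link bandwidths (GB/s): NVLink pairs vs. slower hops.
BANDWIDTH = {
    (0, 1): 80, (2, 3): 80,                          # NVLink-connected pairs
    (0, 2): 16, (0, 3): 16, (1, 2): 16, (1, 3): 16,  # cross-socket links
}

def link_bw(a, b):
    return BANDWIDTH[tuple(sorted((a, b)))]

def place_job(free_gpus, num_gpus):
    """Pick the GPU set whose slowest internal link is fastest."""
    best = max(
        combinations(sorted(free_gpus), num_gpus),
        key=lambda gpus: min(link_bw(a, b) for a, b in combinations(gpus, 2)),
    )
    return list(best)

print(place_job({0, 1, 2, 3}, 2))  # -> [0, 1] (an NVLink pair beats a slow hop)
```

    A real scheduler would additionally weigh CPU affinity and interference from co-located jobs, as the abstract notes.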

    Tree-Encoded Bitmaps

    We propose a novel method to represent compressed bitmaps. Similarly to existing bitmap compression schemes, we exploit the compression potential of bitmaps populated with consecutive identical bits, i.e., 0-runs and 1-runs. But in contrast to prior work, our approach employs a binary tree structure to represent runs of various lengths. Leaf nodes in the upper tree levels thereby represent longer runs, and vice versa. The tree-based representation results in high compression ratios and enables efficient random access, which in turn allows for the fast intersection of bitmaps. Our experimental analysis with randomly generated bitmaps shows that our approach significantly improves over state-of-the-art compression techniques when bitmaps are dense and/or only barely clustered. Further, we evaluate our approach with real-world data sets, showing that our tree-encoded bitmaps can save up to one third of the space over existing techniques.
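
    A minimal sketch of the tree idea, not the paper's exact encoding: the bitmap is split recursively, and any range that is a single run (all 0s or all 1s) becomes a leaf, so longer runs end up as leaves higher in the tree and random access is a simple descent.

```python
# Toy tree-encoded bitmap: uniform ranges collapse to a leaf (their bit value),
# mixed ranges become an inner node with two children.

def build(bits, lo, hi):
    chunk = bits[lo:hi]
    if all(b == chunk[0] for b in chunk):        # uniform run -> leaf
        return chunk[0]
    mid = (lo + hi) // 2
    return (build(bits, lo, mid), build(bits, mid, hi))  # inner node

def access(node, lo, hi, i):
    """Random access to bit i by descending the tree."""
    while isinstance(node, tuple):
        mid = (lo + hi) // 2
        if i < mid:
            node, hi = node[0], mid
        else:
            node, lo = node[1], mid
    return node

bits = [1, 1, 1, 1, 0, 0, 1, 0]
tree = build(bits, 0, len(bits))
print([access(tree, 0, len(bits), i) for i in range(8)])  # round-trips the bitmap
```

    In this toy, the 4-bit run `1111` is a single leaf at depth one, while the isolated bits near the end sit deeper; the paper's actual encoding serializes such a tree compactly rather than using Python tuples.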

    Transactional and analytical data management on persistent memory

    The increasing number of smart devices and sensors, but also social media, are causing the volume of data, and thus the demanded processing speed, to grow steadily. At the same time, many applications need to store data persistently or even comply with strict transactional guarantees. The novel storage technology Persistent Memory (PMem), with its unique properties, seems to be a natural candidate to meet these requirements efficiently. Compared to DRAM, it is more scalable, less expensive, and durable. In contrast to disks, it is significantly faster and directly addressable. Therefore, this dissertation investigates the deliberate employment of PMem to fit the needs of modern applications. After presenting the fundamentals of working with PMem, we focus primarily on three aspects of data management. First, we disassemble several persistent data and index structures into their underlying design primitives to reveal the trade-offs for various access patterns. This allows us to identify their best use cases and vulnerabilities, but also to gain general insights into the design of PMem-based data structures. Second, we propose two storage layouts that target analytical workloads and enable efficient query execution on arbitrary attributes. While the first approach employs a linked list of multi-dimensional clustered blocks that potentially span several storage layers, the second approach is a multi-dimensional index that caches nodes in DRAM. Third, we show how to improve stream and event processing systems involving transactional state management using the preceding data structures and insights. In this context, we propose a novel Transactional Stream Processing (TSP) model with appropriate consistency and concurrency protocols adapted to PMem. Together, the discussed aspects are intended to provide a foundation for developing even more sophisticated PMem-enabled systems. At the same time, they show how data management tasks can take advantage of PMem by opening up new application domains, improving performance, scalability, and recovery guarantees, simplifying code complexity, and reducing economic and environmental costs.
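
    The first storage layout's core idea, a linked list of multi-dimensional clustered blocks, can be sketched like this. The block size, the per-attribute min/max bounds, and the pruning rule are illustrative assumptions, not the dissertation's actual design.

```python
# Toy linked list of clustered blocks: each block keeps min/max bounds per
# attribute, so a range scan on any attribute can skip non-matching blocks.

class Block:
    def __init__(self, rows):
        self.rows = rows                              # tuples of attribute values
        self.bounds = [(min(c), max(c)) for c in zip(*rows)]
        self.next = None

def build_chain(rows, block_size=2):
    head = prev = None
    for i in range(0, len(rows), block_size):
        blk = Block(rows[i:i + block_size])
        if prev is None:
            head = blk
        else:
            prev.next = blk
        prev = blk
    return head

def scan(head, attr, lo, hi):
    """Range query on any attribute: skip blocks whose bounds cannot match."""
    out, blk = [], head
    while blk is not None:
        bmin, bmax = blk.bounds[attr]
        if not (bmax < lo or bmin > hi):              # block may contain matches
            out += [r for r in blk.rows if lo <= r[attr] <= hi]
        blk = blk.next
    return out

head = build_chain([(1, 10), (2, 20), (8, 30), (9, 40)])
print(scan(head, 0, 8, 9))  # -> [(8, 30), (9, 40)]
```

    On PMem, the point of such a layout is that block bounds can be checked without dereferencing the block's rows, limiting the comparatively expensive persistent-memory reads to blocks that can actually match.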

    Parallel and Distributed Processing in the Context of Fog Computing: High Throughput Pattern Matching and Distributed Monitoring

    With the introduction of the Internet of Things (IoT), physical objects now have cyber counterparts that create and communicate data. Extracting valuable information from that data requires timely and accurate processing, which calls for more efficient, distributed approaches. In order to address this challenge, the fog computing approach has been suggested as an extension to cloud processing. Fog builds on the opportunity to distribute computation to a wider range of possible platforms: data processing can happen at high-end servers in the cloud, at intermediate nodes where the data is aggregated, as well as at the resource-constrained devices that produce the data in the first place. In this work, we focus on efficient utilization of the diverse hardware resources found in the fog and identify and address challenges in computation and communication. To this end, we target two applications that are representative examples of the processing involved across a wide spectrum of computing platforms. First, we address the need for high-throughput processing of the increasing network traffic produced by IoT networks. Specifically, we target the processing involved in security applications and develop a new, data-parallel algorithm for pattern matching at high rates. We target the vectorization capabilities found in modern, high-end architectures and show how cache locality and data parallelism can achieve up to three times higher processing throughput than the state of the art. Second, we focus on the processing involved close to the sources of data. We target the problem of continuously monitoring sensor streams, a basic building block for many IoT applications. We show how distributed and communication-efficient monitoring algorithms can fit on real IoT devices and give insights into their behavior in conjunction with the underlying network stack.
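
    The thesis develops a SIMD-vectorized matcher; as a simpler stand-in for the data-parallel idea, here is the classic bit-parallel Shift-And algorithm, which advances all partial matches of a pattern in one machine-word operation per input byte. It is not the thesis's algorithm, only an illustration of bit-level data parallelism.

```python
# Shift-And: bit i of `state` means "the first i+1 pattern chars match ending
# here"; one shift+AND per input character updates every prefix in parallel.

def shift_and(text, pattern):
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    accept = 1 << (len(pattern) - 1)
    state, hits = 0, []
    for pos, ch in enumerate(text):
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(pos - len(pattern) + 1)      # start index of a match
    return hits

print(shift_and("abracadabra", "abra"))  # -> [0, 7]
```

    The same principle, carried from a 64-bit word to 256- or 512-bit SIMD registers, is what makes vectorized matchers attractive for high-rate network traffic.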

    Topics in Power Usage in Network Services

    The rapid advance of computing technology has created a world powered by millions of computers. Often these computers are idly consuming energy unnecessarily in spite of all the efforts of hardware manufacturers. This thesis examines proposals to determine when to power down computers without negatively impacting the service they are used to deliver, compares and contrasts the efficiency of virtualisation with containerisation, and investigates the energy efficiency of the popular cryptocurrency Bitcoin. We begin by examining the current corpus of literature and defining the key terms we need to proceed. Then we propose a technique for improving the energy consumption of servers by moving them into a sleep state and employing a low-powered device to act as a proxy in their place. After this we move on to investigate the energy efficiency of virtualisation and compare the energy efficiency of two of the most common means used to achieve it. Moving on from this, we look at the cryptocurrency Bitcoin. We consider the energy consumption of bitcoin mining and whether, compared with the value of bitcoin, mining is profitable. Finally, we conclude by summarising the results and findings of this thesis. This work increases our understanding of some of the challenges of energy-efficient computation as well as proposing novel mechanisms to save energy.
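
    The mining-profitability question reduces to simple arithmetic: revenue from expected block rewards versus the cost of the electricity consumed. All figures below (hash rate, power draw, reward rate, prices) are illustrative assumptions, not measurements from the thesis.

```python
# Hedged sketch: daily mining profit = reward revenue minus electricity cost.

def daily_profit(hashrate_ths, power_w, btc_per_th_day, btc_price_usd,
                 electricity_usd_per_kwh):
    revenue = hashrate_ths * btc_per_th_day * btc_price_usd
    energy_kwh = power_w / 1000 * 24              # watts -> kWh per day
    cost = energy_kwh * electricity_usd_per_kwh
    return revenue - cost

# A hypothetical ASIC: 100 TH/s drawing 3250 W, at assumed market prices.
profit = daily_profit(100, 3250, 2e-6, 30000, 0.15)
print(round(profit, 2))  # negative under these assumptions: mining at a loss
```

    The sign of the result flips with the electricity price and the bitcoin price, which is exactly the sensitivity the thesis's profitability analysis has to examine.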

    Security Management for Cyberspace - From Smart Monitoring to Automatic Configuration

    The Internet has become a great integration platform capable of efficiently interconnecting billions of entities, from simple sensors to large data centers. This platform provides access to multiple hardware and virtualized resources (servers, networking, storage, applications, connected objects) ranging from cloud computing to Internet-of-Things infrastructures. From these resources, which may be hosted and distributed amongst different providers and tenants, the building and operation of complex and value-added networked systems is enabled. These systems are, however, exposed to a large variety of security attacks that are also gaining in sophistication and coordination. In that context, the objective of my research work is to support security management for the cyberspace with the elaboration of new monitoring and configuration solutions for these systems. A first axis of this work has focused on the investigation of smart monitoring methods capable of coping with low-resource networks. In particular, we have proposed a lightweight monitoring architecture for detecting security attacks in low-power and lossy networks, by exploiting different features provided by a routing protocol specifically developed for them. A second axis has concerned the assessment and remediation of vulnerabilities that may occur when changes are operated on system configurations. Using standardized vulnerability descriptions, we have designed and implemented dedicated strategies for improving the coverage and efficiency of vulnerability assessment activities based on versioning and probabilistic techniques, and for preventing the occurrence of new configuration vulnerabilities during remediation operations. A third axis has been dedicated to the automated configuration of virtualized resources to support security management. In particular, we have introduced a software-defined security approach for configuring cloud infrastructures, and have analyzed to what extent programmability facilities can contribute to their protection at the earliest stage, through the dynamic generation of specialized system images that are characterized by low attack surfaces. Complementarily, we have worked on building and verification techniques for supporting the orchestration of security chains, which are composed of virtualized network functions such as firewalls or intrusion detection systems. Finally, several research perspectives on security automation are pointed out with respect to ensemble methods, composite services, and verified artificial intelligence.

    Environmental sustainability in basic research:a perspective from HECAP+

    The climate crisis and the degradation of the world's ecosystems require humanity to take immediate action. The international scientific community has a responsibility to limit the negative environmental impacts of basic research. The HECAP+ communities (High Energy Physics, Cosmology, Astroparticle Physics, and Hadron and Nuclear Physics) make use of common and similar experimental infrastructure, such as accelerators and observatories, and rely similarly on the processing of big data. Our communities therefore face similar challenges in improving the sustainability of our research. This document aims to reflect on the environmental impacts of our work practices and research infrastructure, to highlight best practice, to make recommendations for positive changes, and to identify the opportunities and challenges that such changes present for wider aspects of social responsibility.

    High performance cloud computing on multicore computers

    The cloud has become a major computing platform, with virtualization being a key to allow applications to run and share the resources in the cloud. A wide spectrum of applications need to process large amounts of data at high speeds in the cloud, e.g., analyzing customer data to find out purchase behavior, processing location data to determine geographical trends, or mining social media data to assess brand sentiment. To achieve high performance, these applications create and use multiple threads running on multicore processors. However, existing virtualization technology cannot support the efficient execution of such applications on virtual machines, making them suffer poor and unstable performance in the cloud. Targeting multi-threaded applications, the dissertation analyzes and diagnoses their performance issues on virtual machines, and designs practical solutions to improve their performance. The dissertation makes the following contributions. First, the dissertation conducts extensive experiments with standard multicore applications in order to evaluate the performance overhead on virtualization systems and diagnose the contributing factors. Second, focusing on one main source of the performance overhead, excessive spinning, the dissertation designs and evaluates a holistic solution to effectively utilize the hardware virtualization support in processors to reduce excessive spinning at low cost. Third, focusing on application scalability, which is the most important performance feature for multi-threaded applications, the dissertation models application scalability in virtual machines and analyzes how application scalability changes with virtualization and resource sharing. Based on the modeling and analysis, the dissertation identifies key application features and system factors that have impacts on application scalability, and reveals possible approaches for improving scalability. Fourth, the dissertation explores one approach to improving application scalability by fully utilizing the virtual resources of each virtual machine. The general idea is to match the workload distribution among the virtual CPUs in a virtual machine to the virtual CPU resources allocated by the virtual machine manager.
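
    The effect the scalability modeling captures can be sketched with a plain Amdahl's-law speedup plus an assumed per-vCPU contention penalty; this formula is an illustrative stand-in, not the dissertation's actual model.

```python
# Toy scalability model: each vCPU only receives (1 - steal_frac) of a physical
# core when co-located VMs compete, which caps the parallel speedup.

def speedup(n_vcpus, serial_frac, steal_frac=0.0):
    """Amdahl speedup with a uniform CPU-time 'steal' penalty per vCPU."""
    effective = n_vcpus * (1.0 - steal_frac)
    return 1.0 / (serial_frac + (1.0 - serial_frac) / effective)

# With 10% serial work, CPU time stolen by co-located VMs caps scalability:
for steal in (0.0, 0.25, 0.5):
    print(steal, round(speedup(8, 0.1, steal), 2))
```

    Even this toy model shows why matching the workload distribution to the resources each vCPU actually receives matters: adding vCPUs helps little once contention shrinks their effective share.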
