Search CORE

13 research outputs found

The Workflow Trace Archive: Open-Access Data from Public and Private Computing Infrastructures -- Technical Report

Author: Deelman Ewa
Hegeman Tim
Iosup Alexandru
Mathá Roland
Prodan Radu
Talluri Sacheendra
Versluis Laurens
Publication venue
Publication date: 11/07/2019
Field of study

Realistic, relevant, and reproducible experiments often need input traces collected from real-world environments. We focus in this work on traces of workflows---common in datacenters, clouds, and HPC infrastructures. We show that the state-of-the-art in using workflow-traces raises important issues: (1) the use of realistic traces is infrequent, and (2) the use of realistic, {\it open-access} traces even more so. Alleviating these issues, we introduce the Workflow Trace Archive (WTA), an open-access archive of workflow traces from diverse computing infrastructures and tooling to parse, validate, and analyze traces. The WTA includes

{>}48

million workflows captured from

{>}10

computing infrastructures, representing a broad diversity of trace domains and characteristics. To emphasize the importance of trace diversity, we characterize the WTA contents and analyze in simulation the impact of trace diversity on experiment results. Our results indicate significant differences in characteristics, properties, and workflow structures between workload sources, domains, and fields.Comment: Technical repor

arXiv.org e-Print Archive

VU Research Portal

DeepPlace: Learning to Place Applications in Multi-Tenant Clusters

Author: Dhake Neeraj
Mitra Subrata
Mondal Shanka Subhra
Nehra Ravinder
Sheoran Nikhil
Simha Ramanuja
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

Large multi-tenant production clusters often have to handle a variety of jobs and applications with a variety of complex resource usage characteristics. It is non-trivial and non-optimal to manually create placement rules for scheduling that would decide which applications should co-locate. In this paper, we present DeepPlace, a scheduler that learns to exploits various temporal resource usage patterns of applications using Deep Reinforcement Learning (Deep RL) to reduce resource competition across jobs running in the same machine while at the same time optimizing for overall cluster utilization.Comment: APSys 201

arXiv.org e-Print Archive

Crossref

In Datacenter Performance, The Only Constant Is Change

Author: Duplyakin Dmitry
Maricq Aleksander
Ricci Robert
Uta Alexandru
Publication venue
Publication date: 10/03/2020
Field of study

All computing infrastructure suffers from performance variability, be it bare-metal or virtualized. This phenomenon originates from many sources: some transient, such as noisy neighbors, and others more permanent but sudden, such as changes or wear in hardware, changes in the underlying hypervisor stack, or even undocumented interactions between the policies of the computing resource provider and the active workloads. Thus, performance measurements obtained on clouds, HPC facilities, and, more generally, datacenter environments are almost guaranteed to exhibit performance regimes that evolve over time, which leads to undesirable nonstationarities in application performance. In this paper, we present our analysis of performance of the bare-metal hardware available on the CloudLab testbed where we focus on quantifying the evolving performance regimes using changepoint detection. We describe our findings, backed by a dataset with nearly 6.9M benchmark results collected from over 1600 machines over a period of 2 years and 9 months. These findings yield a comprehensive characterization of real-world performance variability patterns in one computing facility, a methodology for studying such patterns on other infrastructures, and contribute to a better understanding of performance variability in general.Comment: To be presented at the 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid, http://cloudbus.org/ccgrid2020/) on May 11-14, 2020 in Melbourne, Victoria, Australi

arXiv.org e-Print Archive

VU Research Portal

Crossref

The workflow trace archive:Open-access data from public and private computing infrastructures

Author: Deelman Ewa
Hegeman Tim
Iosup Alexandru
Matha Roland
Prodan Radu
Talluri Sacheendra
Versluis Laurens
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2020
Field of study

Realistic, relevant, and reproducible experiments often need input traces collected from real-world environments. In this work, we focus on traces of workflows - common in datacenters, clouds, and HPC infrastructures. We show that the state-of-the-art in using workflow-traces raises important issues: (1) the use of realistic traces is infrequent and (2) the use of realistic, open-access traces even more so. Alleviating these issues, we introduce the Workflow Trace Archive (WTA), an open-access archive of workflow traces from diverse computing infrastructures and tooling to parse, validate, and analyze traces. The WTA includes {>}48>48 million workflows captured from {>}10>10 computing infrastructures, representing a broad diversity of trace domains and characteristics. To emphasize the importance of trace diversity, we characterize the WTA contents and analyze in simulation the impact of trace diversity on experiment results. Our results indicate significant differences in characteristics, properties, and workflow structures between workload sources, domains, and fields

VU Research Portal

Analysis of the 2018 Alibaba's cluster trace

Author: Κυρικλίδης Μάριος-Ερρίκος Κ.
Publication venue
Publication date: 01/01/2019
Field of study

University of Thessaly Institutional Repository

Traffic generation for benchmarking data centre networks

Author: Benjamin JL
Parsonson CWF
Zervas G
Publication venue: 'Elsevier BV'
Publication date: 01/11/2022
Field of study

Benchmarking is commonly used in research fields, such as computer architecture design and machine learning, as a powerful paradigm for rigorously assessing, comparing, and developing novel technologies. However, the data centre network (DCN) community lacks a standard open-access and reproducible traffic generation framework for benchmark workload generation. Driving factors behind this include the proprietary nature of traffic traces, the limited detail and quantity of open-access network-level data sets, the high cost of real world experimentation, and the poor reproducibility and fidelity of synthetically generated traffic. This is curtailing the community's understanding of existing systems and hindering the ability with which novel technologies, such as optical DCNs, can be developed, compared, and tested. We present TrafPy; an open-access framework for generating both realistic and custom DCN traffic traces. TrafPy is compatible with any simulation, emulation, or experimentation environment, and can be used for standardised benchmarking and for investigating the properties and limitations of network systems such as schedulers, switches, routers, and resource managers. We give an overview of the TrafPy traffic generation framework, and provide a brief demonstration of its efficacy through an investigation into the sensitivity of some canonical scheduling algorithms to varying traffic trace characteristics in the context of optical DCNs. TrafPy is open-sourced via GitHub and all data associated with this manuscript via RDR

UCL Discovery

Shadow: Exploiting the Power of Choice for Efficient Shuffling in MapReduce

Author: Chen Hanhua
Ibrahim Shadi
Jin Hai
Wu Sijie
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/09/2019
Field of study

International audienc

INRIA a CCSD electronic archive server