93 research outputs found

    3rd workshop on hot topics in cloud computing performance (HotCloudPerf'20): Performance variability

    The organizers of the Third Workshop on Hot Topics in Cloud Computing Performance (HotCloudPerf 2020) are delighted to welcome you to the workshop proceedings as part of the ICPE conference companion. The HotCloudPerf 2020 workshop is a full-day workshop on Tuesday, April 21, taking place jointly with WOSP-C as part of the ICPE conference week in Edmonton, Canada. Each year, the workshop chooses a focus theme to explore; for 2020, the theme is "Performance variability of cloud datacenters and the implications of such phenomena on application performance". Cloud computing is emerging as one of the most profound changes in the way we build and use IT. The use of global services in public clouds is increasing, and the lucrative and rapidly

    In Datacenter Performance, The Only Constant Is Change

    All computing infrastructure suffers from performance variability, be it bare-metal or virtualized. This phenomenon originates from many sources: some transient, such as noisy neighbors, and others more permanent but sudden, such as changes or wear in hardware, changes in the underlying hypervisor stack, or even undocumented interactions between the policies of the computing resource provider and the active workloads. Thus, performance measurements obtained on clouds, HPC facilities, and, more generally, datacenter environments are almost guaranteed to exhibit performance regimes that evolve over time, which leads to undesirable nonstationarities in application performance. In this paper, we present our analysis of performance of the bare-metal hardware available on the CloudLab testbed where we focus on quantifying the evolving performance regimes using changepoint detection. We describe our findings, backed by a dataset with nearly 6.9M benchmark results collected from over 1600 machines over a period of 2 years and 9 months. These findings yield a comprehensive characterization of real-world performance variability patterns in one computing facility, a methodology for studying such patterns on other infrastructures, and contribute to a better understanding of performance variability in general. Comment: To be presented at the 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid, http://cloudbus.org/ccgrid2020/) on May 11-14, 2020 in Melbourne, Victoria, Australia

    Log Parsing Evaluation in the Era of Modern Software Systems

    Due to the complexity and size of modern software systems, the amount of logs generated is tremendous. Hence, it is infeasible to manually investigate these data in a reasonable time, thereby requiring automating log analysis to derive insights about the functioning of the systems. Motivated by an industry use-case, we zoom in on one integral part of automated log analysis, log parsing, which is the prerequisite to deriving any insights from logs. Our investigation reveals problematic aspects within the log parsing field, particularly its inefficiency in handling heterogeneous real-world logs. We show this by assessing the 14 most-recognized log parsing approaches in the literature using (i) nine publicly available datasets, (ii) one dataset comprised of combined publicly available data, and (iii) one dataset generated within the infrastructure of a large bank. Subsequently, toward improving log parsing robustness in real-world production scenarios, we propose a tool, Logchimera, that enables estimating log parsing performance in industry contexts through generating synthetic log data that resemble industry logs. Our contributions serve as a foundation to consolidate past research efforts, facilitate future research advancements, and establish a strong link between research and industry log parsing.

    [Demo] Low-latency Spark queries on updatable data

    As data science gets deployed more and more into operational applications, it becomes important for data science frameworks to be able to perform computations in interactive, sub-second time. Indexing and caching are two key techniques that can make interactive query processing on large datasets possible. In this demo, we show the design, implementation and performance of a new indexing abstraction in Apache Spark, called the Indexed DataFrame. This is a cached DataFrame that incorporates an index to support fast lookup and join operations, and supports updates with multi-version concurrency. We demonstrate the Indexed DataFrame on a social network dataset using microbenchmarks and real-world graph processing queries, in datasets that are continuously growing.

    SenseLE: Exploiting spatial locality in decentralized sensing environments

    Generally, smart devices, such as smartphones, smartwatches, or fitness trackers, communicate with each other indirectly, via cloud data centers. Sharing sensor data with a cloud data center as intermediary invokes transmission methods with high battery costs, such as 4G LTE or WiFi. By sharing sensor information locally and without intermediaries, we can use other transmission methods with low energy cost, such as Bluetooth or BLE. In this paper, we introduce Sense Low Energy (SenseLE), a decentralized sensing framework which exploits the spatial locality of nearby sensors to save energy in Internet-of-Things (IoT) environments. We demonstrate the usability of SenseLE by building a real-life application for estimating waiting times at queues. Furthermore, we evaluate the performance and resource utilization of our SenseLE Android implementation for different sensing scenarios. Our empirical evaluation shows that by exploiting spatial locality, SenseLE is able to reduce application response times (latency) by up to 74% and energy consumption by up to 56%.