6 research outputs found

    Highlighting the Container Memory Consolidation Problems in Linux

    The container mechanism supports server consolidation; to ensure memory performance isolation, Linux relies on static memory limits. However, this results in poor performance, because an application's needs are dynamic. In this article we will show current problems with memory consolidation for containers in Linux

    Predicting Dynamic Memory Requirements for Scientific Workflow Tasks

    With the increasing amount of data available to scientists in disciplines as diverse as bioinformatics, physics, and remote sensing, scientific workflow systems are becoming increasingly important for composing and executing scalable data analysis pipelines. When writing such workflows, users need to specify the resources to be reserved for tasks so that sufficient resources are allocated on the target cluster infrastructure. Crucially, underestimating a task's memory requirements can result in task failures. Therefore, users often resort to overprovisioning, resulting in significant resource wastage and decreased throughput. In this paper, we propose a novel online method that uses monitoring time series data to predict task memory usage in order to reduce the memory wastage of scientific workflow tasks. Our method predicts a task's runtime, divides it into k equally-sized segments, and learns the peak memory value for each segment depending on the total file input size. We evaluate the prototype implementation of our method using workflows from the publicly available nf-core repository, showing an average memory wastage reduction of 29.48% compared to the best state-of-the-art approac
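
    The segment-wise prediction described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the use of a per-segment least-squares line and all function names (`segment_peaks`, `fit_line`, `train`, `predict`) are assumptions made for illustration, and the runtime-prediction step is omitted, so every trace is simply split into k parts by sample count.

```python
from statistics import mean

def segment_peaks(samples, k):
    """Split a memory time series into k near-equal segments and return each peak.

    Assumes len(samples) >= k so that no segment is empty.
    """
    n = len(samples)
    bounds = [i * n // k for i in range(k + 1)]
    return [max(samples[bounds[i]:bounds[i + 1]]) for i in range(k)]

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept (assumed model)."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return slope, my - slope * mx

def train(history, k):
    """history: list of (total_input_size, memory_samples) from past task runs."""
    sizes = [size for size, _ in history]
    peaks = [segment_peaks(samples, k) for _, samples in history]
    # One (slope, intercept) pair per segment, fitted on input size.
    return [fit_line(sizes, [p[i] for p in peaks]) for i in range(k)]

def predict(model, input_size):
    """Predicted peak memory for each of the k segments of a new task."""
    return [slope * input_size + intercept for slope, intercept in model]

if __name__ == "__main__":
    # Hypothetical monitoring data: (input size, memory samples over time).
    history = [(1.0, [1, 2, 3, 4]), (2.0, [2, 4, 6, 8]), (3.0, [3, 6, 9, 12])]
    model = train(history, k=2)
    print(predict(model, 4.0))
```

    Reserving a separate amount per segment, rather than one peak for the whole run, is what would let a scheduler hand back memory during the phases in which a task needs less.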

    Container Resource Allocation versus Performance of Data-intensive Applications on Different Cloud Servers

    In recent years, data-intensive applications have been increasingly deployed on cloud systems. Such applications utilize significant compute, memory, and I/O resources to process large volumes of data. Optimizing the performance and cost-efficiency for such applications is a non-trivial problem. The problem becomes even more challenging with the increasing use of containers, which are popular due to their lower operational overheads and faster boot speed at the cost of weaker resource assurances for the hosted applications. In this paper, two containerized data-intensive applications with very different performance objectives and resource needs were studied on cloud servers with Docker containers running on Intel Xeon E5 and AMD EPYC Rome multi-core processors with a range of CPU, memory, and I/O configurations. Primary findings from our experiments include: 1) allocating multiple cores to a compute-intensive application can improve performance, but only if the cores do not contend for the same caches, and the optimal core counts depend on the specific workload; 2) allocating more memory to a memory-intensive application than its deterministic data workload requires does not further improve performance; however, 3) having multiple such memory-intensive containers on the same server can lead to cache and memory bus contention, leading to significant and volatile performance degradation. The comparative observations on Intel and AMD servers provided insights into trade-offs between larger numbers of distributed chiplets interconnected with higher-speed buses (AMD) and larger numbers of centrally integrated cores and caches with lower-speed buses (Intel). For the two types of applications studied, the more distributed caches and faster data buses have benefited the deployment of larger numbers of containers

    MemOpLight: Leveraging application feedback to improve container memory consolidation

    The container mechanism amortizes costs by consolidating several servers onto the same machine, while keeping them mutually isolated. Specifically, to ensure performance isolation, Linux relies on memory limits. These limits are static, despite the fact that application needs are dynamic; this results in poor performance. To solve this issue, MemOpLight uses dynamic application feedback to rebalance physical memory allocation between containers, focusing on under-performing ones. This paper presents the issues, explains the design of MemOpLight, and validates it experimentally. Our approach increases total satisfaction by 13% compared to the default

    Modeling the Linux page cache for accurate simulation of data-intensive applications

    The emergence of Big Data in recent years has led to a growing need for data processing and an increasing number of data-intensive applications. Processing and storage of massive amounts of data require large-scale solutions, and thus data-intensive applications must be executed on infrastructures such as cloud or High Performance Computing (HPC) clusters. Although there are advancements in the hardware/software stack that enable larger computing platforms, some relevant challenges remain in resource management, performance, scheduling, scalability, etc. As a result, there is an increasing demand for optimizing and quantifying performance when executing data-intensive applications on those platforms. While infrastructures with sufficient computing power and storage capacity are available, the I/O performance on disks remains a bottleneck. To tackle this problem, apart from hardware improvements, the Linux page cache is an efficient architectural approach to reduce I/O overheads, but few experimental studies of its interactions with Big Data applications exist, partly due to limitations of real-world experiments. Simulation is a popular approach to address these issues; however, existing simulation frameworks do not simulate page caching fully, or even at all. As a result, simulation-based performance studies of data-intensive applications lead to inaccurate results. This thesis proposes an I/O simulation model that captures the key features of the Linux page cache. We have implemented this model as part of the WRENCH workflow simulation framework, which itself builds on the popular SimGrid distributed systems simulation framework. Our model and its implementation enable the simulation of both single-threaded and multithreaded applications, and of both writeback and writethrough caches for local or network-based filesystems. We evaluate the accuracy of our model in different conditions, including sequential and concurrent applications, as well as local and remote I/Os. The results show that our page cache model reduces the simulation error by up to an order of magnitude when compared to state-of-the-art, cacheless simulations