Search CORE

1,268 research outputs found

Workload Interleaving with Performance Guarantees in Data Centers

Author: Yan Feng
Publication venue: W&M ScholarWorks
Publication date: 01/10/2016
Field of study

In the era of global, large scale data centers residing in clouds, many applications and users share the same pool of resources for the purposes of reducing energy and operating costs, and of improving availability and reliability. Along with the above benefits, resource sharing also introduces performance challenges: when multiple workloads access the same resources concurrently, contention may occur and introduce delays in the performance of individual workloads. Providing performance isolation to individual workloads needs effective management methodologies. The challenges of deriving effective management methodologies lie in finding accurate, robust, compact metrics and models to drive algorithms that can meet different performance objectives while achieving efficient utilization of resources. This dissertation proposes a set of methodologies aiming at solving the challenging performance isolation problem in workload interleaving in data centers, focusing on both storage components and computing components. at the storage node level, we focus on methodologies for better interleaving user traffic with background workloads, such as tasks for improving reliability, availability, and power savings. More specifically, a scheduling policy for background workload based on the statistical characteristics of the system busy periods and a methodology that quantitatively estimates the performance impact of power savings are developed. at the storage cluster level, we consider methodologies on how to efficiently conduct work consolidation and schedule asynchronous updates without violating user performance targets. More specifically, we develop a framework that can estimate beforehand the benefits and overheads of each option in order to automate the process of reaching intelligent consolidation decisions while achieving faster eventual consistency. at the computing node level, we focus on improving workload interleaving at off-the-shelf servers as they are the basic building blocks of large-scale data centers. We develop priority scheduling middleware that employs different policies to schedule background tasks based on the instantaneous resource requirements of the high priority applications running on the server node. Finally, at the computing cluster level, we investigate popular computing frameworks for large-scale data intensive distributed processing, such as MapReduce and its Hadoop implementation. We develop a new Hadoop scheduler called DyScale to exploit capabilities offered by heterogeneous cores in order to achieve a variety of performance objectives

College of William & Mary: W&M Publish

Optimizing Virtual Machine I/O Performance in Virtualized Cloud by Differenciated-frequency Scheduling and Functionality Offloading

Author: Xu Cong
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2015
Field of study

Many enterprises are increasingly moving their applications to private cloud environments or public cloud platforms. A key technology driving cloud computing is virtualization which can serve multiple VMs in one physical machine hence providing better management flexibility and significant savings in operational costs. However, one important consequence of virtualized hosts in the cloud is the negative impact it has on the I/O performance of the applications running in the VMs

Purdue E-Pubs

Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis

Author: Ben-Nun Tal
Hoefler Torsten
Publication venue
Publication date: 15/09/2018
Field of study

Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this survey, we describe the problem from a theoretical perspective, followed by approaches for its parallelization. We present trends in DNN architectures and the resulting implications on parallelization strategies. We then review and model the different types of concurrency in DNNs: from the single operator, through parallelism in network inference and training, to distributed deep learning. We discuss asynchronous stochastic optimization, distributed system architectures, communication schemes, and neural architecture search. Based on those approaches, we extrapolate potential directions for parallelism in deep learning

arXiv.org e-Print Archive

Repository for Publications and Research Data

High Performance Computing using Infiniband-based clusters

Author: Hemmatpour Masoud
Publication venue: Politecnico di Torino
Publication date
Field of study

L'abstract è presente nell'allegato / the abstract is in the attachmen

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Scalable and Distributed Resource Management Protocols for Cloud and Big Data Clusters

Author: Khelghatdoust Mansour
Publication venue: Faculty of Engineering and Information Technologies, School of Information Technologies
Publication date: 22/08/2018
Field of study

Cloud data centers require an operating system to manage resources and satisfy operational requirements and management objectives. The growth of popularity in cloud services causes the appearance of a new spectrum of services with sophisticated workload and resource management requirements. Also, data centers are growing by addition of various type of hardware to accommodate the ever-increasing requests of users. Nowadays a large percentage of cloud resources are executing data-intensive applications which need continuously changing workload fluctuations and specific resource management. To this end, cluster computing frameworks are shifting towards distributed resource management for better scalability and faster decision making. Such systems benefit from the parallelization of control and are resilient to failures. Throughout this thesis we investigate algorithms, protocols and techniques to address these challenges in large-scale data centers. We introduce a distributed resource management framework which consolidates virtual machine to as few servers as possible to reduce the energy consumption of data center and hence decrease the cost of cloud providers. This framework can characterize the workload of virtual machines and hence handle trade-off energy consumption and Service Level Agreement (SLA) of customers efficiently. The algorithm is highly scalable and requires low maintenance cost with dynamic workloads and it tries to minimize virtual machines migration costs. We also introduce a scalable and distributed probe-based scheduling algorithm for Big data analytics frameworks. This algorithm can efficiently address the problem job heterogeneity in workloads that has appeared after increasing the level of parallelism in jobs. The algorithm is massively scalable and can reduce significantly average job completion times in comparison with the-state of-the-art. Finally, we propose a probabilistic fault-tolerance technique as part of the scheduling algorithm

Sydney eScholarship

Database and System Design for Emerging Storage Technologies

Author: Pelley Steven
Publication venue
Publication date
Field of study

Emerging storage technologies offer an alternative to disk that is durable and allows faster data access. Flash memory, made popular by mobile devices, provides block access with low latency random reads. New nonvolatile memories (NVRAM) are expected in upcoming years, presenting DRAM-like performance alongside persistent storage. Whereas both technologies accelerate data accesses due to increased raw speed, used merely as disk replacements they may fail to achieve their full potentials. Flash’s asymmetric read/write access (i.e., reads execute faster than writes opens new opportunities to optimize Flash-specific access. Similarly, NVRAM’s low latency persistent accesses allow new designs for high performance failure-resistant applications. This dissertation addresses software and hardware system design for such storage technologies. First, I investigate analytics query optimization for Flash, expecting Flash’s fast random access to require new query planning. While intuition suggests scan and join selection should shift between disk and Flash, I find that query plans chosen assuming disk are already near-optimal for Flash. Second, I examine new opportunities for durable, recoverable transaction processing with NVRAM. Existing disk-based recovery mechanisms impose large software overheads, yet updating data in-place requires frequent device synchronization that limits throughput. I introduce a new design, NVRAM Group Commit, to amortize synchronization delays over many transactions, increasing throughput at some cost to transaction latency. Finally, I propose a new framework for persistent programming and memory systems to enable high performance recoverable data structures with NVRAM, extending memory consistency with persistent semantics to introduce memory persistency.PhDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/107114/1/spelley_1.pd

Deep Blue Documents at the University of Michigan

Resource-Efficient Replication and Migration of Virtual Machines.

Author: Hou Kai-Yuan
Publication venue
Publication date: 01/01/2014
Field of study

Continuous replication and live migration of Virtual Machines (VMs) are two vital tools in a virtualized environment, but they are resource-expensive. Continuously replicating a VM's checkpointed state to a backup host maintains high-availability (HA) of the VM despite host failures, but checkpoint replication can generate significant network traffic. Each replicated VM also incurs a 100% memory overhead, since the backup unproductively reserves the same amount of memory to hold the redundant VM state. Live migration, though being widely used for load-balancing, power-saving, etc., can also generate excessive network traffic, by transferring VM state iteratively. In addition, it can incur a long completion time and degrade application performance. This thesis explores ways to replicate VMs for HA using resources efficiently, and to migrate VMs fast, with minimal execution disruption and using resources efficiently. First, we investigate the tradeoffs in using different compression methods to reduce the network traffic of checkpoint replication in a HA system. We evaluate gzip, delta and similarity compressions based on metrics that are specifically important in a HA system, and then suggest guidelines for their selection. Next, we propose HydraVM, a storage-based HA approach that eliminates the unproductive memory reservation made in backup hosts. HydraVM maintains a recent image of a protected VM in a shared storage by taking and consolidating incremental VM checkpoints. When a failure occurs, HydraVM quickly resumes the execution of a failed VM by loading a small amount of essential VM state from the storage. As the VM executes, the VM state not yet loaded is supplied on-demand. Finally, we propose application-assisted live migration, which skips transfer of VM memory that need not be migrated to execute running applications at the destination. We develop a generic framework for the proposed approach, and then use the framework to build JAVMM, a system that migrates VMs running Java applications skipping transfer of garbage in Java memory. Our evaluation results show that compared to Xen live migration, which is agnostic of running applications, JAVMM can reduce the completion time, network traffic and application downtime caused by Java VM migration, all by up to over 90%.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/111575/1/karenhou_1.pd

CiteSeerX

Deep Blue Documents at the University of Michigan

A survey and classification of software-defined storage systems

Author: Alysson Bessani
Angel Sebastian
Anwar Ali
Anwar Ali
Belaramani Nalini M.
Belay Adam
Carl
Cully Brendan
Frank
Ghodsi Ali
Gracia-Tinedo Raúl
Gulati Ajay
Gulati Ajay
Hat Red
Hsu Chin-Jung
Hunt Patrick
José Pereira
João Paulo
Kim Hyeong-Jun
Klimovic Ana
Koponen Teemu
Li Ning
Lumb Christopher R.
Mace Jonathan
Mesnier Michael
Murugan Muthukumar
Ongaro Diego
Peter Simon
Qian Yingjin
Raghavan Ajaykrishna
Ricardo Macedo
Riedel Erik
Schroeder Bianca
Schwan Philip
Seshadri Sudharsan
Sevilla Michael A.
Shan Yizhou
Shue David
Shue David
Soheil
Song Huaiming
Stefanovici Ioan
Weil Sage A.
Wires Jake
Yang Bin
Yang Suli
Zhang Xuechen
Zhu Timothy
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020
Field of study

The exponential growth of digital information is imposing increasing scale and efficiency demands on modern storage infrastructures. As infrastructure complexity increases, so does the difficulty in ensuring quality of service, maintainability, and resource fairness, raising unprecedented performance, scalability, and programmability challenges. Software-Defined Storage (SDS) addresses these challenges by cleanly disentangling control and data flows, easing management, and improving control functionality of conventional storage systems. Despite its momentum in the research community, many aspects of the paradigm are still unclear, undefined, and unexplored, leading to misunderstandings that hamper the research and development of novel SDS technologies. In this article, we present an in-depth study of SDS systems, providing a thorough description and categorization of each plane of functionality. Further, we propose a taxonomy and classification of existing SDS solutions according to different criteria. Finally, we provide key insights about the paradigm and discuss potential future research directions for the field.This work was financed by the Portuguese funding agency FCT-Fundacao para a Ciencia e a Tecnologia through national funds, the PhD grant SFRH/BD/146059/2019, the project ThreatAdapt (FCT-FNR/0002/2018), the LASIGE Research Unit (UIDB/00408/2020), and cofunded by the FEDER, where applicable

Universidade do Minho: RepositoriUM

Crossref

Energy-efficient Transitional Near-* Computing

Author: Graubner Pablo Karl
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2018
Field of study

Studies have shown that communication networks, devices accessing the Internet, and data centers account for 4.6% of the worldwide electricity consumption. Although data centers, core network equipment, and mobile devices are getting more energy-efficient, the amount of data that is being processed, transferred, and stored is vastly increasing. Recent computer paradigms, such as fog and edge computing, try to improve this situation by processing data near the user, the network, the devices, and the data itself. In this thesis, these trends are summarized under the new term near-* or near-everything computing. Furthermore, a novel paradigm designed to increase the energy efficiency of near-* computing is proposed: transitional computing. It transfers multi-mechanism transitions, a recently developed paradigm for a highly adaptable future Internet, from the field of communication systems to computing systems. Moreover, three types of novel transitions are introduced to achieve gains in energy efficiency in near-* environments, spanning from private Infrastructure-as-a-Service (IaaS) clouds, Software-defined Wireless Networks (SDWNs) at the edge of the network, Disruption-Tolerant Information-Centric Networks (DTN-ICNs) involving mobile devices, sensors, edge devices as well as programmable components on a mobile System-on-a-Chip (SoC). Finally, the novel idea of transitional near-* computing for emergency response applications is presented to assist rescuers and affected persons during an emergency event or a disaster, although connections to cloud services and social networks might be disturbed by network outages, and network bandwidth and battery power of mobile devices might be limited

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

Bridging the gap between dataplanes and commodity operating systems

Author: Prekas Georgios
Publication venue: Lausanne, EPFL
Publication date: 23/05/2018
Field of study

The conventional wisdom is that aggressive networking requirements, such as high packet rates for small messages and microsecond-scale tail latency, are best addressed outside the kernel, in a user-level networking stack. In particular, dataplanes borrow design elements from network middleboxes to run tasks to completion in tight loops. In its basic form, the dataplane design leverages sweeping simplifications such as the elimination of any resource management and any task scheduling to improve throughput and lower latency. As a result, dataplanes perform best when the request rate is predictable (since there is no resource management) and the service time of each task has a low execution time and a low dispersion. On the other hand, they exhibit poor energy proportionality and workload consolidation, and suffer from head-of-line blocking. This thesis proposes the introduction of resource management to dataplanes. Current dataplanes decrease latency by constantly polling for incoming network packets. This approach trades energy usage for latency. We argue that it is possible to introduce a control plane, which manages the resources in the most optimal way in terms of power usage without affecting the performance of the dataplane. Additionally, this thesis proposes the introduction of scheduling to dataplanes. Current designs operate in a strict FIFO and run-to-completion manner. This method is effective only when the incoming request requires a minimal amount of processing in the order of a few microseconds. When the processing time of requests is (a) longer or (b) follows a distribution with higher dispersion, the transient load imbalances and head-of-line blocking deteriorate the performance of the dataplane. We claim that it is possible to introduce a scheduler to dataplanes, which routes requests to the appropriate core and effectively reduce the tail latency of the system while at the same time support a wider range of workloads

Infoscience - École polytechnique fédérale de Lausanne