15 research outputs found

    Xar-Trek: Run-Time Execution Migration among FPGAs and Heterogeneous-ISA CPUs

    Get PDF
    Datacenter servers are increasingly heterogeneous: from x86 host CPUs, to ARM or RISC-V CPUs in NICs/SSDs, to FPGAs. Previous works have demonstrated that migrating application execution at run-time across heterogeneous-ISA CPUs can yield significant performance and energy gains, with relatively little programmer effort. However, FPGAs have often been overlooked in that context: hardware acceleration using FPGAs involves statically implementing select application functions, which prohibits dynamic and transparent migration. We present Xar-Trek, a new compiler and run-time software framework that overcomes this limitation. Xar-Trek compiles an application for several CPU ISAs and select application functions for acceleration on an FPGA, allowing execution migration between heterogeneous-ISA CPUs and FPGAs at run-time. Xar-Trek's run-time monitors server workloads and migrates application functions to an FPGA or to heterogeneous-ISA CPUs based on a scheduling policy. We develop a heuristic policy that uses application workload profiles to make scheduling decisions. Our evaluations conducted on a system with x86-64 server CPUs, ARM64 server CPUs, and an Alveo accelerator card reveal 88%-1% performance gains over no-migration baselines

    Underwater computing systems and astronomy–multi-disciplinary research potential and benefits

    Get PDF
    Abstract: The Ocean plays an important role in hosting investigations in underwater astronomy and enabling the realization of new research prospects. This paper discusses synergistic prospects of the blue economy from the perspective of underwater astronomy and scientific inquiry, technological and economic development. The presented research investigates how the synergy enhances computing applications. The paper presents the overloaded application paradigm that explores the ability of underwater telescope networks to accommodate additional applications. The investigated metrics for computing applications are the accessible computational resources, and power usage effectiveness (PUE) that is investigated in a scenario where onshore computing stations used in underwater astronomy observations are integrated with existing terrestrial data centers. This is necessary as onshore computing stations benefit from free cooling being located near natural maritime resources. The performance evaluation investigates how the proposed synergy and the emerging crowd–sourcing can enhance the observation resolution for underwater astronomy observations. Investigation shows that the synergy enhances accessible computational resources, PUE and observation resolution by an average of 48.8%, 1.6% and 41.3%, respectively

    Toward Energy Efficient Systems Design For Data Centers

    Get PDF
    Surge growth of numerous cloud services, Internet of Things, and edge computing promotes continuous increasing demand for data centers worldwide. Significant electricity consumption of data centers has tremendous implications on both operating and capital expense. The power infrastructure, along with the cooling system cost a multi-million or even billion dollar project to add new data center capacities. Given the high cost of large-scale data centers, it is important to fully utilize the capacity of data centers to reduce the Total Cost of Ownership. The data center is designed with a space budget and power budget. With the adoption of high-density rack designs, the capacity of a modern data center is usually limited by the power budget. So the core of the challenge is scaling up power infrastructure capacity. However, resizing the initial power capacity for an existing data center can be a task as difficult as building a new data center because of a non-scalable centralized power provisioning scheme. Thus, how to maximize the power utilization and optimize the performance per power budget is critical for data centers to deliver enough computation ability. To explore and attack the challenges of improving the power utilization, we have planned to work on different levels of data center, including server level, row level, and data center level. For server level, we take advantage of modern hardware to maximize power efficiency of each server. For rack level, we propose Pelican, a new power scheduling system for large-scale data centers with heterogeneous workloads. For row level, we present Ampere, a new approach to improve throughput per watt by provisioning extra servers. By combining these studies on different levels, we will provide comprehensive energy efficient system designs for data center

    Towards Power- and Energy-Efficient Datacenters

    Full text link
    As the Internet evolves, cloud computing is now a dominant form of computation in modern lives. Warehouse-scale computers (WSCs), or datacenters, comprising the foundation of this cloud-centric web have been able to deliver satisfactory performance to both the Internet companies and the customers. With the increased focus and popularity of the cloud, however, datacenter loads rise and grow rapidly, and Internet companies are in need of boosted computing capacity to serve such demand. Unfortunately, power and energy are often the major limiting factors prohibiting datacenter growth: it is often the case that no more servers can be added to datacenters without surpassing the capacity of the existing power infrastructure. This dissertation aims to investigate the issues of power and energy usage in a modern datacenter environment. We identify the source of power and energy inefficiency at three levels in a modern datacenter environment and provides insights and solutions to address each of these problems, aiming to prepare datacenters for critical future growth. We start at the datacenter-level and find that the peak provisioning and improper service placement in multi-level power delivery infrastructures fragment the power budget inside production datacenters, degrading the compute capacity the existing infrastructure can support. We find that the heterogeneity among datacenter workloads is key to address this issue and design systematic methods to reduce the fragmentation and improve the utilization of the power budget. This dissertation then narrow the focus to examine the energy usage of individual servers running cloud workloads. Especially, we examine the power management mechanisms employed in these servers and find that the coarse time granularity of these mechanisms is one critical factor that leads to excessive energy consumption. We propose an intelligent and low overhead solution on top of the emerging finer granularity voltage/frequency boosting circuit to effectively pinpoints and boosts queries that are likely to increase the tail distribution and can reap more benefit from the voltage/frequency boost, improving energy efficiency without sacrificing the quality of services. The final focus of this dissertation takes a further step to investigate how using a fundamentally more efficient computing substrate, field programmable gate arrays (FPGAs), benefit datacenter power and energy efficiency. Different from other types of hardware accelerations, FPGAs can be reconfigured on-the-fly to provide fine-grain control over hardware resource allocation and presents a unique set of challenges for optimal workload scheduling and resource allocation. We aim to design a set coordinated algorithms to manage these two key factors simultaneously and fully explore the benefit of deploying FPGAs in the highly varying cloud environment.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/144043/1/hsuch_1.pd

    Memory Systems and Interconnects for Scale-Out Servers

    Get PDF
    The information revolution of the last decade has been fueled by the digitization of almost all human activities through a wide range of Internet services. The backbone of this information age are scale-out datacenters that need to collect, store, and process massive amounts of data. These datacenters distribute vast datasets across a large number of servers, typically into memory-resident shards so as to maintain strict quality-of-service guarantees. While data is driving the skyrocketing demands for scale-out servers, processor and memory manufacturers have reached fundamental efficiency limits, no longer able to increase server energy efficiency at a sufficient pace. As a result, energy has emerged as the main obstacle to the scalability of information technology (IT) with huge economic implications. Delivering sustainable IT calls for a paradigm shift in computer system design. As memory has taken a central role in IT infrastructure, memory-centric architectures are required to fully utilize the IT's costly memory investment. In response, processor architects are resorting to manycore architectures to leverage the abundant request-level parallelism found in data-centric applications. Manycore processors fully utilize available memory resources, thereby increasing IT efficiency by almost an order of magnitude. Because manycore server chips execute a large number of concurrent requests, they exhibit high incidence of accesses to the last-level-cache for fetching instructions (due to large instruction footprints), and off-chip memory (due to lack of temporal reuse in on-chip caches) for accessing dataset objects. As a result, on-chip interconnects and the memory system are emerging as major performance and energy-efficiency bottlenecks in servers. This thesis seeks to architect on-chip interconnects and memory systems that are tuned for the requirements of memory-centric scale-out servers. By studying a wide range of data-centric applications, we uncover application phenomena common in data-centric applications, and examine their implications on on-chip network and off-chip memory traffic. Finally, we propose specialized on-chip interconnects and memory systems that leverage common traffic characteristics, thereby improving server throughput and energy efficiency
    corecore