
    Jigsaw: Scalable Software-Defined Caches (Extended Version)

    Shared last-level caches, widely used in chip-multiprocessors (CMPs), face two fundamental limitations. First, the latency and energy of shared caches degrade as the system scales up. Second, when multiple workloads share the CMP, they suffer from interference in shared cache accesses. Unfortunately, prior research addressing one issue either ignores or worsens the other: NUCA techniques reduce access latency but are prone to hotspots and interference, and cache partitioning techniques only provide isolation but do not reduce access latency. We present Jigsaw, a technique that jointly addresses the scalability and interference problems of shared caches. Hardware lets software define shares, collections of cache bank partitions that act as virtual caches, and map data to shares. Shares give software full control over both data placement and capacity allocation. Jigsaw implements efficient hardware support for share management, monitoring, and adaptation. We propose novel resource-management algorithms and use them to develop a system-level runtime that leverages Jigsaw to both maximize cache utilization and place data close to where it is used. We evaluate Jigsaw using extensive simulations of 16- and 64-core tiled CMPs. Jigsaw improves performance by up to 2.2x (18% avg) over a conventional shared cache, and significantly outperforms state-of-the-art NUCA and partitioning techniques. This work was supported in part by DARPA PERFECT contract HR0011-13-2-0005 and Quanta Computer.
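
    The runtime's capacity-allocation step can be pictured with a simple greedy scheme: repeatedly grant one unit of cache capacity to the share whose miss curve falls the most. Below is a minimal sketch of that idea; the share names, miss curves, and the plain greedy policy are illustrative assumptions, not the resource-management algorithms the paper actually proposes.

```python
# Hypothetical sketch of utility-based capacity allocation across "shares"
# (virtual caches). Miss curves give misses per 1k accesses as a function
# of allocated ways; all numbers are made up for illustration.

def allocate_capacity(miss_curves, total_ways):
    """Greedily hand out cache ways one at a time to the share whose
    miss rate drops the most from one extra way (highest marginal utility)."""
    alloc = {share: 0 for share in miss_curves}
    for _ in range(total_ways):
        def gain(share):
            w = alloc[share]
            curve = miss_curves[share]
            if w + 1 >= len(curve):
                return 0.0
            return curve[w] - curve[w + 1]  # misses avoided by one more way
        best = max(alloc, key=gain)
        alloc[best] += 1
    return alloc

# A streaming share barely benefits from capacity; a cache-sensitive
# share benefits a lot, so the allocator concentrates ways there.
curves = {
    "streaming": [100, 99, 98, 97, 96, 95, 94, 93, 92],
    "pointer-chasing": [100, 60, 35, 20, 12, 8, 6, 5, 4],
}
print(allocate_capacity(curves, 8))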

    On the Design of Real-Time Systems on Multi-Core Platforms under Uncertainty

    Real-time systems are computing systems that must guarantee not only the logical correctness of computational results but also the timing of these results. To ensure timing constraints, traditional real-time system designs usually adopt a worst-case-based deterministic approach. However, such an approach is becoming out of sync with the continuous evolution of IC technology and the increased complexity of real-time applications. As IC technology continues to evolve into the deep sub-micron domain, process variation causes processor performance to vary from die to die, chip to chip, and even core to core. The extensive resource sharing on multi-core platforms also significantly increases the uncertainty when executing real-time tasks. The traditional approach can only lead to extremely pessimistic and, thus, impractical designs of real-time systems. Our research seeks to address the uncertainty problem when designing real-time systems on multi-core platforms. We first attacked the uncertainty problem caused by process variation. We proposed a virtualization framework and developed techniques to optimize the system's performance under process variation. We further studied the problem of peak temperature minimization for real-time applications on multi-core platforms. Three heuristics were developed to reduce the peak temperature for real-time systems. Next, we sought to address the uncertainty in real-time task execution times by developing statistical real-time scheduling techniques. We studied the problem of fixed-priority real-time scheduling of implicit-deadline periodic tasks with probabilistic execution times on multi-core platforms. We further extended our research to tasks with explicit deadlines. We extended the concept of harmonic periods to a more general class of task sets, i.e., tasks with explicit deadlines, and developed new task partitioning techniques. Throughout our research, we have conducted extensive simulations to study the effectiveness and efficiency of the developed techniques. The increasing process variation and the ever-increasing scale and complexity of real-time systems both demand a paradigm shift in the design of real-time applications. Effectively dealing with uncertainty in the design of real-time applications is a challenging but critical problem. Our research is an effort in this endeavor, and we conclude this dissertation with discussions of potential future work.
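
    As a point of reference for the task partitioning discussed above, the sketch below shows the classic worst-case baseline it improves on: first-fit partitioned rate-monotonic scheduling with the Liu & Layland utilization bound. The task parameters are made up, and the dissertation's probabilistic and harmonic-aware techniques go beyond this deterministic baseline.

```python
# Hedged sketch: partitioned fixed-priority (rate-monotonic) scheduling of
# implicit-deadline periodic tasks, using first-fit over the Liu & Layland
# bound n*(2^(1/n) - 1). Task parameters are illustrative assumptions.

def ll_bound(n):
    """Liu & Layland schedulability bound for n tasks under RM."""
    return n * (2 ** (1.0 / n) - 1)

def first_fit_rm(tasks, num_cores):
    """tasks: list of (wcet, period). Returns core -> task list, or None."""
    cores = [[] for _ in range(num_cores)]
    # Place heavier tasks first (decreasing utilization).
    for wcet, period in sorted(tasks, key=lambda t: t[0] / t[1], reverse=True):
        for core in cores:
            util = sum(c / p for c, p in core) + wcet / period
            if util <= ll_bound(len(core) + 1):  # schedulable under RM
                core.append((wcet, period))
                break
        else:
            return None  # no feasible partition found by first-fit
    return cores

tasks = [(1, 4), (2, 5), (1, 10), (3, 9), (2, 8)]
print(first_fit_rm(tasks, num_cores=2))
```

    The pessimism of this baseline is exactly the target of the statistical approach: a probabilistic execution-time model can admit task sets that the worst-case bound rejects.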

    A policy-based architecture for virtual network embedding

    Network virtualization is a technology that enables multiple virtual instances to coexist on a common physical network infrastructure. This paradigm has fostered new business models, allowing infrastructure providers to lease or share their physical resources. Each virtual network is isolated and can be customized to support a new class of customers and applications. To this end, infrastructure providers need to embed virtual networks on their infrastructure. Virtual network embedding is the (NP-hard) problem of matching constrained virtual networks onto a physical network. Heuristics to solve the embedding problem have exploited several policies under different settings. For example, centralized solutions have been devised for small enterprise physical networks, while distributed solutions have been proposed over larger federated wide-area networks. In this thesis we present a policy-based architecture for the virtual network embedding problem. By policy, we mean a variant aspect of any of the three (invariant) embedding mechanisms: physical resource discovery, virtual network mapping, and allocation on the physical infrastructure. Our architecture adapts to different scenarios by instantiating appropriate policies, and has provable bounds on embedding efficiency and on embedding convergence time, over a single provider or across multiple federated providers. The performance of representative novel and existing policy configurations is compared via extensive simulations and over a prototype implementation. We also present an object model as a foundation for a protocol specification, and we release a testbed that enables users to test their own embedding policies and to run applications within their virtual networks. The testbed uses a Linux system architecture to reserve virtual node and link capacities.
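
    To make the notion of a pluggable mapping policy concrete, here is a minimal sketch of one such policy: greedy node mapping by residual CPU capacity, followed by shortest-path (hop-count) link mapping that charges bandwidth along the chosen path. The topologies, capacities, and the policy itself are illustrative assumptions rather than the specific policies evaluated in the thesis.

```python
# Toy instance of the two variant embedding steps (node mapping, link
# mapping); resource discovery is assumed done. All inputs are made up.
from collections import deque

def bfs_path(adj, src, dst):
    """Shortest path by hop count in the physical network."""
    prev, seen, q = {}, {src}, deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            path = [u]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return path[::-1]
        for v in adj[u]:
            if v not in seen:
                seen.add(v); prev[v] = u; q.append(v)
    return None

def embed(vnodes, vlinks, cpu, adj, bw):
    """vnodes: {vnode: cpu demand}; vlinks: {(v1, v2): bandwidth demand}."""
    mapping = {}
    for v, demand in sorted(vnodes.items(), key=lambda kv: -kv[1]):
        # Greedy policy: largest residual CPU first, one vnode per host.
        host = max((n for n in cpu if cpu[n] >= demand and n not in mapping.values()),
                   key=lambda n: cpu[n], default=None)
        if host is None:
            return None  # reject the request: not enough node capacity
        mapping[v] = host
        cpu[host] -= demand
    for (a, b), demand in vlinks.items():
        path = bfs_path(adj, mapping[a], mapping[b])
        if path is None:
            return None
        for u, w in zip(path, path[1:]):
            bw[frozenset((u, w))] -= demand  # charge bandwidth on each hop
    return mapping

cpu = {"A": 10, "B": 8, "C": 6}
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
bw = {frozenset(("A", "B")): 10, frozenset(("B", "C")): 10}
print(embed({"x": 5, "y": 4}, {("x", "y"): 3}, cpu, adj, bw))
```

    Swapping the sort key, the host filter, or the path-selection routine yields a different policy over the same invariant mechanisms, which is the point of the architecture.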

    Thermal-Aware Networked Many-Core Systems

    Advancements in IC processing technology have led to innovation and growth in the consumer electronics sector and the evolution of the IT infrastructure supporting this exponential growth. One of the most difficult obstacles to this growth is the removal of the large amount of heat generated by the processing and communicating nodes of the system. The scaling down of technology and the increase in power density have a direct and consequential effect on the rise in temperature. This has resulted in increased cooling budgets, and affects both the lifetime reliability and the performance of the system. Hence, reducing on-chip temperatures has become a major design concern for modern microprocessors. This dissertation addresses the thermal challenges at different levels for both 2D planar and 3D stacked systems. It proposes a self-timed thermal monitoring strategy based on the liberal use of on-chip thermal sensors, which makes use of noise-variation-tolerant and leakage-current-based thermal sensing for monitoring purposes. In order to study thermal management issues from early design stages, accurate thermal modeling and analysis at design time is essential. In this regard, the spatial temperature profile of global Cu nanowires for on-chip interconnects has been analyzed. The dissertation presents a 3D thermal model of a multicore system in order to investigate the effects of hotspots, and of the placement of silicon die layers, on the thermal performance of a modern flip-chip package. For a 3D stacked system, the primary design goal is to maximise performance within the given power and thermal envelopes. Hence, a thermally efficient routing strategy for 3D NoC-Bus hybrid architectures has been proposed to mitigate on-chip temperatures by herding most of the switching activity to the die closest to the heat sink. Finally, an exploration of various thermal-aware placement approaches for both 2D and 3D stacked systems is presented. Various thermal models have been developed and thermal control metrics have been extracted. An efficient thermal-aware application mapping algorithm for a 2D NoC has been presented. It has been shown that the proposed mapping algorithm reduces the effective chip area subject to high temperatures when compared to the state of the art.
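
    The flavor of thermal-aware mapping can be conveyed with a toy greedy heuristic: place the highest-power tasks first, each on the coolest free tile of the mesh, where every placement heats its own tile and, more weakly, its neighbours. This is only a hedged stand-in for the dissertation's mapping algorithm; the task power numbers and thermal weights are assumptions.

```python
# Hypothetical sketch of thermal-aware task mapping on a 2D NoC mesh.
# Temperature is a crude additive proxy, not a real thermal model.

def thermal_aware_map(task_power, rows, cols, neighbour_leak=0.25):
    temp = {(r, c): 0.0 for r in range(rows) for c in range(cols)}
    mapping = {}
    for task, power in sorted(task_power.items(), key=lambda kv: -kv[1]):
        tile = min((t for t in temp if t not in mapping.values()),
                   key=lambda t: temp[t])          # coolest free tile
        mapping[task] = tile
        r, c = tile
        temp[tile] += power                        # direct self-heating
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n = (r + dr, c + dc)
            if n in temp:
                temp[n] += neighbour_leak * power  # lateral heat spread
    return mapping

tasks = {"decode": 3.0, "fft": 2.5, "dma": 1.0, "ctrl": 0.5}
print(thermal_aware_map(tasks, rows=2, cols=2))
```

    On the 2x2 mesh in the example, the two hottest tasks land on diagonally opposite tiles, spreading heat across the die rather than concentrating it in one corner.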

    Structural issues and energy efficiency in data centers

    With the rise of cloud computing, data centers have come to play a central role in today's Internet. Despite this relevance, they are probably still far from their zenith, given the ever-increasing demand for contents to be stored in and distributed by the cloud, the need for computing power, and the larger and larger amounts of data being analyzed by top companies such as Google, Microsoft or Amazon. However, everything is not a bed of roses. Operating a data center entails two major issues: data centers are terribly expensive to build, and they consume huge amounts of power and are, therefore, terribly expensive to maintain. For this reason, cutting down the cost of building data centers and increasing their energy efficiency (and hence reducing their carbon footprint) has been one of the hottest research topics in recent years. In this thesis we propose different techniques that can have an impact on both the building and the maintenance costs of data centers of any size, from small-scale to large flagship data centers. The first part of the thesis is devoted to structural issues. We start by analyzing the bisection (band)width of a topology, of product graphs in particular, a useful parameter to compare and choose among different data center topologies. In the same part we describe the problem of deploying the servers in a data center as a Multidimensional Arrangement Problem (MAP) and propose a heuristic to reduce the deployment and wiring costs. We target energy efficiency in data centers in the second part of the thesis. We first propose a method to reduce the energy consumption of the data center network: rate adaptation. Rate adaptation is based on the idea of energy proportionality and aims to make network devices consume power proportionally to the load on their links. Our analysis proves that rate adaptation alone may achieve average energy savings on the order of 30-40%, and up to 60%, depending on the network topology. We continue by characterizing the power requirements of a data center server, given that, in order to properly increase the energy efficiency of a data center, we first need to understand how energy is being consumed. We present an exhaustive empirical characterization of the power requirements of multiple components of data center servers, namely the CPU, the disks, and the network card. To do so, we devise different experiments to stress these components, taking into account the multiple available frequencies as well as the fact that we are working with multicore servers. In these experiments, we measure their energy consumption and identify their optimal operational points. Our study proves that the curve that defines the minimal power consumption of the CPU, as a function of the load in Active Cycles Per Second (ACPS), is neither concave nor purely convex; moreover, it definitively has a superlinear dependence on the load. We also validate the accuracy of the model derived from our characterization by running different Hadoop applications in diverse scenarios, obtaining an error below 4.1% on average. The last topic we study is the Virtual Machine Assignment problem (VMA), i.e., optimizing how virtual machines (VMs) are assigned to physical machines (PMs) in data centers. Our optimization target is to minimize the power consumed by all the PMs, considering that power consumption depends superlinearly on the load.
    We study four different VMA problems, depending on whether the number of PMs and their capacity are bounded or not. We study their complexity and perform an offline and online analysis of these problems. The online analysis is complemented with simulations showing that the online algorithms we propose consume substantially less power than other state-of-the-art assignment algorithms.
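
    The superlinear power model at the heart of the VMA formulation can be sketched as follows: each powered-on PM draws P(load) = P_idle + a * load^alpha with alpha > 1, and a simple online heuristic assigns each arriving VM to the PM whose total power grows the least. The constants, VM sizes, and the greedy rule below are illustrative assumptions, not the thesis's measured characterization or its analyzed algorithms.

```python
# Hedged sketch of online VM assignment under a superlinear power curve.

P_IDLE, A, ALPHA = 100.0, 50.0, 1.8    # assumed power-curve parameters

def power(load):
    """Assumed PM power model; powered-off PMs (load 0) draw nothing."""
    return P_IDLE + A * load ** ALPHA if load > 0 else 0.0

def assign(vms, num_pms, capacity=1.0):
    loads = [0.0] * num_pms
    placement = []
    for vm in vms:                      # VMs arrive one by one (online)
        def delta(i):                   # power increase if vm lands on PM i
            if loads[i] + vm > capacity:
                return float("inf")
            return power(loads[i] + vm) - power(loads[i])
        best = min(range(num_pms), key=delta)
        loads[best] += vm
        placement.append(best)
    return placement, sum(power(l) for l in loads)

print(assign([0.5, 0.4, 0.3, 0.3], num_pms=3))
```

    Note the tension the model captures: a large idle power pushes the heuristic toward consolidation, while a steeper superlinear term (larger alpha) pushes it toward spreading load across PMs.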