    Energy aware approach for HPC systems

    High-performance computing (HPC) systems require energy throughout their full life cycle, from design and production through transportation and usage to recycling/dismantling. With growing ecological and cost awareness, energy performance is now a primary focus. This chapter focuses on the usage phase of HPC and on how adapted, optimized software solutions can improve energy efficiency. It provides a detailed explanation of server power consumption and discusses HPC applications, phase detection, and phase identification. The chapter also argues that load and memory access profiles alone are insufficient for an effective evaluation of the power consumed by an application. The leverages available in HPC systems are presented in detail. The chapter proposes solutions for modeling the power consumption of servers, which allows designing power prediction models for better decision making. These approaches allow the deployment and use of a set of available green leverages, permitting energy reduction.
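    To make the idea of a server power prediction model concrete, here is a minimal sketch in the spirit of the abstract's discussion. It assumes a simple linear model combining idle power, CPU load, and memory access rate; the function name, units, and coefficient values are illustrative, not taken from the chapter.

```python
# Hypothetical linear power model: P = P_idle + a*cpu_load + b*mem_rate.
# All coefficients are illustrative; in practice they would be fitted
# from measurements on the target server.

def estimate_power(cpu_load, mem_access_rate,
                   p_idle=80.0, a=1.2, b=0.004):
    """Estimate server power draw in watts.

    cpu_load: CPU utilization in percent (0-100).
    mem_access_rate: memory accesses per microsecond (illustrative unit).
    """
    return p_idle + a * cpu_load + b * mem_access_rate

print(estimate_power(0, 0))        # idle server: 80.0 W
print(estimate_power(100, 5000))   # loaded server: 220.0 W
```

    A model of this shape also illustrates the abstract's caveat: if power depends on more than load and memory profiles (e.g., frequency scaling or I/O activity), a two-term model like this one will mispredict.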

    Proceedings of the Sixth International Workshop on Web Caching and Content Distribution

    OVERVIEW: The International Web Content Caching and Distribution Workshop (WCW) is a premier technical meeting for researchers and practitioners interested in all aspects of content caching, distribution, and delivery on the Internet. The 2001 WCW meeting was held on the Boston University campus. Building on the successes of the five previous WCW meetings, WCW'01 featured a strong technical program and record participation from leading researchers and practitioners in the field. This report includes all the technical papers presented at WCW'01. Note: Proceedings of WCW'01 are published by Elsevier. Hardcopies of these proceedings can be purchased through the workshop organizers. As a service to the community, electronic copies of all WCW'01 papers are accessible through Technical Report BUCS‐TR‐2001‐017, available from the Boston University Computer Science Technical Report Archives at http://www.cs.bu.edu/techreps. [Ed. note: URL outdated. Use http://www.bu.edu/cs/research/technical-reports/ or http://hdl.handle.net/2144/1455 in this repository to access the reports.]
    Sponsors: Cisco Systems; InfoLibria; Measurement Factory Inc; Voler

    Optimizing TCP Forwarding

    Abstract—The continued growth of the web places ever-increasing performance demands on web site front-end appliances. In many cases, these appliances have to forward network traffic to and from web servers at the transport and application levels using complete TCP/IP stack processing, which can easily make the front-end appliance a bottleneck for a web site. This paper describes four novel optimizations of TCP/IP stack processing for a TCP forwarding appliance: acknowledgement aggregation, a fast path for incoming packets, double-allocation avoidance in the TCP module, and packet reuse. These optimizations are applicable for different throughputs and MTU sizes on the forwarding path when traditional approaches to performance improvement, such as TCP splicing, cannot be applied. They are implemented in the context of a web booster appliance that modifies web site traffic to decrease the cost of network processing on a web server. These four optimizations, applied together with device driver polling, result in a four-fold improvement in appliance throughput compared to a base case.
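    The first optimization the abstract names, acknowledgement aggregation, exploits the fact that TCP ACKs are cumulative: a forwarder can coalesce several pending ACKs for a connection into a single ACK carrying the highest acknowledgement number. The sketch below illustrates only this coalescing idea; the class and method names are hypothetical and not from the paper.

```python
# Illustrative sketch of acknowledgement aggregation: because TCP ACKs are
# cumulative, forwarding only the highest pending ACK per connection is
# equivalent to forwarding every ACK, but costs one packet instead of many.

class AckAggregator:
    def __init__(self):
        self.pending = {}  # connection id -> highest cumulative ACK seen

    def observe(self, conn_id, ack_no):
        """Record an ACK arriving at the forwarder, keeping only the max."""
        cur = self.pending.get(conn_id)
        if cur is None or ack_no > cur:
            self.pending[conn_id] = ack_no

    def flush(self, conn_id):
        """Return the single ACK to forward, replacing all pending ones."""
        return self.pending.pop(conn_id, None)

agg = AckAggregator()
for ack in (1000, 2460, 3920):   # three ACKs arriving back-to-back
    agg.observe("c1", ack)
print(agg.flush("c1"))  # 3920: one forwarded ACK replaces three
```

    A real implementation would bound how long ACKs are held before flushing, since delaying ACKs too long interacts with the sender's congestion control.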

    Web Server Performance in a WAN Environment

    Abstract—This work analyzes web server performance under simulated WAN conditions. The workload simulates many critical network characteristics, such as network delay, bandwidth limits on client connections, and the small MTU sizes of dial-up clients. A novel aspect of this study is the examination of the internal behavior of the web server at the network protocol stack and the device driver. Many known server design optimizations for performance improvement were evaluated in this simulated environment. We discovered that WAN network characteristics can significantly change the behavior of the web server compared to LAN-based simulations and make many server design optimizations irrelevant in this environment. In particular, we found that the small MTU size of dial-up user connections can increase processing overhead several times. At the same time, network delay, connection bandwidth limits, and the use of HTTP/1.1 persistent connections do not have a significant effect on server performance. We found little benefit from copy and checksum avoidance, optimization of request concurrency management, and connection open/close avoidance under a workload with small MTU sizes, which is common for dial-up users.
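    The MTU effect the abstract highlights can be illustrated with a back-of-the-envelope calculation: a smaller MTU means more segments, and therefore more per-packet processing, for the same response. The function and sizes below are illustrative assumptions, not figures from the paper.

```python
import math

# Rough illustration of why small MTUs raise per-request processing cost:
# segment count grows as MTU shrinks, and much of the stack's overhead is
# paid per packet rather than per byte.

def segments_needed(response_bytes, mtu, ip_tcp_headers=40):
    mss = mtu - ip_tcp_headers        # payload bytes per segment
    return math.ceil(response_bytes / mss)

resp = 14_600                          # a hypothetical 14.6 KB response
print(segments_needed(resp, 1500))     # 10 segments at Ethernet MTU
print(segments_needed(resp, 576))      # 28 segments at a dial-up MTU
```

    Nearly tripling the segment count for the same payload is consistent with the paper's observation that small-MTU workloads can multiply processing overhead while byte-oriented optimizations (copy and checksum avoidance) help little.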

    Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations

    A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as iterative grid computations, recursive fork/join programs, the bodies of parallel FOR loops, and the implicit parallelism in functional or dataflow languages. It is useful both to describe massively parallel computations and as a target for code generation by compilers. However, fine-grain parallelism has long been thought to be inefficient due to the overheads of process creation, context switching, and synchronization. This paper describes a software kernel, Distributed Filaments (DF), that implements fine-grain parallelism both portably and efficiently on a workstation cluster. DF runs on existing, off-the-shelf hardware and software. It has a simple interface, so it is easy to use. DF achieves efficiency by using stateless threads on each node, overlapping communication and computation, emp..
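    The contrast the abstract draws can be sketched in a few lines: fine-grain parallelism runs many tiny tasks, here one per grid row of an iterative grid computation, on a small pool of long-lived workers, so no per-task process-creation cost is paid. This is a generic illustration of the fine-grain style using Python's standard thread pool, not the Distributed Filaments kernel or its API.

```python
from concurrent.futures import ThreadPoolExecutor

# Fine-grain style: 100 tiny tasks (a few instructions each) mapped onto
# 4 long-lived workers, avoiding per-task process creation and teardown.

def relax_row(row):
    """A tiny task: average each interior cell with its neighbours."""
    return [row[0]] + [(row[i - 1] + row[i] + row[i + 1]) / 3
                       for i in range(1, len(row) - 1)] + [row[-1]]

grid = [[float(i + j) for j in range(8)] for i in range(100)]

with ThreadPoolExecutor(max_workers=4) as pool:
    new_grid = list(pool.map(relax_row, grid))

print(len(new_grid), len(new_grid[0]))  # 100 8
```

    A cluster kernel like DF additionally has to overlap communication with computation when neighbouring rows live on different nodes, which is where stateless threads and cheap context switching matter.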