    Using Fine-Grained Cycle Stealing to Improve Throughput, Efficiency and Response Time on a Dedicated Cluster while Maintaining Quality of Service

    For various reasons, a dedicated cluster is not always fully utilized even when all of its processors are allocated to jobs. This occurs any time that a running job does not use 100% of each of the processors allocated to it. Keeping in mind the needs of both the cluster’s system administrators and its users, we would like to increase the throughput and efficiency of the cluster while maintaining or improving the average turnaround time of the jobs and the quality of service of the “primary” jobs originally scheduled on the cluster. To increase the throughput and efficiency of the cluster, we schedule background jobs to run concurrently with the primary jobs. However, to achieve our goal of maintaining or improving the average turnaround time of the jobs and the quality of service of the primary jobs, we investigate two methods of prioritizing the CPU usage of the primary and background jobs. The first method uses the existing “nice” mechanism in the 2.4 Linux kernel to give background processes a lower priority than primary processes. The second method involves modifying the 2.4 Linux kernel’s CPU scheduler to create a new guest process priority that prevents guest processes from running when primary processes are runnable. Our results come from empirical investigations using real production applications. Production runs using these applications are regularly performed in the dedicated cluster environment that we used for testing. Measurements of various statistics, such as wall time and CPU time, were taken directly from test runs of these same production applications, which allowed comparison with results from models and synthetic applications. We found that using the existing nice mechanism significantly improves the throughput, efficiency and average turnaround time of the cluster but only at the expense of the quality of service of the primary jobs (primary job running times increased 5-25%). On the other hand, we can use the guest process priority to get similar improvements in throughput, efficiency and average turnaround time while not significantly impacting the quality of service of the primary jobs (primary job running times changed less than 1%).
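
    As a rough illustration of the first approach only (the stock nice mechanism, not the modified scheduler), the sketch below shows how a background job might be started from user space at the weakest conventional priority. It is not taken from the paper; the ./guest_job path is a placeholder.

/*
 * Hedged sketch: launch a "guest" job at nice 19 so that primary
 * processes keep nearly all of the CPU under the stock scheduler.
 * The guest job binary (./guest_job) is a placeholder, not part of
 * the original study.
 */
#include <stdio.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {
        /* Child: drop to nice 19, the lowest priority the standard
         * mechanism offers, before exec'ing the guest application. */
        if (setpriority(PRIO_PROCESS, 0, 19) != 0)
            perror("setpriority");
        execlp("./guest_job", "guest_job", (char *)NULL);
        perror("execlp");               /* reached only if exec fails */
        _exit(127);
    }
    /* Parent: wait for the guest job and propagate its exit status. */
    int status;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
}

    The second approach has no user-space equivalent: the guest priority described above requires patching the 2.4 kernel's scheduler so that guest processes are never chosen while a primary process is runnable.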

    Scientific Application Requirements for Leadership Computing at the Exascale

    The Department of Energy's Leadership Computing Facility, located at Oak Ridge National Laboratory's National Center for Computational Sciences, recently polled scientific teams that had large allocations at the center in 2007, asking them to identify computational science requirements for future exascale systems (capable of an exaflop, or 10^18 floating point operations per second). These requirements are necessarily speculative, since an exascale system will not be realized until the 2015-2020 timeframe, and are expressed where possible relative to a recent petascale requirements analysis of similar science applications [1].

    Our initial findings, which call for further data collection, validation, and analysis, did in fact align with many of our expectations and existing petascale requirements, yet they also contained some surprises, complete with new challenges and opportunities. First and foremost, the breadth and depth of science prospects and benefits on an exascale computing system are striking. Without a doubt, they justify a large investment, even with its inherent risks. The possibilities for return on investment (by any measure) are too large to let us ignore this opportunity.

    The software opportunities and challenges are enormous. In fact, as one notable computational scientist put it, the scale of questions being asked at the exascale is tremendous and the hardware has gotten way ahead of the software. We are in grave danger of failing because of a software crisis unless concerted investments and coordinating activities are undertaken to reduce and close this hardware-software gap over the next decade. Key to success will be a rigorous requirement for natural mapping of algorithms to hardware in a way that complements (rather than competes with) compilers and runtime systems. The level of abstraction must be raised, and more attention must be paid to functionalities and capabilities that incorporate intent into data structures, are aware of memory hierarchy, possess fault tolerance, exploit asynchronism, and are power-consumption aware. On the other hand, we must also provide application scientists with the ability to develop software without having to become experts in the computer science components.

    Numerical algorithms are scattered broadly across science domains, with no one particular algorithm being ubiquitous and no one algorithm going unused. Structured grids and dense linear algebra continue to dominate, but other algorithm categories will become more common. A significant increase is projected for Monte Carlo algorithms, unstructured grids, sparse linear algebra, and particle methods, and a relative decrease is foreseen for fast Fourier transforms. These projections reflect the expectation of much higher architecture concurrency and the resulting need for very high scalability. The new algorithm categories that application scientists expect to be increasingly important in the next decade include adaptive mesh refinement, implicit nonlinear systems, data assimilation, agent-based methods, parameter continuation, and optimization.

    The attributes of leadership computing systems expected to increase most in priority over the next decade are (in order of importance) interconnect bandwidth, memory bandwidth, mean time to interrupt, memory latency, and interconnect latency. The attributes expected to decrease most in relative priority are disk latency, archival storage capacity, disk bandwidth, wide area network bandwidth, and local storage capacity. These choices by application developers reflect the expected needs of applications or the expected reality of available hardware. One interpretation is that the increasing priorities reflect the desire to increase computational efficiency to take advantage of increasing peak flops [floating point operations per second], while the decreasing priorities reflect the expectation that computational efficiency will not increase. Per-core requirements appear to be relatively static, while aggregate requirements will grow with the system. This projection is consistent with a relatively small increase in performance per core and a dramatic increase in the number of cores.

    Leadership system software must face and overcome issues that will undoubtedly be exacerbated at the exascale. The operating system (OS) must be as unobtrusive as possible and possess more stability, reliability, and fault tolerance during application execution. Because applications at the exascale will be more likely to experience loss of resources during an execution, the OS must mitigate such a loss with a range of responses. New fault tolerance paradigms must be developed and integrated into applications. Just as application input and output must not be an afterthought in hardware design, job management, too, must not be an afterthought in system software design. Efficient scheduling of those resources will be a major obstacle faced by leadership computing centers at the exascale.
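
    As a hedged illustration of one application-level fault tolerance paradigm of the kind called for above, the sketch below shows simple periodic checkpoint/restart. It is not taken from the report; the state layout, file name, and checkpoint interval are invented for the example.

/*
 * Illustrative checkpoint/restart skeleton (assumed layout, not from
 * the report): the loop periodically writes its iteration counter and
 * state to disk, and on startup resumes from the last checkpoint if
 * one exists.
 */
#include <stdio.h>
#include <stdlib.h>

#define N          1000000      /* size of the hypothetical state       */
#define STEPS      10000        /* total iterations                     */
#define CKPT_EVERY 500          /* checkpoint interval, in iterations   */
#define CKPT_FILE  "state.ckpt"

static int load_checkpoint(double *state, long *step)
{
    FILE *f = fopen(CKPT_FILE, "rb");
    if (!f)
        return 0;                               /* cold start */
    int ok = fread(step, sizeof *step, 1, f) == 1 &&
             fread(state, sizeof *state, N, f) == N;
    fclose(f);
    return ok;
}

static void save_checkpoint(const double *state, long next_step)
{
    FILE *f = fopen(CKPT_FILE ".tmp", "wb");
    if (!f)
        return;
    fwrite(&next_step, sizeof next_step, 1, f);
    fwrite(state, sizeof *state, N, f);
    fclose(f);
    rename(CKPT_FILE ".tmp", CKPT_FILE);        /* atomic replace */
}

int main(void)
{
    double *state = calloc(N, sizeof *state);
    long step = 0;
    if (!state)
        return 1;
    if (load_checkpoint(state, &step))
        printf("restarting from step %ld\n", step);
    for (; step < STEPS; ++step) {
        for (long i = 0; i < N; ++i)            /* stand-in for real work */
            state[i] += 1.0;
        if ((step + 1) % CKPT_EVERY == 0)
            save_checkpoint(state, step + 1);   /* resume at next step */
    }
    free(state);
    return 0;
}

    On a restart after a lost node or an interrupted run, the job resumes from the last saved step rather than from the beginning, trading periodic I/O for reduced recomputation.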

    Faculty Publications and Creative Works 2003

    Faculty Publications & Creative Works is an annual compendium of scholarly and creative activities of University of New Mexico faculty during the noted calendar year. It serves to illustrate the robust and active intellectual pursuits conducted by the faculty in support of teaching and research at UNM.

    A Performance Comparison of Linux and a Lightweight Kernel

    In this paper, we compare running the Linux operating system on the compute nodes of ASCI Red hardware to running a specialized, highly optimized lightweight kernel (LWK) operating system. We have ported Linux to the compute and service nodes of the ASCI Red supercomputer, and have run several benchmarks. We present performance and scalability results for Linux compared with the LWK environment. To our knowledge, this is the first direct comparison on identical hardware of Linux and an operating system designed specifically for large-scale supercomputers. In addition to presenting these results, we will discuss the limitations of both operating systems, in terms of the empirical evidence as well as other important factors.
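
    For context, the sketch below shows the kind of MPI ping-pong microbenchmark commonly used to compare message latency across operating environments on the same hardware. It is not taken from the paper; the message size and repetition count are arbitrary choices.

/*
 * Illustrative MPI ping-pong latency microbenchmark (not from the
 * paper).  Rank 0 and rank 1 bounce a small message back and forth
 * and report the average round-trip time.
 */
#include <mpi.h>
#include <stdio.h>

#define MSG_BYTES 8
#define REPS      1000

int main(int argc, char **argv)
{
    int rank, size;
    char buf[MSG_BYTES] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0)
            fprintf(stderr, "needs at least 2 ranks\n");
        MPI_Finalize();
        return 1;
    }

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < REPS; ++i) {
        if (rank == 0) {
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("average round-trip time: %.3f us\n",
               (t1 - t0) / REPS * 1e6);

    MPI_Finalize();
    return 0;
}

    Run under each operating environment on the same nodes, differences in the reported round-trip time would reflect the system software rather than the hardware, which is the point of a comparison on identical hardware.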