7 research outputs found
Recommended from our members
The Grand Challenge of Managing the Petascale Facility.
This report is the result of a study of networks and how they may need to evolve to support petascale leadership computing and science. As Dr. Ray Orbach, director of the Department of Energy's Office of Science, says in the spring 2006 issue of SciDAC Review, 'One remarkable example of growth in unexpected directions has been in high-end computation'. In the same article Dr. Michael Strayer states, 'Moore's law suggests that before the end of the next cycle of SciDAC, we shall see petaflop computers'. Given the Office of Science's strong leadership and support for petascale computing and facilities, we should expect to see petaflop computers in operation in support of science before the end of the decade, and DOE/SC Advanced Scientific Computing Research programs are focused on making this a reality. This study took its lead from this strong focus on petascale computing and the networks required to support such facilities, but it grew to include almost all aspects of the DOE/SC petascale computational and experimental science facilities, all of which will face daunting challenges in managing and analyzing the voluminous amounts of data expected. In addition, trends indicate the increased coupling of unique experimental facilities with computational facilities, along with the integration of multidisciplinary datasets and high-end computing with data-intensive computing; and we can expect these trends to continue at the petascale level and beyond. Coupled with recent technology trends, they clearly indicate the need for including capability petascale storage, networks, and experiments, as well as collaboration tools and programming environments, as integral components of the Office of Science's petascale capability metafacility. The objective of this report is to recommend a new cross-cutting program to support the management of petascale science and infrastructure. The appendices of the report document current and projected DOE computation facilities, science trends, and technology trends, whose combined impact can affect the manageability and stewardship of DOE's petascale facilities. This report is not meant to be all-inclusive. Rather, the facilities, science projects, and research topics presented are to be considered examples to clarify a point
Grid scheduling veneer
In the e-Science community, three local resource management systems are commonly used: Condor, OpenPBS and Sun GridEngine. E-Scientists must determine which system is being used to allow them to select which execution site a job should be dispatched to. If they wish to run their applications on an alternative local resource management system, they must rewrite the job description and submit it in a different way. The Grid Scheduling Veneer (GSV) is a middleware solution which provides a single job description language to support a subset of features exhibited by Condor, OpenPBS and SGE. The GSV directs a received job to any of these systems available, based upon current load and job attributes. It also provides additional functionality over each local system. In this dissertation, we discuss the motivation for the GSV project. Subsequently, we describe the methodology and implementation of the system. To test the system, we have designed and deployed a three stage experiment. The experimental results show that the GSV fulfils the requirements and goals of the project
Efficient I/O for Computational Grid Applications
High-performance computing increasingly occurs on computational grids composed of heterogeneous and geographically distributed systems of computers, networks, and storage devices that collectively act as a single virtual computer. A key challenge in this environment is to provide efficient access to data distributed across remote data servers. This dissertation explores some of the issues associated with I/O for wide-area distributed computing and describes an I/O system, called Armada, with the following features: a framework to allow application and dataset providers to flexibly compose graphs of processing modules that describe the distribution, application interfaces, and processing required of the dataset before or after computation; an algorithm to restructure application graphs to increase parallelism and to improve network performance in a wide-area network; and a hierarchical graph-partitioning scheme that deploys components of the application graph in a way that is both beneficial to the application and sensitive to the administrative policies of the different administrative domains. Experiments show that applications using Armada perform well in both low- and high-bandwidth environments, and that our approach does an exceptional job of hiding the network latency inherent in grid computing
A Policy-Based Resource Brokering Environment for Computational Grids
With the advances in networking infrastructure in general, and the Internet in particular, we can build grid environments that allow users to utilize a diverse set of distributed and heterogeneous resources. Since the focus of such environments is the efficient usage of the underlying resources, a critical component is the resource brokering environment that mediates the discovery, access and usage of these resources. With the consumer\u27s constraints, provider\u27s rules, distributed heterogeneous resources and the large number of scheduling choices, the resource brokering environment needs to decide where to place the user\u27s jobs and when to start their execution in a way that yields the best performance for the user and the best utilization for the resource provider.
As brokering and scheduling are very complicated tasks, most current resource brokering environments are either specific to a particular grid environment or have limited features. This makes them unsuitable for large applications with heterogeneous requirements. In addition, most of these resource brokering environments lack flexibility. Policies at the resource-, application-, and system-levels cannot be specified and enforced to provide commitment to the guaranteed level of allocation that can help in attracting grid users and contribute to establishing credibility for existing grid environments.
In this thesis, we propose and prototype a flexible and extensible Policy-based Resource Brokering Environment (PROBE) that can be utilized by various grid systems. In designing PROBE, we follow a policy-based approach that provides PROBE with the intelligence to not only match the user\u27s request with the right set of resources, but also to assure the guaranteed level of the allocation. PROBE looks at the task allocation as a Service Level Agreement (SLA) that needs to be enforced between the resource provider and the resource consumer. The policy-based framework is useful in a typical grid environment where resources, most of the time, are not dedicated. In implementing PROBE, we have utilized a layered architecture and façade design patterns. These along with the well-defined API, make the framework independent of any architecture and allow for the incorporation of different types of scheduling algorithms, applications and platform adaptors as the underlying environment requires. We have utilized XML as a base for all the specification needs. This provides a flexible mechanism to specify the heterogeneous resources and user\u27s requests along with their allocation constraints. We have developed XML-based specifications by which high-level internal structures of resources, jobs and policies can be specified. This provides interoperability in which a grid system can utilize PROBE to discover and use resources controlled by other grid systems.
We have implemented a prototype of PROBE to demonstrate its feasibility. We also describe a test bed environment and the evaluation experiments that we have conducted to demonstrate the usefulness and effectiveness of our approach