
    Survey and Analysis of Production Distributed Computing Infrastructures

    This report has two objectives. First, we describe a set of the production distributed infrastructures currently available, so that the reader has a basic understanding of them. This includes explaining why each infrastructure was created and made available and how it has succeeded and failed. The set is not complete, but we believe it is representative. Second, we describe the infrastructures in terms of their use, which is a combination of how they were designed to be used and how users have found ways to use them. Applications are often designed and created with specific infrastructures in mind, with both an appreciation of the existing capabilities provided by those infrastructures and an anticipation of their future capabilities. Here, the infrastructures we discuss were often designed and created with specific applications in mind, or at least specific types of applications. The reader should understand how the interplay between the infrastructure providers and the users leads to such usages, which we call usage modalities. These usage modalities are really abstractions that exist between the infrastructures and the applications; they influence the infrastructures by representing the applications, and they influence the applications by representing the infrastructures.

    Montage: a grid portal and software toolkit for science-grade astronomical image mosaicking

    Montage is a portable software toolkit for constructing custom, science-grade mosaics by composing multiple astronomical images. The mosaics constructed by Montage preserve the astrometry (position) and photometry (intensity) of the sources in the input images. The mosaic to be constructed is specified by the user in terms of a set of parameters, including dataset and wavelength to be used, location and size on the sky, coordinate system and projection, and spatial sampling rate. Many astronomical datasets are massive, and are stored in distributed archives that are, in most cases, remote with respect to the available computational resources. Montage can be run on both single- and multi-processor computers, including clusters and grids. Standard grid tools are used to run Montage in the case where the data or computers used to construct a mosaic are located remotely on the Internet. This paper describes the architecture, algorithms, and usage of Montage as both a software toolkit and as a grid portal. Timing results are provided to show how Montage performance scales with number of processors on a cluster computer. In addition, we compare the performance of two methods of running Montage in parallel on a grid. Comment: 16 pages, 11 figures

    Many-Task Computing and Blue Waters

    This report discusses many-task computing (MTC) generically and in the context of the proposed Blue Waters system, which is planned to be the largest NSF-funded supercomputer when it begins production use in 2012. The aim of this report is to inform the Blue Waters (BW) project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain and understanding the implications of these aspects for middleware and policies. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, by definition MTC applications are structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications. In particular, different engineering constraints for hardware and software must be met in order to support these applications. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication intensive or data intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O and must minimize task dispatch overheads. The hardware of large-scale HPC systems, with its high degree of parallelism and support for intensive communication, is well suited for MTC applications. However, HPC systems often lack a dynamic resource-provisioning feature, are not ideal for task communication via the file system, and have an I/O system that is not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware.
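
The abstract above describes MTC applications as graphs of discrete tasks whose edges are input and output dependencies. A minimal sketch of that structure, using Python's standard `graphlib` module, is shown here. The task names and the `dispatch_waves` helper are hypothetical and are not taken from the report; the sketch only illustrates how a dispatcher could release tasks as soon as their dependencies have completed.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks whose
# outputs it consumes (the graph edges described in the abstract).
deps = {
    "split": set(),
    "sim_a": {"split"},
    "sim_b": {"split"},
    "merge": {"sim_a", "sim_b"},
}


def dispatch_waves(deps):
    """Group tasks into waves; every task in a wave can run concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Tasks whose dependencies have all been marked done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        # A real dispatcher would launch the tasks here and mark each
        # one done as it finishes; this sketch completes them at once.
        sorter.done(*ready)
    return waves


waves = dispatch_waves(deps)
print(waves)
```

With many very short tasks, the cost of each pass through a loop like this one corresponds to the task dispatch overhead that the report says MTC middleware must minimize.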

    Developing a Coherent Cyberinfrastructure from Local Campus to National Facilities: Challenges and Strategies

    A fundamental goal of cyberinfrastructure (CI) is the integration of computing hardware, software, and network technology, along with data, information management, and human resources to advance scholarship and research. Such integration creates opportunities for researchers, educators, and learners to share ideas, expertise, tools, and facilities in new and powerful ways that cannot be realized if each of these components is applied independently. Bridging the gap between the reality of CI today and its potential in the immediate future is critical to building a balanced CI ecosystem that can support future scholarship and research. This report summarizes the observations and recommendations from a workshop in July 2008 sponsored by the EDUCAUSE Net@EDU Campus Cyberinfrastructure Working Group (CCI) and the Coalition for Academic Scientific Computation (CASC). The invitational workshop was hosted at the University Place Conference Center on the IUPUI campus in Indianapolis. Over 50 individuals representing a cross-section of faculty, senior campus information technology leaders, national lab directors, and other CI experts attended. The workshop focused on the challenges that must be addressed to build a coherent CI from the local to the national level, and the potential opportunities that would result. Both the organizing committee and the workshop participants hope that some of the ideas, suggestions, and recommendations in this report will take hold and be implemented in the community. The goal is to create a better, more supportive, more usable CI environment in the future to advance both scholarship and research.

    A Multilevel Approach to Topology-Aware Collective Operations in Computational Grids

    The efficient implementation of collective communication operations has received much attention. Initial efforts produced "optimal" trees based on network communication models that assumed equal point-to-point latencies between any two processes. This assumption is violated in most practical settings, however, particularly in heterogeneous systems such as clusters of SMPs and wide-area "computational Grids," with the result that collective operations perform suboptimally. In response, more recent work has focused on creating topology-aware trees for collective operations that minimize communication across slower channels (e.g., a wide-area network). While these efforts have significant communication benefits, they all limit their view of the network to only two layers. We present a strategy based upon a multilayer view of the network. By creating multilevel topology-aware trees we take advantage of communication cost differences at every level in the network. We used this strategy to implement topology-aware versions of several MPI collective operations in MPICH-G2, the Globus Toolkit[tm]-enabled version of the popular MPICH implementation of the MPI standard. Using information about topology provided by MPICH-G2, we construct these multilevel topology-aware trees automatically during execution. We present results demonstrating the advantages of our multilevel approach by comparing it to the default (topology-unaware) implementation provided by MPICH and a topology-aware two-layer implementation. Comment: 16 pages, 8 figures
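
The idea of a multilevel topology-aware tree can be illustrated with a short sketch. It is not the MPICH-G2 implementation: the `build_tree` function, the two-level `(site, machine)` coordinates, and the rule of picking the first member of each group as its leader are all assumptions made for illustration. At each level the current root sends once to a leader of every other group, and each leader then repeats the procedure inside its own group at the next, faster level.

```python
def build_tree(root, procs, coords, level=0):
    """Return broadcast edges (sender, receiver) rooted at `root`.

    coords[p] is a tuple naming the group of process p at each network
    level, slowest level first, e.g. (site, machine). Hypothetical layout.
    """
    if level == len(coords[root]):
        # All remaining processes share every level: send directly.
        return [(root, p) for p in procs if p != root]
    groups = {}
    for p in procs:
        groups.setdefault(coords[p][level], []).append(p)
    edges = []
    for key, members in groups.items():
        if key == coords[root][level]:
            leader = root  # the root leads its own group
        else:
            leader = members[0]  # assumed rule: first member leads
            edges.append((root, leader))  # one message over the slow link
        edges += build_tree(leader, members, coords, level + 1)
    return edges


# Six processes on two sites with two machines each (hypothetical layout).
coords = {
    0: ("siteA", "m1"), 1: ("siteA", "m1"), 2: ("siteA", "m2"),
    3: ("siteB", "m3"), 4: ("siteB", "m3"), 5: ("siteB", "m4"),
}
edges = build_tree(0, list(coords), coords)
wide_area = [(s, r) for s, r in edges if coords[s][0] != coords[r][0]]
print(edges)
print(len(wide_area), "wide-area message(s)")
```

In this layout only one message crosses between sites, whereas a flat tree in which process 0 sends directly to every other process would cross the wide-area link three times.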

    National Science Foundation Advisory Committee for Cyberinfrastructure Task Force on Campus Bridging Final Report

    The mission of the National Science Foundation (NSF) Advisory Committee on Cyberinfrastructure (ACCI) is to advise the NSF as a whole on matters related to vision and strategy regarding cyberinfrastructure (CI). In early 2009 the ACCI charged six task forces with making recommendations to the NSF in strategic areas of cyberinfrastructure: Campus Bridging; Cyberlearning and Workforce Development; Data and Visualization; Grand Challenges; High Performance Computing (HPC); and Software for Science and Engineering. Each task force was asked to offer advice on the basis of which the NSF would modify existing programs and create new programs. This document is the final, overall report of the Task Force on Campus Bridging. National Science Foundation

    I-Light Symposium 2005 Proceedings

    I-Light was made possible by a special appropriation by the State of Indiana. The research described at the I-Light Symposium has been supported by numerous grants from several sources. Any opinions, findings and conclusions, or recommendations expressed in the 2005 I-Light Symposium Proceedings are those of the researchers and authors and do not necessarily reflect the views of the granting agencies. Indiana University Office of the Vice President for Research and Information Technology, Purdue University Office of the Vice President for Information Technology and CI

    Metascheduling of HPC Jobs in Day-Ahead Electricity Markets

    High performance grid computing is a key enabler of large scale collaborative computational science. With the promise of exascale computing, high performance grid systems are expected to incur electricity bills that grow super-linearly over time. In order to achieve cost effectiveness in these systems, it is essential for the scheduling algorithms to exploit electricity price variations, both in space and time, that are prevalent in the dynamic electricity price markets. In this paper, we present a metascheduling algorithm to optimize the placement of jobs in a compute grid which consumes electricity from the day-ahead wholesale market. We formulate the scheduling problem as a Minimum Cost Maximum Flow problem and leverage queue waiting time and electricity price predictions to accurately estimate the cost of job execution at a system. Using trace based simulation with real and synthetic workload traces, and real electricity price data sets, we demonstrate our approach on two currently operational grids, XSEDE and NorduGrid. Our experimental setup collectively comprises more than 433K processors spread across 58 compute systems in 17 geographically distributed locations. Experiments show that our approach simultaneously optimizes the total electricity cost and the average response time of the grid, without being unfair to users of the local batch systems. Comment: Appears in IEEE Transactions on Parallel and Distributed Systems
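
To make the Minimum Cost Maximum Flow formulation concrete, the sketch below places jobs on systems with a textbook successive-shortest-path solver. The network layout (source to jobs, jobs to systems, systems to sink), the `MinCostFlow` and `place_jobs` names, and the integer cost table are assumptions for illustration; the paper's actual graph construction and cost model, which combine predicted queue waiting time and electricity price, are not reproduced here.

```python
from collections import deque


class MinCostFlow:
    """Successive shortest paths; edges are [to, capacity, cost, rev_index]."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        flow = cost = 0
        while True:
            # Queue-based Bellman-Ford: tolerates negative residual costs.
            dist = [float("inf")] * self.n
            prev = [None] * self.n
            queued = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                queued[u] = False
                for i, (v, cap, c, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, i)
                        if not queued[v]:
                            queued[v] = True
                            queue.append(v)
            if prev[t] is None:
                return flow, cost  # no augmenting path is left
            # Find the bottleneck capacity along the path, then push flow.
            push, v = float("inf"), t
            while v != s:
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            v = t
            while v != s:
                u, i = prev[v]
                self.graph[u][i][1] -= push
                self.graph[v][self.graph[u][i][3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]


def place_jobs(costs, capacity):
    """costs[j][s]: assumed cost of job j on system s; capacity[s]: free slots."""
    n_jobs, n_sys = len(costs), len(capacity)
    source, sink = 0, n_jobs + n_sys + 1
    net = MinCostFlow(sink + 1)
    for j in range(n_jobs):
        net.add_edge(source, 1 + j, 1, 0)
        for s in range(n_sys):
            net.add_edge(1 + j, 1 + n_jobs + s, 1, costs[j][s])
    for s in range(n_sys):
        net.add_edge(1 + n_jobs + s, sink, capacity[s], 0)
    placed, total = net.solve(source, sink)
    # A saturated job-to-system edge marks where the job was placed.
    assignment = {}
    for j in range(n_jobs):
        for v, cap, _, _ in net.graph[1 + j]:
            if v > n_jobs and cap == 0:
                assignment[j] = v - 1 - n_jobs
    return placed, total, assignment


# Hypothetical costs for three jobs on two systems with two slots each.
placed, total, assignment = place_jobs([[4, 6], [3, 7], [5, 6]], [2, 2])
print(placed, total, assignment)
```

All three jobs are cheapest on system 0, but it has only two slots, so the solver moves the job whose relocation costs least (job 2) to system 1.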