Search CORE

18,134 research outputs found

Towards Loosely-Coupled Programming on Petascale Systems

Author: Beckman Pete
Clifford Ben
Foster Ian
Iskra Kamil
Raicu Ioan
Wilde Mike
Zhang Zhao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/08/2008
Field of study

We have extended the Falkon lightweight task execution framework to make loosely coupled programming on petascale systems a practical and useful programming model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new-and potentially far larger-class of applications to leverage petascale systems, such as the IBM Blue Gene/P supercomputer. We present the challenges of I/O performance encountered in making this model practical, and show results using both microbenchmarks and real applications from two domains: economic energy modeling and molecular dynamics. Our benchmarks show that we can scale up to 160K processor-cores with high efficiency, and can achieve sustained execution rates of thousands of tasks per second.Comment: IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SuperComputing/SC) 200

arXiv.org e-Print Archive

Crossref

A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures

Author: Fox Geoffrey C.
Jha Shantenu
Luckow Andre
Mantha Pradeep
Qiu Judy
Publication venue
Publication date: 01/01/2014
Field of study

Scientific problems that depend on processing large amounts of data require overcoming challenges in multiple areas: managing large-scale data distribution, co-placement and scheduling of data with compute resources, and storing and transferring large volumes of data. We analyze the ecosystems of the two prominent paradigms for data-intensive applications, hereafter referred to as the high-performance computing and the Apache-Hadoop paradigm. We propose a basis, common terminology and functional factors upon which to analyze the two approaches of both paradigms. We discuss the concept of "Big Data Ogres" and their facets as means of understanding and characterizing the most common application workloads found across the two paradigms. We then discuss the salient features of the two paradigms, and compare and contrast the two approaches. Specifically, we examine common implementation/approaches of these paradigms, shed light upon the reasons for their current "architecture" and discuss some typical workloads that utilize them. In spite of the significant software distinctions, we believe there is architectural similarity. We discuss the potential integration of different implementations, across the different levels and components. Our comparison progresses from a fully qualitative examination of the two paradigms, to a semi-quantitative methodology. We use a simple and broadly used Ogre (K-means clustering), characterize its performance on a range of representative platforms, covering several implementations from both paradigms. Our experiments provide an insight into the relative strengths of the two paradigms. We propose that the set of Ogres will serve as a benchmark to evaluate the two paradigms along different dimensions.Comment: 8 pages, 2 figure

arXiv.org e-Print Archive

CiteSeerX

Crossref

Design and Evaluation of a Collective IO Model for Loosely Coupled Petascale Programming

Author: Espinosa Allan
Foster Ian
Iskra Kamil
Raicu Ioan
Wilde Michael
Zhang Zhao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 31/12/2008
Field of study

Loosely coupled programming is a powerful paradigm for rapidly creating higher-level applications from scientific programs on petascale systems, typically using scripting languages. This paradigm is a form of many-task computing (MTC) which focuses on the passing of data between programs as ordinary files rather than messages. While it has the significant benefits of decoupling producer and consumer and allowing existing application programs to be executed in parallel with no recoding, its typical implementation using shared file systems places a high performance burden on the overall system and on the user who will analyze and consume the downstream data. Previous efforts have achieved great speedups with loosely coupled programs, but have done so with careful manual tuning of all shared file system access. In this work, we evaluate a prototype collective IO model for file-based MTC. The model enables efficient and easy distribution of input data files to computing nodes and gathering of output results from them. It eliminates the need for such manual tuning and makes the programming of large-scale clusters using a loosely coupled model easier. Our approach, inspired by in-memory approaches to collective operations for parallel programming, builds on fast local file systems to provide high-speed local file caches for parallel scripts, uses a broadcast approach to handle distribution of common input data, and uses efficient scatter/gather and caching techniques for input and output. We describe the design of the prototype model, its implementation on the Blue Gene/P supercomputer, and present preliminary measurements of its performance on synthetic benchmarks and on a large-scale molecular dynamics application.Comment: IEEE Many-Task Computing on Grids and Supercomputers (MTAGS08) 200

arXiv.org e-Print Archive

Crossref

A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

Author: Buyya Rajkumar
Ramamohanarao Kotagiri
Venugopal Srikumar
Publication venue
Publication date: 10/06/2005
Field of study

Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also helps to provide an easy way for new practitioners to understand this complex area of research.Comment: 46 pages, 16 figures, Technical Repor

arXiv.org e-Print Archive

CiteSeerX

University of Melbourne Institutional Repository

A development framework for artificial intelligence based distributed operations support systems

Author: Adler Richard M.
Cottman Bruce H.
Publication venue
Publication date
Field of study

Advanced automation is required to reduce costly human operations support requirements for complex space-based and ground control systems. Existing knowledge based technologies have been used successfully to automate individual operations tasks. Considerably less progress has been made in integrating and coordinating multiple operations applications for unified intelligent support systems. To fill this gap, SOCIAL, a tool set for developing Distributed Artificial Intelligence (DAI) systems is being constructed. SOCIAL consists of three primary language based components defining: models of interprocess communication across heterogeneous platforms; models for interprocess coordination, concurrency control, and fault management; and for accessing heterogeneous information resources. DAI applications subsystems, either new or existing, will access these distributed services non-intrusively, via high-level message-based protocols. SOCIAL will reduce the complexity of distributed communications, control, and integration, enabling developers to concentrate on the design and functionality of the target DAI system itself

NASA Technical Reports Server

Querying Large Physics Data Sets Over an Information Grid

Author: Baker Nigel
Brooks Peter
Goff Jean-Marie Le
Kovacs Zsolt
McClatchey Richard
Publication venue
Publication date: 01/01/2001
Field of study

Optimising use of the Web (WWW) for LHC data analysis is a complex problem and illustrates the challenges arising from the integration of and computation across massive amounts of information distributed worldwide. Finding the right piece of information can, at times, be extremely time-consuming, if not impossible. So-called Grids have been proposed to facilitate LHC computing and many groups have embarked on studies of data replication, data migration and networking philosophies. Other aspects such as the role of 'middleware' for Grids are emerging as requiring research. This paper positions the need for appropriate middleware that enables users to resolve physics queries across massive data sets. It identifies the role of meta-data for query resolution and the importance of Information Grids for high-energy physics analysis rather than just Computational or Data Grids. This paper identifies software that is being implemented at CERN to enable the querying of very large collaborating HEP data-sets, initially being employed for the construction of CMS detectors.Comment: 4 pages, 3 figure

arXiv.org e-Print Archive

CiteSeerX

CERN Document Server

Mathematical problems for complex networks

Author: Liang J
Liu Y
Wang Z
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2012
Field of study

Copyright @ 2012 Zidong Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. This article is made available through the Brunel Open Access Publishing Fund.Complex networks do exist in our lives. The brain is a neural network. The global economy is a network of national economies. Computer viruses routinely spread through the Internet. Food-webs, ecosystems, and metabolic pathways can be represented by networks. Energy is distributed through transportation networks in living organisms, man-made infrastructures, and other physical systems. Dynamic behaviors of complex networks, such as stability, periodic oscillation, bifurcation, or even chaos, are ubiquitous in the real world and often reconfigurable. Networks have been studied in the context of dynamical systems in a range of disciplines. However, until recently there has been relatively little work that treats dynamics as a function of network structure, where the states of both the nodes and the edges can change, and the topology of the network itself often evolves in time. Some major problems have not been fully investigated, such as the behavior of stability, synchronization and chaos control for complex networks, as well as their applications in, for example, communication and bioinformatics

Crossref

Directory of Open Access Journals

Brunel University Research Archive