164 research outputs found
Accelerating time to scientific discovery with a grid-enhanced Microsoft Project
The composition, execution, and monitoring of challenging scientific applications is often a complex affair. To cope with the issue of workflow management, several tools and frameworks have been designed and put into use. However, the entry barrier to using these tools productively is high, which may hinder the progress of many scientists and non-experts who develop workflows infrequently. As part of our Cyberaide framework we enable workflow definition, execution, and monitoring through the Microsoft Project software package. The motivation for this choice is that many scientists are already familiar with Microsoft Project, a project management software package that is perceived to be user friendly. Through our tool we have the ability to seamlessly access Grids, such as the NSF-sponsored TeraGrid. Cyberaide's abstractions also have the potential to allow integration with other resources, including Microsoft HPC clusters. We test our hypothesis of usability while evaluating the tool as part of several graduate-level courses taught in the field of Grid and Cloud computing.
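The underlying model here is a workflow expressed as a directed acyclic graph of tasks that a tool executes in dependency order. The sketch below illustrates that general idea only; the function and task names are invented for illustration and are not Cyberaide's actual API.

```python
# Minimal sketch of a DAG workflow executed in dependency order.
# Task names and the run_workflow helper are illustrative, not Cyberaide's API.
from graphlib import TopologicalSorter

def run_workflow(tasks, deps):
    """tasks: name -> callable; deps: name -> set of prerequisite names."""
    order = TopologicalSorter(deps).static_order()  # prerequisites first
    results = {}
    for name in order:
        results[name] = tasks[name]()  # submit/execute each task in turn
    return results

# Example shape: fetch -> (simulate, analyze) -> report
deps = {"fetch": set(), "simulate": {"fetch"}, "analyze": {"fetch"},
        "report": {"simulate", "analyze"}}
tasks = {name: (lambda n=name: f"{n} done") for name in deps}
print(run_workflow(tasks, deps)["report"])  # -> report done
```

A grid-facing tool would replace the local callables with job submissions to remote resources, but the scheduling skeleton stays the same.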
A Cumulus project: design and implementation
Cloud computing is an emerging computing paradigm that aims to provide reliable, customized, and QoS-guaranteed computing infrastructures for users. This paper presents our early experience with Cloud computing, based on the Cumulus project for compute centers. We introduce the Cumulus project in its various aspects, such as design pattern, infrastructure, and middleware.
Cloud Computing: A Perspective Study
Cloud computing is emerging as a new computing paradigm that aims to provide reliable, customized, and QoS-guaranteed dynamic computing environments for end users. In this paper, we study the Cloud computing paradigm from various aspects, such as definitions, distinct features, and enabling technologies. The paper provides an introductory review of Cloud computing and presents the state of the art of Cloud computing technologies.
09131 Abstracts Collection -- Service Level Agreements in Grids
From 22.03.09 to 27.03.09, Dagstuhl Seminar 09131, ``Service Level Agreements in Grids'', was held in Schloss Dagstuhl -- Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. This paper collects abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided where available.
Active Thermochemical Tables: thermochemistry for the 21st century
Active Thermochemical Tables (ATcT) are a good example of a significant breakthrough in chemical science that is directly enabled by the US DOE SciDAC initiative. ATcT is a new paradigm for obtaining accurate, reliable, and internally consistent thermochemistry, overcoming the limitations that are intrinsic to the traditional sequential approach to thermochemistry. The availability of high-quality, consistent thermochemical values is critical in many areas of chemistry, including the development of realistic predictive models of complex chemical environments such as combustion or the atmosphere, and the development and improvement of sophisticated high-fidelity electronic structure computational treatments. As opposed to the traditional sequential evolution of thermochemical values for the chemical species of interest, ATcT utilizes the Thermochemical Network (TN) approach. This approach explicitly exposes the maze of inherent interdependencies normally ignored by the conventional treatment, and allows, inter alia, a statistical analysis of the individual measurements that define the TN. The end result is the extraction of the best possible thermochemistry, based on optimal use of all the currently available knowledge, hence making conventional tabulations of thermochemical values obsolete. Moreover, ATcT offers a number of additional features that are neither present nor possible in the traditional approach. With ATcT, new knowledge can be painlessly propagated through all affected thermochemical values. ATcT also allows hypothesis testing and evaluation, as well as discovery of weak links in the TN. The latter provides pointers to new experimental or theoretical determinations that can most efficiently improve the underlying thermochemical body of knowledge.
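The Thermochemical Network idea can be illustrated in miniature: each measurement constrains a linear combination of species enthalpies (e.g. a reaction enthalpy), and solving all constraints jointly by weighted least squares yields internally consistent values, with large normalized residuals flagging weak links. This toy example uses made-up species and numbers and is not ATcT itself.

```python
import numpy as np

# Toy Thermochemical Network (illustrative only, not ATcT):
# each row of A gives the stoichiometric coefficients of one measurement,
# b the measured value, sigma its uncertainty.
species = ["A", "B", "C"]
A = np.array([[ 1.0,  0.0, 0.0],   # H(A) measured directly
              [-1.0,  1.0, 0.0],   # reaction enthalpy H(B) - H(A)
              [ 0.0, -1.0, 1.0],   # H(C) - H(B)
              [-1.0,  0.0, 1.0]])  # H(C) - H(A): redundant cross-check
b = np.array([10.0, 5.0, 3.0, 8.2])
sigma = np.array([0.1, 0.2, 0.2, 0.3])

# Weighted least squares: minimize sum(((A @ H - b) / sigma)**2)
W = 1.0 / sigma
H, *_ = np.linalg.lstsq(A * W[:, None], b * W, rcond=None)
residuals = A @ H - b
weakest = int(np.argmax(np.abs(residuals / sigma)))  # weakest link in the net
print(dict(zip(species, H.round(3))), "weakest link: measurement", weakest)
```

The redundant fourth measurement is what the sequential approach would ignore; the network solve uses it, and its residual points at where new determinations would help most.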
Whitepaper on Reusable Hybrid and Multi-Cloud Analytics Service Framework
Over the last several years, the computational landscape for conducting data analytics has completely changed. While in the past much of this activity was undertaken in isolation by companies and research institutions, today's infrastructure constitutes a wealth of services offered by a variety of providers, creating opportunities for reuse and interaction while leveraging service collaboration and cooperation. This document focuses on expanding analytics services to develop a framework for reusable hybrid multi-service data analytics. It includes (a) a short technology review that explicitly targets the intersection of hybrid multi-provider analytics services, (b) a brief motivation based on the use cases we examined, (c) an extension of service concepts to show how hybrid as well as multi-provider services can be integrated and reused via the proposed framework, (d) a treatment of analytics service composition, and (e) an integration of container technologies to achieve state-of-the-art analytics service deployment.
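The composition idea in item (d) can be sketched as analytics services chained into a reusable pipeline, where each stage could be backed by a different provider. The stage names below are invented for illustration and do not come from the whitepaper.

```python
# Hedged sketch of analytics service composition: services as composable
# stages forming a pipeline. Stage names are illustrative only.
from functools import reduce

def compose(*stages):
    """Chain stages left to right into a single reusable pipeline."""
    return lambda data: reduce(lambda d, stage: stage(d), stages, data)

ingest    = lambda d: [x for x in d if x is not None]  # e.g. provider A
transform = lambda d: [x * 2 for x in d]               # e.g. provider B
summarize = lambda d: sum(d) / len(d)                  # e.g. a local service

pipeline = compose(ingest, transform, summarize)
print(pipeline([1, 2, None, 3]))  # -> 4.0
```

In a hybrid multi-provider setting each stage would wrap a remote service call, but the composition contract (output of one stage feeds the next) is the same.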
A UNICORE Globus Interoperability Layer
For several years, UNICORE and Globus have co-existed as approaches to exploiting what has become known as the ``Grid''. Both offer many services beneficial for creating and using production Grids. A cooperative approach, providing interoperability between Globus and UNICORE, would result in an advanced set of Grid services that gain strength from each other. This paper outlines some of these parallels and differences as they relate to the development of an interoperability layer between UNICORE and Globus. Given the increasing ubiquity of Globus, what emerges is the desire for a hybridised facility that utilises the UNICORE workflow management of complex, multi-site tasks, but that can run on either UNICORE- or Globus-enabled resources. The technical challenge in achieving this, addressed in this paper, consists of mapping resource descriptions from both grid environments to an abstract format appropriate to workflow preparation, and then instantiating workflow tasks on the target systems. Other issues such as reconciling disparate security models and file transfer support are also addressed.
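The core mapping step can be pictured as translating provider-specific resource descriptions into one abstract format that a workflow planner targets uniformly. The field names below are invented stand-ins, not the actual UNICORE or Globus schemas.

```python
# Illustrative sketch of mapping two grid resource descriptions into one
# abstract format. Field names are hypothetical, not real UNICORE/Globus schemas.
def to_abstract(desc, source):
    if source == "unicore-like":
        return {"host": desc["TargetSystem"], "cpus": desc["Processors"],
                "queue": desc.get("Queue", "default")}
    if source == "globus-like":
        return {"host": desc["contact"], "cpus": desc["count"],
                "queue": desc.get("queue", "default")}
    raise ValueError(f"unknown source: {source}")

u = to_abstract({"TargetSystem": "site-a", "Processors": 16}, "unicore-like")
g = to_abstract({"contact": "site-b", "count": 16}, "globus-like")
assert u["cpus"] == g["cpus"]  # the workflow planner sees a uniform view
```

Once both sides land in the same abstract shape, workflow preparation no longer needs to know which middleware will eventually run each task.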
In-depth Analysis On Parallel Processing Patterns for High-Performance Dataframes
The Data Science domain has expanded monumentally in both research and industry communities during the past decade, predominantly owing to the Big Data revolution. Artificial Intelligence (AI) and Machine Learning (ML) are bringing more complexities to data engineering applications, which are now integrated into data processing pipelines to process terabytes of data. Typically, a significant amount of time is spent on data preprocessing in these pipelines, and hence improving its efficiency directly impacts the overall pipeline performance. The community has recently embraced the concept of Dataframes as the de facto data structure for data representation and manipulation. However, the most widely used serial Dataframes today (R, pandas) experience performance limitations while working on even moderately large data sets. We believe that there is plenty of room for improvement by taking a look at this problem from a high-performance computing point of view. In a prior publication, we presented a set of parallel processing patterns for distributed dataframe operators and the reference runtime implementation, Cylon [1]. In this paper, we expand on the initial concept by introducing a cost model for evaluating the said patterns. Furthermore, we evaluate the performance of Cylon on the ORNL Summit supercomputer.
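A common building block behind such distributed dataframe operators is a hash-based shuffle, and a cost model for it typically combines local compute with communication volume. The sketch below shows that general pattern and a toy cost formula; it is illustrative only and is not Cylon's implementation or the paper's actual model.

```python
# Illustrative hash-shuffle pattern for distributed dataframe operators,
# plus a toy cost model of the form cost = communication + local compute.
# Not Cylon's code; parameter names are invented for this sketch.
def hash_partition(rows, key, n_workers):
    """Route each row to the worker owning its key's hash bucket."""
    parts = [[] for _ in range(n_workers)]
    for row in rows:
        parts[hash(row[key]) % n_workers].append(row)
    return parts

def shuffle_cost(n_rows, row_bytes, n_workers, bw_bytes_s, per_row_s):
    # On average (n_workers - 1) / n_workers of the rows leave their worker.
    comm = n_rows * row_bytes * (n_workers - 1) / n_workers / bw_bytes_s
    compute = n_rows * per_row_s  # local hashing/partitioning work
    return comm + compute

rows = [{"k": i % 4, "v": i} for i in range(8)]
parts = hash_partition(rows, "k", 2)
assert sum(len(p) for p in parts) == len(rows)  # every row lands somewhere
```

After the shuffle, all rows sharing a key live on one worker, so operators like join or groupby can proceed with purely local work; the cost model is what lets one compare such patterns analytically before benchmarking.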
- …