
    High-Performance Cloud Computing: A View of Scientific Applications

    Scientific computing often requires the availability of a massive number of computers for performing large-scale experiments. Traditionally, these needs have been addressed with high-performance computing solutions and installed facilities such as clusters and supercomputers, which are difficult to set up, maintain, and operate. Cloud computing provides scientists with a completely new model for utilizing the computing infrastructure: compute resources, storage resources, and applications can be dynamically provisioned (and integrated within the existing infrastructure) on a pay-per-use basis and released when they are no longer needed. Such services are often offered within the context of a Service Level Agreement (SLA), which ensures the desired Quality of Service (QoS). Aneka, an enterprise Cloud computing solution, harnesses the power of compute resources by relying on private and public Clouds and delivers the desired QoS to users. Its flexible, service-based infrastructure supports multiple programming paradigms, allowing Aneka to address a variety of scenarios, from finance applications to computational science. As examples of scientific computing in the Cloud, we present a preliminary case study on using Aneka for the classification of gene expression data and the execution of an fMRI brain-imaging workflow. (13 pages, 9 figures, conference paper)

    A Framework for Approximate Optimization of BoT Application Deployment in Hybrid Cloud Environment

    We adopt a systematic approach to investigating the efficiency of near-optimal deployment of large-scale, CPU-intensive Bag-of-Tasks applications on cloud resources with non-proportional cost-to-performance ratios. Our analytical solutions apply whether the running time of the given application is known or unknown, and they optimize the user's utility by choosing the most desirable trade-off between makespan and total incurred expense. We propose a scheme that provides a near-optimal deployment of a BoT application with respect to the user's preferences: the user is presented with a set of Pareto-optimal solutions and may select one of the possible scheduling points based on her internal utility function. Our framework also copes with uncertainty in task execution times using two methods. First, an estimation method based on Monte Carlo sampling, called the AA algorithm, predicts the average task running time using the minimum possible number of samples. Second, assuming access to code analysis, code profiling, or estimation tools, a hybrid method evaluates the accuracy of each estimation tool over given time intervals to improve resource allocation decisions. We propose approximate deployment strategies that run on a hybrid cloud. In essence, the proposed strategies first determine either an estimated or an exact optimal schema based on the information provided by the user and on environmental parameters. Then, dynamic methods assign tasks to resources so as to approach the optimal schema as closely as possible, using two approaches: a fast yet simple method based on the First Fit Decreasing algorithm, and a more complex approach based on an approximate solution of the problem transformed into a subset-sum problem. Extensive experimental results obtained on a hybrid cloud platform confirm that our framework can deliver a near-optimal solution respecting the user's utility function.
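
    As a rough illustration of the fast assignment strategy mentioned above, the sketch below applies a First Fit Decreasing heuristic to pack estimated task runtimes onto a fixed set of cloud instances. The task durations, instance count, and capacity threshold are hypothetical placeholders, not values or interfaces from the paper.

# Minimal First Fit Decreasing sketch for assigning BoT tasks to cloud instances.
# Task durations, instance count, and capacity are illustrative placeholders.

def first_fit_decreasing(task_durations, num_instances, capacity):
    """Assign each task to the first instance whose load stays under `capacity`."""
    loads = [0.0] * num_instances          # accumulated runtime per instance
    assignment = {}                        # task index -> instance index
    # Visit tasks in decreasing order of duration, keeping their original indices.
    for task_id in sorted(range(len(task_durations)),
                          key=lambda i: task_durations[i], reverse=True):
        duration = task_durations[task_id]
        for inst in range(num_instances):
            if loads[inst] + duration <= capacity:
                loads[inst] += duration
                assignment[task_id] = inst
                break
        else:
            # No instance fits: fall back to the least-loaded one (best effort).
            inst = min(range(num_instances), key=lambda j: loads[j])
            loads[inst] += duration
            assignment[task_id] = inst
    return assignment, loads

if __name__ == "__main__":
    tasks = [3.0, 7.5, 1.2, 4.8, 2.9, 6.1]   # estimated task runtimes (hypothetical)
    mapping, loads = first_fit_decreasing(tasks, num_instances=3, capacity=10.0)
    print(mapping, loads)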

    Virtual Organization Clusters: Self-Provisioned Clouds on the Grid

    Virtual Organization Clusters (VOCs) provide a novel architecture for overlaying dedicated cluster systems on existing grid infrastructures. VOCs provide customized, homogeneous execution environments on a per-Virtual Organization basis, without the cost of physical cluster construction or the overhead of per-job containers. Administrative access and overlay network capabilities are granted to Virtual Organizations (VOs) that choose to implement VOC technology, while the system remains completely transparent to end users and non-participating VOs. Unlike alternative systems that require explicit leases, VOCs are autonomically self-provisioned according to configurable usage policies. As a grid computing architecture, VOCs are designed to be technology agnostic and are implementable by any combination of software and services that follows the Virtual Organization Cluster Model. As demonstrated through simulation testing and evaluation of an implemented prototype, VOCs are a viable mechanism for increasing end-user job compatibility on grid sites. On existing production grids, where jobs are frequently submitted to a small subset of sites and thus experience high queuing delays relative to average job length, the grid-wide addition of VOCs does not adversely affect mean job sojourn time. By load-balancing jobs among grid sites, VOCs can reduce the total amount of queuing on a grid to a level sufficient to counteract the performance overhead introduced by virtualization.
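
    The autonomic self-provisioning described above can be pictured with a simple watermark-style policy: grow the virtual cluster when the per-VO queue outstrips current capacity and shrink it when nodes sit idle. The thresholds and function below are hypothetical illustrations, not the actual interfaces of the Virtual Organization Cluster Model.

# Hypothetical watermark policy sketch for autonomic VOC provisioning.
def adjust_cluster(queued_jobs, idle_nodes, busy_nodes,
                   jobs_per_node=4, max_nodes=32):
    """Return how many virtual nodes to start (+) or retire (-)."""
    wanted = -(-queued_jobs // jobs_per_node)   # ceiling division: demand in nodes
    current = idle_nodes + busy_nodes
    if wanted > current:
        return min(wanted, max_nodes) - current  # grow toward demand, capped
    if queued_jobs == 0 and idle_nodes > 0:
        return -idle_nodes                       # shrink when the queue is empty
    return 0

# Example: 20 queued jobs, 0 idle and 2 busy nodes -> start 3 more nodes.
print(adjust_cluster(20, 0, 2))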

    Using simple PID-inspired controllers for online resilient resource management of distributed scientific workflows

    Scientific workflows have become mainstream for conducting large-scale scientific research. As a result, many workflow applications and Workflow Management Systems (WMSs) have been developed as part of the cyberinfrastructure to allow scientists to execute their applications seamlessly on a range of distributed platforms. Although the scientific community has addressed this challenge from both theoretical and practical angles, failure prediction, detection, and recovery still raise many research questions. In this paper, we propose an approach, inspired by the control theory developed as part of autonomic computing, to predict failures before they happen and mitigate them when possible. The proposed approach builds on the proportional–integral–derivative (PID) controller loop mechanism, widely used in industrial control systems, in which the controller adjusts its output to mitigate faults. PID controllers aim to detect the possibility of a non-steady state far enough in advance that an action can be taken to prevent it. To demonstrate the feasibility of the approach, we tackle two common execution faults of large-scale data-intensive workflows: data storage overload and memory overflow. We developed a simulator that implements and evaluates simple standalone PID-inspired controllers to autonomously manage the data and memory usage of a data-intensive bioinformatics workflow that consumes/produces over 4.4 TB of data and requires over 24 TB of memory to run all tasks concurrently. Experimental results obtained via simulation indicate that workflow executions may significantly benefit from the controller-inspired approach, in particular under online and unknown conditions. Simulation results show that near-optimal executions (slowdown of 1.01) can be attained with the proposed method, and faults are detected and mitigated well before they occur.
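
    To make the control loop concrete, here is a minimal, generic PID-style controller in Python. The gains, the setpoint, and the idea of interpreting a negative output as "throttle task submissions" are illustrative assumptions, not the controllers or tunings evaluated in the paper.

# Generic PID controller sketch; gains and setpoint are illustrative, not the
# values used for the workflow storage/memory controllers described above.
class PIDController:
    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint          # e.g. target fraction of disk in use
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, measurement, dt):
        """Return a control signal from the current measurement."""
        error = self.setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: react to rising disk usage; a negative signal would mean "slow down".
pid = PIDController(kp=2.0, ki=0.1, kd=0.5, setpoint=0.8)
for usage in (0.70, 0.78, 0.85, 0.92):     # simulated disk-usage samples
    signal = pid.update(usage, dt=1.0)
    print(f"usage={usage:.2f} control={signal:+.3f}")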

    Programming distributed and adaptable autonomous components--the GCM/ProActive framework

    Component-oriented software has become a useful tool for building larger and more complex systems by describing the application in terms of encapsulated, loosely coupled entities called components. At the same time, asynchronous programming patterns allow for the development of efficient distributed applications. While several component models and frameworks have been proposed, most of them tightly integrate the component model with the middleware they run upon. This intertwining is generally implicit and not discussed, leading to entangled, hard-to-maintain code. This article describes our efforts in developing the GCM/ProActive framework for providing distributed and adaptable autonomous components. GCM/ProActive integrates a component model designed for execution on large-scale environments with a programming model based on active objects, allowing a high degree of distribution and concurrency. This integrated model provides a more powerful development, composition, and execution environment than other distributed component frameworks. We illustrate that GCM/ProActive is particularly well suited to the programming of autonomic component systems and to integration into a service-oriented environment.
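
    The active-object style that GCM/ProActive builds on can be approximated as follows: each component serves requests from its own thread of control, and callers immediately receive futures instead of blocking. The sketch below is a generic Python approximation of that pattern using threads and queues; it is not the ProActive API.

# Rough active-object sketch: requests are queued and served by a private
# thread, and callers receive futures. This approximates the programming
# style only, not the actual GCM/ProActive interfaces.
import queue
import threading
from concurrent.futures import Future

class ActiveObject:
    def __init__(self):
        self._requests = queue.Queue()
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:
            func, args, future = self._requests.get()
            try:
                future.set_result(func(*args))
            except Exception as exc:
                future.set_exception(exc)

    def send(self, func, *args):
        """Asynchronously invoke `func`; returns a future immediately."""
        future = Future()
        self._requests.put((func, args, future))
        return future

# Usage: the call returns at once; the result is claimed only when needed.
worker = ActiveObject()
result = worker.send(sum, range(1_000_000))
print(result.result())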

    Scientific Workflow Scheduling for Cloud Computing Environments

    The scheduling of workflow applications consists of assigning their tasks to computing resources to fulfill a final goal such as minimizing total workflow execution time. For this reason, workflow scheduling plays a crucial role in efficiently running experiments. Workflows often have many discrete tasks, and the number of possible task distributions, together with the time required to evaluate each configuration, quickly becomes prohibitively large. A proper solution to the scheduling problem requires the analysis of tasks and resources, the production of an accurate environment model and, most importantly, the adaptation of optimization techniques. This study is a major step toward solving the scheduling problem by not only addressing these issues but also optimizing runtime and reducing monetary cost, two of the most important variables. This study proposes three scheduling algorithms that answer key issues in the scheduling problem. First, it presents BaRRS, a scheduling solution that exploits parallelism and optimizes runtime and monetary cost. Second, it proposes GA-ETI, a scheduler capable of determining the number of resources that a given workflow requires for execution. Finally, it describes PSO-DS, a scheduler based on particle swarm optimization that efficiently schedules large workflows. To test the algorithms, five well-known benchmarks representing different scientific applications are selected. The experiments show that the proposed algorithms substantially improve efficiency, reducing makespan by 11% to 78%. The proposed frameworks open a path toward building a complete system that encompasses the capabilities of a workflow manager, a scheduler, and a cloud resource broker, offering scientists a single tool to run computationally intensive applications.
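
    As a toy illustration of the runtime/cost trade-off such schedulers optimize, the snippet below evaluates a candidate task-to-VM mapping by computing its makespan and a rough hourly-billed cost, then combines them into a single weighted objective. The runtimes, prices, and weighting are hypothetical and do not reproduce the BaRRS, GA-ETI, or PSO-DS formulations.

import math

# Toy evaluation of a candidate schedule: makespan plus monetary cost.
# Runtimes, prices, and the weighting are hypothetical placeholders.
def evaluate(schedule, runtimes, price_per_hour, weight=0.5):
    """schedule[i] = VM index for task i; returns a weighted objective."""
    vm_busy = {}
    for task, vm in enumerate(schedule):
        vm_busy[vm] = vm_busy.get(vm, 0.0) + runtimes[task]    # tasks run serially per VM
    makespan = max(vm_busy.values())
    cost = sum(math.ceil(busy / 3600.0) * price_per_hour[vm]   # hourly billing
               for vm, busy in vm_busy.items())
    return weight * makespan + (1 - weight) * cost

runtimes = [1200, 3400, 800, 2600]          # task runtimes in seconds
prices = [0.10, 0.05]                       # $/hour for two VM types
print(evaluate([0, 1, 0, 1], runtimes, prices))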

    Dispatch: distributed peer-to-peer simulations

    Recently there has been an increasing demand for efficient mechanisms for carrying out computations that exhibit coarse-grained parallelism. Examples of this class of problems include simulations involving Monte Carlo methods, computations where numerous similar but independent tasks are performed to solve a large problem, and any solution relying on ensemble averages, where a simulation is run under a variety of initial conditions whose results are then combined. With the ever-increasing complexity of such applications, large amounts of computational power are required over long periods of time, and economic constraints make it costly to deploy specialized hardware to satisfy this demand. We address this issue in Dispatch, a peer-to-peer framework for sharing computational power. In contrast to grid computing and other institution-based CPU-sharing systems, Dispatch targets an open environment, one that is accessible to all users and does not require any membership or accounts, i.e. any machine connected to the Internet can be part of the framework. Dispatch allows dynamic and decentralized organization of these computational resources. It empowers users to utilize heterogeneous computational resources spread across geographic and administrative boundaries to run their tasks in parallel. As a first step, we address a number of challenging issues involved in designing such distributed systems: forming a decentralized and scalable network of computational resources, finding a sufficient number of idle CPUs in the network for participants, allocating simulation tasks optimally so as to reduce computation time, allowing new participants to join the system and run their tasks irrespective of their geographical location, enabling users to interact with their tasks (pausing, resuming, stopping) in real time, and implementing security features to prevent malicious users from compromising the network and remote machines. As a second step, we evaluate the performance of Dispatch on a large-scale network consisting of 10-130 machines. For one particular simulation, we were able to achieve up to 1500 million iterations per second, compared to 10 million iterations per second on one machine. We also test Dispatch over a wide-area network, where it is deployed on machines that are geographically apart and belong to different domains.
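
    A minimal flavor of the coarse-grained parallelism Dispatch targets is sketched below, using local processes on one machine rather than a peer-to-peer network: independent Monte Carlo batches are farmed out and their results combined into an ensemble estimate. The workload (estimating pi) and the batch sizes are illustrative only.

# Coarse-grained Monte Carlo parallelism, illustrated with local processes
# rather than Dispatch's peer-to-peer network. The workload and batch sizes
# are illustrative placeholders.
import random
from multiprocessing import Pool

def run_batch(num_samples):
    """One independent simulation batch: count points inside the unit circle."""
    hits = 0
    for _ in range(num_samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

if __name__ == "__main__":
    batches = [200_000] * 8                  # independent tasks for the workers
    with Pool() as pool:
        hits = pool.map(run_batch, batches)  # farm the batches out in parallel
    total = sum(batches)
    print("pi ~=", 4.0 * sum(hits) / total)  # ensemble-averaged result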