6 research outputs found
A Prediction-Based Replication Algorithm for Improving Data Availability in Grid Environment
Data replication is a key optimization technique for reducing access latency and managing large volumes of data by placing replicas judiciously. In this paper, we propose a data replication algorithm, called the Prediction-Based Dynamic Replication (PBDR) algorithm, that improves file access time. Because storage capacity is limited, an effective replica replacement strategy is essential. PBDR chooses which files to delete by considering four factors: the predicted number of future requests for the replica, its availability, its size, and the last time it was requested. It also minimizes access latency by selecting the best replica when several sites hold replicas of a dataset. The algorithm is simulated using OptorSim, a data grid simulator developed by the European Data Grid project. The experimental results show that the PBDR strategy performs better than the other algorithms evaluated and prevents unnecessary replica creation, which leads to efficient storage usage.
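The abstract names the four replacement factors but not how PBDR combines them. The sketch below is only an illustration under stated assumptions: it scores each replica with a weighted sum (the weights, the normalisation, and the names `Replica`, `retention_value`, and `choose_victims` are all hypothetical, not taken from the paper) and evicts the lowest-valued replicas until enough space is freed.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    predicted_requests: float  # forecast of future requests (assumed to be given)
    availability: float        # 0..1, assumed meaning: higher is worth keeping
    size_mb: float
    last_access: float         # time of the last request, in seconds


def retention_value(r, now, weights=(1.0, 1.0, 1.0, 1.0)):
    """Hypothetical score: a higher value means the replica is more worth keeping."""
    w_pred, w_avail, w_size, w_recency = weights
    # Recently requested replicas get a recency term close to 1.
    recency = 1.0 / (1.0 + max(0.0, now - r.last_access))
    return (w_pred * r.predicted_requests
            + w_avail * r.availability
            + w_recency * recency
            - w_size * r.size_mb / 1024.0)  # larger replicas cost more to keep


def choose_victims(replicas, needed_mb, now):
    """Evict lowest-valued replicas first until needed_mb has been freed."""
    victims, freed = [], 0.0
    for r in sorted(replicas, key=lambda rep: retention_value(rep, now)):
        if freed >= needed_mb:
            break
        victims.append(r.name)
        freed += r.size_mb
    return victims


store = [
    Replica("a", predicted_requests=10, availability=0.9, size_mb=500, last_access=90),
    Replica("b", predicted_requests=0, availability=0.5, size_mb=800, last_access=10),
    Replica("c", predicted_requests=2, availability=0.9, size_mb=200, last_access=95),
]
print(choose_victims(store, needed_mb=900, now=100))
```

A weighted sum is only one plausible way to combine the factors; a ranking or threshold scheme would fit the abstract's description equally well.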
A distributed platform for the volunteer execution of workflows on a local area network
Thesis submitted in fulfilment of the requirements for the Degree of Master of Science in Computer Science.
Albatroz Engineering has developed a framework for acquiring and analysing overhead power line inspection data, comprising both hardware and software. The framework’s software components include inspection data analysis and reporting tools, commonly known as the PLMI2 application/platform.
In PLMI2, the analysis of over-head power line maintenance inspection data consists
of a sequence of Automatic Tasks (ATs) interleaved with Manual Tasks (MTs). An AT
consists of a set of algorithms that receives one or more datasets as input, processes them, and returns new datasets. In turn, an MT enables human supervisors (also known as line inspection operators) to correct, improve, and validate the results of ATs. ATs run faster than MTs: over the whole work cycle they account for less than 10% of total processing time, but they still take a few minutes. Tasks have data flow dependencies, which can be modelled as a workflow; even if MTs are omitted from this workflow, the sequence of ATs can still be carried out, with the MTs postponed.
In fact, if the computing cost and waiting time are negligible, it may be advantageous
to run ATs earlier in the workflow, prior to validation. To address this opportunity, Albatroz Engineering has invested in a new procedure to stream the data through all ATs
fully unattended.
Considering these scenarios, it could be useful to have a system capable of detecting
the workstations available at a given instant and then distributing the ATs to them.
In this way, operators could schedule the execution of future ATs for one inspection dataset while they perform the MTs of another.
The requirements of the system to be implemented fall within the field of Volunteer Computing
Systems, and we address some of the challenges posed by these kinds of systems,
namely host volatility and failures. Volunteer Computing is a type of distributed
computing which exploits idle CPU cycles from computing resources donated by volunteers and connected through the Internet/Intranet to compute large-scale simulations.
This thesis proposes and designs a new distributed task scheduling system in the context of Volunteer Computing Systems, able to schedule the ATs of PLMI2 and exploit
idle CPU cycles from workstations within the company’s local area network (LAN) to
accelerate the data analysis, being aware of data flow interdependencies.
To evaluate the proposed system, a prototype has been implemented, and the simulation
results show that it is scalable and tolerates failures in task execution
by employing a rescheduling mechanism.
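The combination of data flow dependencies and rescheduling described above can be illustrated with a small simulation. This is a minimal sketch, not the thesis prototype: the function `run_workflow`, the round-robin host choice, the `fails_on` set used to simulate a host leaving mid-task, and the `max_attempts` limit are all assumptions made for illustration.

```python
from collections import deque


def run_workflow(deps, hosts, fails_on=frozenset(), max_attempts=3):
    """Run tasks in dependency order, rescheduling a task when its host fails.

    deps:     task -> set of prerequisite tasks
    hosts:    names of idle workstations (assumed already detected)
    fails_on: (task, host) pairs that simulate a host dropping out mid-task
    Returns the (task, host) pairs in completion order.
    """
    done, log = set(), []
    pending = deque(sorted(deps))
    attempts = {task: 0 for task in deps}
    stalled = 0
    while pending:
        task = pending.popleft()
        if not deps[task] <= done:
            # Prerequisites not finished yet: put the task back in the queue.
            pending.append(task)
            stalled += 1
            if stalled > len(pending):
                raise ValueError("unsatisfiable dependencies")
            continue
        stalled = 0
        if attempts[task] >= max_attempts:
            raise RuntimeError(f"{task} failed {max_attempts} times")
        # Each retry moves on to the next host (simple round-robin assumption).
        host = hosts[attempts[task] % len(hosts)]
        attempts[task] += 1
        if (task, host) in fails_on:
            pending.append(task)  # reschedule after the simulated failure
            continue
        done.add(task)
        log.append((task, host))
    return log


workflow = {"AT1": set(), "AT2": {"AT1"}, "AT3": {"AT1", "AT2"}}
print(run_workflow(workflow, ["ws1", "ws2"], fails_on={("AT2", "ws1")}))
```

In this run AT2 fails on its first host and completes on the second, and AT3 waits until both of its inputs are available.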
Cost-effective resource management for distributed computing
Current distributed computing and resource management infrastructures (e.g., Cluster and Grid) suffer
from a wide variety of problems related to resource management, which include scalability bottleneck,
resource allocation delay, limited quality-of-service (QoS) support, and lack of cost-aware and service
level agreement (SLA) mechanisms.
This thesis addresses these issues by presenting a cost-effective resource management solution
which introduces the possibility of managing geographically distributed resources in resource units that
are under the control of a Virtual Authority (VA). A VA is a collection of resources controlled, but not
necessarily owned, by a group of users or an authority representing a group of users. It leverages the
fact that different resources in disparate locations will have varying usage levels. By creating smaller
divisions of resources called VAs, users would be given the opportunity to choose between a variety of
cost models, and each VA could rent resources from resource providers when necessary, or could potentially
rent out its own resources when underloaded. The resource management is simplified since the
user and owner of a resource recognize only the VA because all permissions and charges are associated
directly with the VA. The VA is controlled by a ’rental’ policy which is supported by a pool of resources
that the system may rent from external resource providers. As far as scheduling is concerned, the VA is
independent from competitors and can instead concentrate on managing its own resources. As a result,
the VA offers scalable resource management with minimal infrastructure and operating costs.
We demonstrate the feasibility of the VA through both a practical implementation of the prototype
system and an illustration of its quantitative advantages through the use of extensive simulations. First,
the VA concept is demonstrated through a practical implementation of the prototype system. Further, we
perform a cost-benefit analysis of current distributed resource infrastructures to demonstrate the potential
cost benefit of such a VA system. We then propose a costing model for evaluating the cost effectiveness
of the VA approach by using an economic approach that captures revenues generated from applications
and expenses incurred from renting resources. Based on our costing methodology, we present rental
policies that can potentially offer effective mechanisms for running distributed and parallel applications
without a heavy upfront investment and without the cost of maintaining idle resources. By using real
workload trace data, we test the effectiveness of our proposed rental approaches.
Finally, we propose an extension to the VA framework that promotes long-term negotiations and
rentals based on service level agreements or long-term contracts. Based on the extended framework,
we present new SLA-aware policies and evaluate them using real workload traces to demonstrate their effectiveness in improving rental decisions.
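The costing idea above, revenue generated by applications minus the expense of rented resources, can be made concrete with a small worked example. This is a sketch under assumptions rather than the thesis's costing model: the functions `nodes_to_rent` and `net_profit`, the flat per-hour prices, and the fixed planning horizon are all hypothetical.

```python
import math


def nodes_to_rent(queued_cpu_hours, owned_nodes, horizon_hours,
                  revenue_per_cpu_hour, rent_per_node_hour):
    """Rent just enough nodes to clear the backlog within the horizon.

    Nothing is rented when an extra node-hour would cost at least as much as it earns.
    """
    if revenue_per_cpu_hour <= rent_per_node_hour:
        return 0
    capacity = owned_nodes * horizon_hours
    shortfall = max(0.0, queued_cpu_hours - capacity)
    return math.ceil(shortfall / horizon_hours)


def net_profit(served_cpu_hours, rented_nodes, horizon_hours,
               revenue_per_cpu_hour, rent_per_node_hour):
    """Revenue from served work minus the cost of rented node-hours."""
    revenue = served_cpu_hours * revenue_per_cpu_hour
    expenses = rented_nodes * horizon_hours * rent_per_node_hour
    return revenue - expenses


rented = nodes_to_rent(100, owned_nodes=4, horizon_hours=10,
                       revenue_per_cpu_hour=2.0, rent_per_node_hour=0.5)
print(rented, net_profit(100, rented, 10, 2.0, 0.5))
```

With a backlog of 100 CPU-hours and 4 owned nodes over a 10-hour horizon, the shortfall is 60 CPU-hours, so 6 nodes are rented and no idle capacity has to be maintained afterwards.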
DRIVE: A Distributed Economic Meta-Scheduler for the Federation of Grid and Cloud Systems
The computational landscape is littered with islands of disjoint resource providers including
commercial Clouds, private Clouds, national Grids, institutional Grids, clusters, and data centers.
These providers are independent and isolated due to a lack of communication and coordination;
they are also often proprietary, without standardised interfaces, protocols, or execution environments.
The lack of standardisation and global transparency has the effect of binding consumers
to individual providers. With the increasing ubiquity of computation providers there is an opportunity
to create federated architectures that span both Grid and Cloud computing providers
effectively creating a global computing infrastructure. In order to realise this vision, secure and
scalable mechanisms to coordinate resource access are required. This thesis proposes a generic
meta-scheduling architecture to facilitate federated resource allocation in which users can provision
resources from a range of heterogeneous (service) providers.
Efficient resource allocation is difficult in large scale distributed environments due to the inherent
lack of centralised control. In a Grid model, local resource managers govern access to a
pool of resources within a single administrative domain but have only a local view of the Grid
and are unable to collaborate when allocating jobs. Meta-schedulers act at a higher level and can
submit jobs to multiple resource managers; however, they are most often deployed on a per-client
basis and are therefore concerned with only their allocations, essentially competing against one
another. In a federated environment the widespread adoption of utility computing models seen in
commercial Cloud providers has re-motivated the need for economically aware meta-schedulers.
Economies provide a way to represent the different goals and strategies that exist in a competitive
distributed environment. The use of economic allocation principles effectively creates an
open service market that provides efficient allocation and incentives for participation.
The major contributions of this thesis are the architecture and prototype implementation of the
DRIVE meta-scheduler. DRIVE is a Virtual Organisation (VO) based distributed economic meta-scheduler
in which members of the VO collaboratively allocate services or resources. Providers
joining the VO contribute obligation services to the VO. These contributed services are in effect
membership “dues” and are used in the running of the VO’s operations – for example allocation,
advertising, and general management. DRIVE is independent from a particular class of provider
(Service, Grid, or Cloud) or specific economic protocol. This independence enables allocation in
federated environments composed of heterogeneous providers in vastly different scenarios. Protocol
independence facilitates the use of arbitrary protocols based on specific requirements and
infrastructural availability. For instance, within a single organisation where internal trust exists,
users can achieve maximum allocation performance by choosing a simple economic protocol.
In a global utility Grid no such trust exists. The same meta-scheduler architecture can be used
with a secure protocol which ensures the allocation is carried out fairly in the absence of trust.
DRIVE establishes contracts between participants as the result of allocation. A contract describes
individual requirements and obligations of each party. A unique two-stage contract negotiation
protocol is used to minimise the effect of allocation latency. In addition, due to the cooperative nature of
the architecture and the use of secure, privacy-preserving protocols, DRIVE can be deployed in a
distributed environment without requiring large scale dedicated resources.
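The abstract does not say which economic protocols DRIVE plugs in. As one example of a simple protocol that could be used where participants trust each other, the sketch below implements a sealed-bid, second-price reverse auction: the provider asking the lowest price wins and is paid the second-lowest ask. Choosing this particular auction, along with the function name `reverse_second_price` and the provider names, is an assumption made for illustration.

```python
def reverse_second_price(bids):
    """Pick a provider by sealed-bid, second-price reverse auction.

    bids: provider name -> asking price for the requested resource.
    Returns (winner, payment); ties on price are broken by provider name.
    """
    if not bids:
        raise ValueError("no bids submitted")
    ranked = sorted(bids.items(), key=lambda item: (item[1], item[0]))
    winner, lowest_ask = ranked[0]
    # With a single bidder there is no second price, so pay the ask itself.
    payment = ranked[1][1] if len(ranked) > 1 else lowest_ask
    return winner, payment


print(reverse_second_price({"cloudA": 5.0, "gridB": 3.0, "clusterC": 4.0}))
```

Because the auctioneer sees every bid in the clear, this form suits the trusted, single-organisation case; the untrusted setting described above would need a protocol that keeps bids private.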
This thesis presents several other contributions related to meta-scheduling and open service
markets. To overcome the perceived performance limitations of economic systems, four high-utilisation
strategies have been developed and evaluated. Each strategy is shown to improve occupancy,
utilisation and profit using synthetic workloads based on a production Grid trace. The
gRAVI service wrapping toolkit is presented to address the difficulty of web-enabling existing applications.
The gRAVI toolkit has been extended for this thesis such that it creates economically
aware (DRIVE-enabled) services that can be transparently traded in a DRIVE market without requiring
developer input. The final contribution of this thesis is the definition and architecture of
a Social Cloud – a dynamic Cloud computing infrastructure composed of virtualised resources
contributed by members of a Social network. The Social Cloud prototype is based on DRIVE
and highlights the ease with which dynamic DRIVE markets can be created and used in different
domains.
Scheduling policies for processor coallocation in multicluster systems
Building multicluster systems out of multiple, geographically distributed clusters interconnected by high-speed wide-area networks can provide access to greater computational power and to a wider range of resources. Jobs running on multiclusters and, more generally, in grids, may require (processor) coallocation, i.e., the simultaneous allocation of resources (processors) in different clusters or subsystems of a grid. In this paper, we propose four scheduling policies for processor coallocation in multiclusters, and we assess their performance in simulations under a wide variety of parameter settings. In particular, our simulations use synthetic workloads and workloads derived from the logs of actual systems and from runtime measurements. We conclude that although coallocation makes scheduling more difficult and wide-area communication critically impacts performance, there is a wide range of realistic applications that may benefit from coallocation. However, unrestricted coallocation is not recommended: limiting the total job size, or the number or sizes of job components, improves performance.
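The abstract does not spell out the four policies, so the following is only a sketch of what a coallocation placement step might look like. It assumes each job component must run in a different cluster, places the largest component in the cluster with the most idle processors, and rejects jobs with more components than a configurable limit. The function `place_components` and the `max_components` parameter are hypothetical.

```python
def place_components(components, idle, max_components=4):
    """Try to start all components of a job at once, one cluster per component.

    components: processor counts requested by each job component
    idle:       cluster name -> currently idle processors
    Returns a list of (size, cluster) pairs, or None if the job cannot start now.
    """
    if len(components) > max_components:
        return None  # restrict coallocation by limiting the number of components
    free = dict(idle)
    placement = []
    for size in sorted(components, reverse=True):
        candidates = [cluster for cluster in free if free[cluster] >= size]
        if not candidates:
            return None  # all components must be allocated simultaneously
        # Choose the cluster with the most idle processors (name breaks ties).
        chosen = max(candidates, key=lambda cluster: (free[cluster], cluster))
        placement.append((size, chosen))
        del free[chosen]  # one component per cluster in this sketch
    return placement


print(place_components([8, 16], {"c1": 20, "c2": 10, "c3": 32}))
```

Matching the largest component to the emptiest cluster finds a simultaneous placement whenever one exists under the one-component-per-cluster assumption; a real scheduler would also have to weigh the wide-area communication cost that the abstract highlights.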