Dispatch: distributed peer-to-peer simulations
Recently there has been an increasing demand for efficient mechanisms for carrying out computations that exhibit coarse-grained parallelism. Examples of this class of problems include simulations involving Monte Carlo methods, computations where numerous similar but independent tasks are performed to solve a large problem, and any solution that relies on ensemble averages, where a simulation is run under a variety of initial conditions and the runs are then combined to form the result. With the ever-increasing complexity of such applications, large amounts of computational power are required over long periods of time. Economic constraints entail deploying specialized hardware to satisfy this ever-increasing demand for computing power.
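The kind of workload described above can be illustrated with a minimal sketch, which is not taken from Dispatch itself. It assumes a toy Monte Carlo estimate of pi; the names `run_task` and `estimate_pi` are hypothetical, and a local thread pool stands in for the remote machines that a peer-to-peer framework would use. Each task runs from its own seed, shares no state with the others, and the results are combined only at the end.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_task(seed, samples):
    """One independent task: count random points inside the unit quarter circle."""
    rng = random.Random(seed)  # private generator, so tasks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(seeds, samples_per_task, workers=4):
    """Run one task per seed and combine the counts into a single estimate."""
    seeds = list(seeds)
    # Assumption: a thread pool is only a local stand-in for remote peers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = list(pool.map(lambda s: run_task(s, samples_per_task), seeds))
    total_samples = samples_per_task * len(seeds)
    return 4.0 * sum(hits) / total_samples


if __name__ == "__main__":
    print(estimate_pi(seeds=range(8), samples_per_task=50_000))
```

Because the tasks only meet at the final summation, they can be handed to any number of workers in any order without changing the combined result, which is what makes this class of problem a natural fit for loosely coupled machines.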
We address this issue in Dispatch, a peer-to-peer framework for sharing computational power. In contrast to grid computing and other institution-based CPU-sharing systems, Dispatch targets an open environment, one that is accessible to all users and does not require any sort of membership or accounts; that is, any machine connected to the Internet can be part of the framework. Dispatch allows dynamic and decentralized organization of these computational resources. It enables users to utilize heterogeneous computational resources spread across geographic and administrative boundaries to run their tasks in parallel.
As a first step, we address a number of challenging issues involved in designing such distributed systems. These include forming a decentralized and scalable network of computational resources; finding a sufficient number of idle CPUs in the network for participants; allocating simulation tasks in an optimal manner so as to reduce computation time; allowing new participants to join the system and run their tasks irrespective of their geographical location; enabling users to interact with their tasks (pausing, resuming, stopping) in real time; and implementing security features that prevent malicious users from compromising the network and remote machines.
As a second step, we evaluate the performance of Dispatch on a large-scale network consisting of 10 to 130 machines. For one particular simulation, we were able to achieve up to 1500 million iterations per second, compared to 10 million iterations per second on one machine. We also test Dispatch over a wide-area network, where it is deployed on machines that are geographically distant and belong to different domains.
Resource discovery for distributed computing systems: A comprehensive survey
Large-scale distributed computing environments provide a vast amount of heterogeneous computing resources from different sources for resource sharing and distributed computing. Discovering appropriate resources in such environments is a challenge that involves several different subjects. In this paper, we investigate the current state of resource discovery protocols, mechanisms, and platforms for large-scale distributed environments, focusing on the design aspects. We classify all related aspects, general steps, and requirements for constructing a novel resource discovery solution into three categories: structures, methods, and issues. Accordingly, we review the literature, analyzing various aspects for each category.
On the construction of decentralised service-oriented orchestration systems
Modern science relies on workflow technology to capture, process, and analyse data obtained from scientific instruments. Scientific workflows are precise descriptions of experiments in which multiple computational tasks are coordinated based on the dataflows between them. Orchestrating scientific workflows presents a significant research challenge: they are typically executed in a manner such that all data pass through a centralised computer server known as the engine, which causes unnecessary network traffic that leads to a performance bottleneck. These workflows are commonly composed of services that perform computation over geographically distributed resources, and involve the management of dataflows between them. Centralised orchestration is clearly not a scalable approach for coordinating services dispersed across distant geographical locations. This thesis presents a scalable decentralised service-oriented orchestration system that relies on a high-level data coordination language for the specification and execution of workflows. This system’s architecture consists of distributed engines, each of which is responsible for executing part of the overall workflow. It exploits parallelism in the workflow by decomposing it into smaller sub-workflows, and determines the most appropriate engines to execute them using computation placement analysis. This permits the workflow logic to be distributed closer to the services providing the data for execution, which reduces the overall data transfer in the workflow and improves its execution time. This thesis provides an evaluation of the presented system which concludes that decentralised orchestration provides scalability benefits over centralised orchestration, and improves the overall performance of executing a service-oriented workflow.