401 research outputs found

    Enhancing reliability with Latin Square redundancy on desktop grids.

    Computational grids are some of the largest computer systems in existence today. Unfortunately they are also, in many cases, the least reliable. This research examines the use of redundancy with permutation as a method of improving reliability in computational grid applications. Three primary avenues are explored: development of a new redundancy model for computational grids, the Replication and Permutation Paradigm (RPP); development of grid simulation software for testing RPP against other redundancy methods; and, finally, running a program on a live grid using RPP. An important part of RPP involves distributing data and tasks across the grid in Latin Square fashion. Two theorems and subsequent proofs regarding Latin Squares are developed. The theorems describe the changing position of symbols between the rows of a standard Latin Square. When a symbol is missing because a column is removed, the theorems provide a basis for determining the next row and column where the missing symbol can be found. Interesting in their own right, the theorems have implications for redundancy. In terms of the redundancy model, the theorems allow one to state the maximum makespan in the face of missing computational hosts when using Latin Square redundancy. The simulator software was developed and used to compare different data and task distribution schemes on a simulated grid. The software clearly showed the advantage of running RPP, which resulted in faster completion times in the face of computational host failures. The Latin Square method also fails gracefully, in that jobs still complete under massive node failure, at the cost of increased makespan. Finally, an Inductive Logic Program (ILP) for pharmacophore search was executed, using a Latin Square redundancy methodology, on a Condor grid in the Dahlem Lab at the University of Louisville Speed School of Engineering. All jobs completed, even in the face of large numbers of randomly generated computational host failures.

    Advances in Grid Computing

    This book approaches grid computing with a perspective on the latest achievements in the field, providing insight into current research trends and advances and presenting a wide range of innovative research papers. The topics covered include resource and data management, grid architectures and development, and grid-enabled applications. New ideas employing heuristic methods from swarm intelligence or genetic algorithms, as well as quantum encryption, are considered in order to explain two main aspects of grid computing: resource management and data management. The book also addresses aspects of grid computing related to architecture and development, and includes a diverse range of applications, including a possible human grid computing system, simulation of the fusion reaction, ubiquitous healthcare service provisioning and complex water systems.

    Towards Peer-to-Peer-based Cryptanalysis

    Modern cryptanalytic algorithms require a large amount of computational power. One approach to coping with this requirement is to distribute these algorithms among many computers and to perform the computation in a massively parallel fashion. However, existing approaches for distributing cryptanalytic algorithms are based on a client/server or a grid architecture. In this paper we propose the use of peer-to-peer (P2P) technology for distributed cryptanalytic calculations. Our contribution is three-fold. First, we identify the challenges resulting from this approach and provide a classification of algorithms suited for P2P-based computation. Second, we discuss and classify some specific cryptanalytic algorithms and their suitability for such an approach. Finally, we provide a new, fully decentralized approach for distributing such computationally intensive jobs. Our design pays special attention to scalability and the possibly untrustworthy nature of the participating peers.

    GREEDY SINGLE USER AND FAIR MULTIPLE USERS REPLICA SELECTION DECISION IN DATA GRID

    Replication in data grids increases data availability, accessibility and reliability. Replicas of datasets are usually distributed to different sites, and the choice of replica location has a significant impact. Replica selection algorithms decide the best replica location based on some criteria. To this end, a family of efficient replica selection systems (RsDGrid) has been proposed. The problem addressed in this thesis is how to select the best replica location, one that takes less time and achieves higher QoS, consistency with users' preferences and almost equal user satisfaction. RsDGrid consists of three systems: A-system, D-system, and M-system. Each has its own scope and specifications. RsDGrid switches among these systems according to the decision maker.

    Asynchronous Teams and Tasks in a Message Passing Environment

    As the discipline of scientific computing grows, so too does the "skills gap" between the increasingly complex scientific applications and the efficient algorithms required. Increasing demand for computational power on the march towards exascale requires innovative approaches. Closing the skills gap avoids the many pitfalls that lead to poor utilisation of resources and wasted investment. This thesis tackles two challenges: asynchronous algorithms for parallel computing and fault tolerance. First, I present a novel asynchronous task invocation methodology for Discontinuous Galerkin codes called enclave tasking. The approach modifies the parallel ordering of tasks in a way that allows for efficient scaling on dynamic meshes up to 756 cores. It ensures high levels of concurrency and intermixes tasks of different computational properties. Critical tasks along domain boundaries are prioritised for an overlap of computation and communication. The second contribution is the teaMPI library, which forms teams of MPI processes that exchange consistency data through an asynchronous "heartbeat". In contrast to previous approaches, teaMPI operates fully asynchronously with reduced overhead. It is also capable of detecting individually slow or failing ranks and inconsistent data among replicas. Finally, I provide an outlook into how asynchronous teams using enclave tasking can be combined into an advanced team-based diffusive load balancing scheme. Both concepts are integrated into and contribute towards the ExaHyPE project, a next-generation code that solves hyperbolic equation systems on dynamically adaptive Cartesian grids.

    Fast spatial inference in the homogeneous Ising model

    The Ising model is important in statistical modeling and inference in many applications; however, its normalizing constant, mean number of active vertices and mean spin interaction are intractable. We provide accurate approximations that make it possible to calculate these quantities numerically. Simulation studies indicate good performance when compared to Markov Chain Monte Carlo methods and at a tiny fraction of the time. The methodology is also used to perform Bayesian inference in a functional Magnetic Resonance Imaging activation detection experiment. Comment: 18 pages, 1 figure, 3 tables

    Ad hoc cloud computing

    Commercial and private cloud providers offer virtualized resources via a set of co-located and dedicated hosts that are exclusively reserved for the purpose of offering a cloud service. While both cloud models appeal to the mass market, there are many cases where outsourcing to a remote platform or procuring an in-house infrastructure may not be ideal or even possible. To offer an attractive alternative, we introduce and develop an ad hoc cloud computing platform to transform spare resource capacity from an infrastructure owner’s locally available, but non-exclusive and unreliable infrastructure, into an overlay cloud platform. The foundation of the ad hoc cloud relies on transferring and instantiating lightweight virtual machines on-demand upon near-optimal hosts while virtual machine checkpoints are distributed in a P2P fashion to other members of the ad hoc cloud. Virtual machines found to be non-operational are restored elsewhere ensuring the continuity of cloud jobs. In this thesis we investigate the feasibility, reliability and performance of ad hoc cloud computing infrastructures. We firstly show that the combination of both volunteer computing and virtualization is the backbone of the ad hoc cloud. We outline the process of virtualizing the volunteer system BOINC to create V-BOINC. V-BOINC distributes virtual machines to volunteer hosts allowing volunteer applications to be executed in the sandbox environment to solve many of the downfalls of BOINC; this however also provides the basis for an ad hoc cloud computing platform to be developed. We detail the challenges of transforming V-BOINC into an ad hoc cloud and outline the transformational process and integrated extensions. These include a BOINC job submission system, cloud job and virtual machine restoration schedulers and a periodic P2P checkpoint distribution component. 
    Furthermore, as current monitoring tools are unable to cope with the dynamic nature of ad hoc clouds, a dynamic infrastructure monitoring and management tool called the Cloudlet Control Monitoring System is developed and presented. We evaluate each of our individual contributions as well as the reliability, performance and overheads associated with an ad hoc cloud deployed on a realistically simulated unreliable infrastructure. We conclude that the ad hoc cloud is not only a feasible concept but also a viable computational alternative that offers high levels of reliability and can at least offer reasonable performance, which at times may exceed the performance of a commercial cloud infrastructure.

    Resource Brokering in Grid Computing

    Grid Computing emerged in academia and evolved towards the basis of what is currently known as Cloud Computing and the Internet of Things (IoT). The vast collection of resources that characterizes a Grid Computing environment is very complex; multiple administrative domains control access and set policies for the shared computing resources. It is a decentralized environment with geographically distributed computing and storage resources, where each computing resource can be modeled as an autonomous computing entity, yet collectively they can work together. This is a class of Cooperative Distributed Systems (CDS). We extend this by applying characteristics of open environments to create a foundation for the next generation of computing platforms, where entities are free to join a computing environment to provide capabilities and take part as a collective in solving complex problems beyond the capability of a single entity. This thesis focuses on modeling “Computing” as the collective performance of individual autonomous fundamental computing elements interconnected in a “Grid” open environment structure. Each computing element is a node in the Grid. All nodes are interconnected through the “Grid” edges. Resource allocation is done at the edges of the “Grid”, where the connected nodes are simply used to perform computation. The analysis put forward in this thesis identifies Grid Computing as a form of computing that occurs at the resource level. The proposed solution, coupled with advancements in technology and the evolution of new computing paradigms, sets a new direction for grid computing research. The approach here is a leap forward, with a well-defined set of requirements and specifications based on open issues and a focus on autonomy, adaptability and interdependency. The proposed approach examines the current model for Grid Protocol Architecture and proposes an extension that addresses the open issues in the diverged set of solutions that have been created.