15 research outputs found

    Survey and Analysis of Production Distributed Computing Infrastructures

    This report has two objectives. First, we describe a set of the production distributed infrastructures currently available, so that the reader has a basic understanding of them. This includes explaining why each infrastructure was created and made available and how it has succeeded and failed. The set is not complete, but we believe it is representative. Second, we describe the infrastructures in terms of their use, which is a combination of how they were designed to be used and how users have found ways to use them. Applications are often designed and created with specific infrastructures in mind, with both an appreciation of the existing capabilities provided by those infrastructures and an anticipation of their future capabilities. Here, the infrastructures we discuss were often designed and created with specific applications in mind, or at least specific types of applications. The reader should understand how the interplay between the infrastructure providers and the users leads to such usages, which we call usage modalities. These usage modalities are really abstractions that exist between the infrastructures and the applications; they influence the infrastructures by representing the applications, and they influence the applications by representing the infrastructures.

    Campus Bridging: Software & Software Service Issues Workshop Report

    This report summarizes the discussion at and findings of a workshop on the software and services aspects of cyberinfrastructure as they apply to campus bridging. The workshop took a broad view of software and services, including services in the business sense of the word, such as user support, in addition to information technology services. Specifically, the workshop addressed the following two goals: * Suggest common elements of software stacks widely usable across the nation/world to promote interoperability/economy of scale; and * Suggest policy documents that any research university should have in place. The preparation of this report and related documents was supported by several sources, including: * The National Science Foundation through Grant 0829462 (Bradley C. Wheeler, PI; Geoffrey Brown, Craig A. Stewart, Beth Plale, Dennis Gannon Co-PIs). * Indiana University Pervasive Technology Institute (http://pti.iu.edu/) for funding staff providing logistical support of the task force activities, writing and editorial staff, and layout and production of the final report document. * RENCI (the Renaissance Computing Institute, http://www.renci.org/) supported this workshop and report by generously providing the time and effort of John McGee. * Texas A&M University (http://www.tamu.edu) supported this workshop and report by generously providing the time and effort of Guy Almes.

    An enhanced dynamic replica creation and eviction mechanism in data grid federation environment

    A Data Grid Federation system is an infrastructure that connects several grid systems, facilitating the sharing of large amounts of data as well as storage and computing resources. Existing data replication mechanisms decide which file to replicate based on file access counts and place new replicas on locations that provide minimum read cost. This thesis presents an enhanced data replication strategy known as the Dynamic Replica Creation and Eviction Mechanism (DRCEM), which improves the usage of data grid resources by allocating appropriate replica sites around the federation. In contrast to existing mechanisms, DRCEM decides which file to replicate based on logical dependencies between files and allocates new replicas on locations that provide minimum replica placement cost. The proposed mechanism uses three schemes: 1) Dynamic Replica Evaluation and Creation Scheme, 2) Replica Placement Scheme, and 3) Dynamic Replica Eviction Scheme. DRCEM was evaluated using the OptorSim network simulator based on four performance metrics: 1) Jobs Completion Times, 2) Effective Network Usage, 3) Storage Element Usage, and 4) Computing Element Usage. DRCEM outperforms the ELALW and DRCM mechanisms by 30% and 26%, respectively, in terms of Jobs Completion Times. In addition, DRCEM consumes less storage than ELALW and DRCM, by 42% and 40% respectively. However, DRCEM shows lower performance than existing mechanisms regarding Computing Element Usage, due to the additional computation of files' logical dependencies. Overall, the results revealed better jobs completion times with lower resource consumption than existing approaches. This research produces three replication schemes embodied in one mechanism that enhances the performance of a Data Grid Federation environment: the mechanism can decide to create or evict more than one file at a particular time, and files' logical dependencies are integrated into the replica creation scheme to evaluate data files more accurately.
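
    As an illustration of the kind of policy described above, the Python sketch below computes a file value from its own access count plus the accesses of files that logically depend on it, and evicts the lowest-value replicas when space is needed. The record layout and the value formula are assumptions for illustration only; they are not DRCEM's actual equations.

from dataclasses import dataclass, field


@dataclass
class FileRecord:
    name: str
    size: int                                     # size in MB
    accesses: int                                 # observed access count
    depends_on: set = field(default_factory=set)  # logical dependencies


def file_value(f: FileRecord, catalogue: dict) -> float:
    """Value = own accesses plus accesses of files that logically depend on it."""
    dependent_accesses = sum(
        g.accesses for g in catalogue.values() if f.name in g.depends_on
    )
    return f.accesses + dependent_accesses


def choose_evictions(replicas: list, catalogue: dict, needed_mb: int) -> list:
    """Evict lowest-value replicas until enough space is freed."""
    victims, freed = [], 0
    for r in sorted(replicas, key=lambda f: file_value(f, catalogue)):
        if freed >= needed_mb:
            break
        victims.append(r)
        freed += r.size
    return victims


# Toy catalogue: b.dat logically depends on a.dat, raising a.dat's value.
catalogue = {
    "a.dat": FileRecord("a.dat", 500, accesses=3),
    "b.dat": FileRecord("b.dat", 200, accesses=1, depends_on={"a.dat"}),
    "c.dat": FileRecord("c.dat", 800, accesses=9),
}
site_replicas = [catalogue["a.dat"], catalogue["c.dat"]]
print([f.name for f in choose_evictions(site_replicas, catalogue, needed_mb=400)])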

    QoS Provisioning by Meta-Scheduling in Advance within SLA-Based Grid Environments

    The establishment of agreements between users and the entities which manage Grid resources is still a challenging task. On the one hand, an entity in charge of the communication with users is needed, with the aim of signing resource usage contracts and also implementing renegotiation techniques, among others. On the other hand, mechanisms should be implemented which decide whether the requested QoS can be achieved and, if so, ensure that the agreed QoS is provided. One way of increasing the probability of achieving the agreed QoS is to perform meta-scheduling of jobs in advance, that is, jobs are scheduled some time before they are actually executed. In this way, it becomes more likely that the appropriate resources are available to run the jobs when needed. This paper therefore presents a framework built on top of Globus and the GridWay meta-scheduler to provide QoS by performing meta-scheduling in advance. Thanks to this, the QoS requirements of jobs are met (i.e., jobs are finished within a deadline). Apart from that, the mechanisms needed to manage the communication between the users and the system are presented and implemented through SLA contracts based on the WS-Agreement specification.
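
    The core of meta-scheduling in advance is a feasibility check: accept the SLA only if some resource has a slot that lets the job finish by its deadline, and book that slot immediately. The Python sketch below illustrates the idea under simplified assumptions (a single busy-until time per resource and a known runtime estimate); it is not the actual Globus/GridWay implementation.

def earliest_finish(busy_until: float, now: float, runtime: float) -> float:
    """The job can only start once the resource's current bookings end."""
    start = max(now, busy_until)
    return start + runtime


def accept_sla(resources: dict, now: float, runtime: float, deadline: float):
    """Return (resource, start_time) if the deadline can be met, else None."""
    best = None
    for name, busy_until in resources.items():
        finish = earliest_finish(busy_until, now, runtime)
        if finish <= deadline and (best is None or finish < best[2]):
            best = (name, max(now, busy_until), finish)
    if best is None:
        return None               # reject the agreement: QoS cannot be guaranteed
    name, start, finish = best
    resources[name] = finish      # book the slot in advance
    return name, start


# Two resources, busy until t=10 and t=40; a 25-unit job with deadline t=50.
resources = {"clusterA": 10.0, "clusterB": 40.0}
print(accept_sla(resources, now=0.0, runtime=25.0, deadline=50.0))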

    GWpilot: Enabling multi-level scheduling in distributed infrastructures with GridWay and pilot jobs

    Current systems based on pilot jobs are not exploiting all the scheduling advantages that the technique offers, or they lack compatibility or adaptability. To overcome the limitations and drawbacks of existing approaches, this study presents a different general-purpose pilot system, GWpilot. This system provides individual users or institutions with an easier-to-use, easy-to-install, scalable, extendable, flexible and adjustable framework to efficiently run legacy applications. The framework is based on the GridWay meta-scheduler and incorporates its powerful features, such as standard interfaces, fair-share policies, ranking, migration, accounting and compatibility with diverse infrastructures. GWpilot goes beyond establishing simple network overlays to overcome the waiting times in remote queues or to improve the reliability of task production. It properly tackles the characterisation problem in current infrastructures, allowing users to arbitrarily incorporate customised monitoring of resources and their running applications into the system. This functionality allows the new framework to implement innovative scheduling algorithms that meet the computational needs of a wide range of calculations faster and more efficiently. The system can also be easily stacked under other software layers, such as self-schedulers. The advanced techniques included by default in the framework result in significant performance improvements, even when very short tasks are scheduled.
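
    For readers unfamiliar with pilot jobs, the sketch below illustrates the basic pull model that GWpilot builds on: pilots running on remote resources repeatedly fetch user tasks from a central queue instead of each task waiting in a remote batch queue. The queue, task shape and attribute reporting are illustrative assumptions, not the GWpilot API.

import queue
import threading

# Central user-level task queue that pilots pull from.
task_queue: "queue.Queue[str]" = queue.Queue()
for i in range(6):
    task_queue.put(f"task-{i}")


def pilot(pilot_id: int, attributes: dict) -> None:
    """Once running on a worker node, a pilot repeatedly fetches and runs tasks."""
    while True:
        try:
            task = task_queue.get_nowait()
        except queue.Empty:
            return                          # no more work: the pilot terminates
        # A real pilot would also publish 'attributes' (cores, memory, software)
        # back to the scheduler so tasks can be matched to suitable resources.
        print(f"pilot {pilot_id} at {attributes['site']}: running {task}")
        task_queue.task_done()


# Two pilots, nominally running at two different sites, drain the queue.
threads = [threading.Thread(target=pilot, args=(i, {"site": f"site-{i}"})) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()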

    3rd EGEE User Forum

    We have organized this book as a sequence of chapters, each associated with an application or technical theme and introduced by an overview of its contents and a summary of the main conclusions coming from the Forum on that topic. The first chapter gathers all the plenary session keynote addresses, and following this there is a sequence of chapters covering the application-flavoured sessions. These are followed by chapters with the flavour of Computer Science and Grid Technology. The final chapter covers the many practical demonstrations and posters exhibited at the Forum. Much of the work presented has a direct link to specific areas of Science, and so we have created a Science Index, presented below. In addition, at the end of this book, we provide a complete list of the institutes and countries involved in the User Forum.

    Efficient multilevel scheduling in grids and clouds with dynamic provisioning

    Doctoral thesis, Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadores y Automática, defended on 12-01-2016.
    The consolidation of large Distributed Computing infrastructures has resulted in a High-Throughput Computing platform that is ready for high loads, whose best proponents are the current grid federations. On the other hand, Cloud Computing promises to be more flexible, usable, available and simple than Grid Computing, covering also many more computational needs than the ones required to carry out distributed calculations. In any case, because of the dynamism and heterogeneity present in grids and clouds, calculating the best match between computational tasks and resources in an effectively characterised infrastructure is, by definition, an NP-complete problem, and only sub-optimal solutions (schedules) can be found for these environments. Nevertheless, the characterisation of the resources of both kinds of infrastructures is far from being achieved. The available information systems do not provide accurate data about the status of the resources, which prevents the advanced scheduling required by the different needs of distributed applications. This issue was not solved for grids during the last decade, and the recently established cloud infrastructures exhibit the same problem.
    In this framework, brokers can only improve the throughput of very long calculations, but do not provide estimations of their duration. Complex scheduling has traditionally been tackled by other tools such as workflow managers, self-schedulers and the production management systems of certain research communities. Nevertheless, the low performance achieved by these early-binding methods is noticeable. Moreover, the diversity of cloud providers and, mainly, their lack of standardised programming interfaces and brokering tools to distribute the workload hinder the massive portability of legacy applications to cloud environments...

    DRIVE: A Distributed Economic Meta-Scheduler for the Federation of Grid and Cloud Systems

    The computational landscape is littered with islands of disjoint resource providers, including commercial Clouds, private Clouds, national Grids, institutional Grids, clusters, and data centers. These providers are independent and isolated due to a lack of communication and coordination; they are also often proprietary, without standardised interfaces, protocols, or execution environments. The lack of standardisation and global transparency has the effect of binding consumers to individual providers. With the increasing ubiquity of computation providers there is an opportunity to create federated architectures that span both Grid and Cloud computing providers, effectively creating a global computing infrastructure. In order to realise this vision, secure and scalable mechanisms to coordinate resource access are required. This thesis proposes a generic meta-scheduling architecture to facilitate federated resource allocation in which users can provision resources from a range of heterogeneous (service) providers. Efficient resource allocation is difficult in large-scale distributed environments due to the inherent lack of centralised control. In a Grid model, local resource managers govern access to a pool of resources within a single administrative domain, but they have only a local view of the Grid and are unable to collaborate when allocating jobs. Meta-schedulers act at a higher level and are able to submit jobs to multiple resource managers; however, they are most often deployed on a per-client basis and are therefore concerned only with their own allocations, essentially competing against one another. In a federated environment, the widespread adoption of the utility computing models seen in commercial Cloud providers has re-motivated the need for economically aware meta-schedulers. Economies provide a way to represent the different goals and strategies that exist in a competitive distributed environment. The use of economic allocation principles effectively creates an open service market that provides efficient allocation and incentives for participation. The major contributions of this thesis are the architecture and prototype implementation of the DRIVE meta-scheduler. DRIVE is a Virtual Organisation (VO) based distributed economic meta-scheduler in which members of the VO collaboratively allocate services or resources. Providers joining the VO contribute obligation services to the VO. These contributed services are in effect membership "dues" and are used in the running of the VO's operations, for example allocation, advertising, and general management. DRIVE is independent of any particular class of provider (Service, Grid, or Cloud) and of any specific economic protocol. This independence enables allocation in federated environments composed of heterogeneous providers in vastly different scenarios. Protocol independence facilitates the use of arbitrary protocols based on specific requirements and infrastructural availability. For instance, within a single organisation where internal trust exists, users can achieve maximum allocation performance by choosing a simple economic protocol. In a global utility Grid no such trust exists, but the same meta-scheduler architecture can be used with a secure protocol which ensures that the allocation is carried out fairly in the absence of trust. DRIVE establishes contracts between participants as the result of allocation; a contract describes the individual requirements and obligations of each party.
    A unique two-stage contract negotiation protocol is used to minimise the effect of allocation latency. In addition, due to the cooperative nature of the architecture and the use of secure, privacy-preserving protocols, DRIVE can be deployed in a distributed environment without requiring large-scale dedicated resources. This thesis presents several other contributions related to meta-scheduling and open service markets. To overcome the perceived performance limitations of economic systems, four high-utilisation strategies have been developed and evaluated. Each strategy is shown to improve occupancy, utilisation and profit using synthetic workloads based on a production Grid trace. The gRAVI service wrapping toolkit is presented to address the difficulty of web-enabling existing applications. The gRAVI toolkit has been extended for this thesis such that it creates economically aware (DRIVE-enabled) services that can be transparently traded in a DRIVE market without requiring developer input. The final contribution of this thesis is the definition and architecture of a Social Cloud: a dynamic Cloud computing infrastructure composed of virtualised resources contributed by members of a Social network. The Social Cloud prototype is based on DRIVE and highlights the ease with which dynamic DRIVE markets can be created and used in different domains.
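
    To make the economic allocation idea concrete, the following Python sketch runs a sealed-bid auction in which providers bid for a job and the cheapest feasible bid wins, yielding a simple contract. The bid model and the Vickrey-style payment rule are illustrative assumptions only; DRIVE itself is protocol-independent and supports secure, privacy-preserving alternatives.

from dataclasses import dataclass


@dataclass
class Bid:
    provider: str
    price: float        # cost quoted for running the job
    free_cores: int     # capacity the provider can commit


def run_auction(bids: list, cores_needed: int):
    """Sealed-bid auction: the lowest price among feasible bids wins.

    Payment follows a Vickrey-style rule (the winner pays the second-lowest
    feasible price), which rewards truthful bidding.
    """
    feasible = sorted(
        (b for b in bids if b.free_cores >= cores_needed),
        key=lambda b: b.price,
    )
    if not feasible:
        return None                               # no provider can host the job
    winner = feasible[0]
    payment = feasible[1].price if len(feasible) > 1 else winner.price
    # The returned mapping plays the role of the contract between the parties.
    return {"provider": winner.provider, "payment": payment}


bids = [Bid("grid-site", 5.0, 16), Bid("cloud-A", 3.5, 8), Bid("cloud-B", 4.0, 2)]
print(run_auction(bids, cores_needed=4))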