9 research outputs found

    Interoperating Grid Infrastructures with the GridWay Metascheduler

    This paper describes the GridWay Metascheduler and exposes its latest and future developments, mainly related to interoperability and interoperation. GridWay enables large-scale, reliable and efficient sharing of computing resources over grid middleware. To favor interoperability, it shows a modular architecture based on drivers, which access middleware services for resource discovery and monitoring, job execution and management, and file transfer. This paper presents two new execution drivers for BES and CREAM services, and introduces a remote BES interface for GridWay. This interface allows users to access GridWay’s job metascheduling capabilities, using the BES implementation of GridSAM. Thus, GridWay now provides to end-users more possibilities of interoperability and interoperation

    GWpilot: Enabling multi-level scheduling in distributed infrastructures with GridWay and pilot jobs

    Current systems based on pilot jobs are not exploiting all the scheduling advantages that the technique offers, or they lack compatibility or adaptability. To overcome the limitations or drawbacks in existing approaches, this study presents a different general-purpose pilot system, GWpilot. This system provides individual users or institutions with a more easy-to-use, easy-to-install, scalable, extendable, flexible and adjustable framework to efficiently run legacy applications. The framework is based on the GridWay meta-scheduler and incorporates the powerful features of this system, such as standard interfaces, fair-share policies, ranking, migration, accounting and compatibility with diverse infrastructures. GWpilot goes beyond establishing simple network overlays to overcome the waiting times in remote queues or to improve the reliability in task production. It properly tackles the characterisation problem in current infrastructures, allowing users to arbitrarily incorporate customised monitoring of resources and their running applications into the system. This functionality allows the new framework to implement innovative scheduling algorithms that accomplish the computational needs of a wide range of calculations faster and more efficiently. The system can also be easily stacked under other software layers, such as self-schedulers. The advanced techniques included by default in the framework result in significant performance improvements even when very short tasks are scheduled

    Resource Brokering in Grid Computing

    Grid Computing emerged in academia and evolved towards the bases of what is currently known as Cloud Computing and the Internet of Things (IoT). The vast collection of resources that makes up a Grid Computing environment is very complex; multiple administrative domains control access and set policies to the shared computing resources. It is a decentralized environment with geographically distributed computing and storage resources, where each computing resource can be modeled as an autonomous computing entity, yet collectively they can work together. This is a class of Cooperative Distributed Systems (CDS). We extend this by applying characteristics of open environments to create a foundation for the next generation of computing platform, where entities are free to join a computing environment to provide capabilities and take part as a collective in solving complex problems beyond the capability of a single entity. This thesis is focused on modeling “Computing” as a collective performance of individual autonomous fundamental computing elements interconnected in a “Grid” open environment structure. Each computing element is a node in the Grid. All nodes are interconnected through the “Grid” edges. Resource allocation is done at the edges of the “Grid”, where the connected nodes are simply used to perform computation. The analysis put forward in this thesis identifies Grid Computing as a form of computing that occurs at the resource level. The proposed solution, coupled with advancements in technology and the evolution of new computing paradigms, sets a new direction for grid computing research. The approach here is a leap forward, with a well-defined set of requirements and specifications based on open issues and a focus on autonomy, adaptability and interdependency. The proposed approach examines the current model for Grid Protocol Architecture and proposes an extension that addresses the open issues in the diverged set of solutions that have been created

    Efficient multilevel scheduling in grids and clouds with dynamic provisioning

    Thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadores y Automática, defended on 12-01-2016. The consolidation of large Distributed Computing infrastructures has resulted in a High-Throughput Computing platform that is ready for high loads, whose best proponents are the current grid federations. On the other hand, Cloud Computing promises to be more flexible, usable, available and simple than Grid Computing, also covering many more computational needs than the ones required to carry out distributed calculations. In any case, because of the dynamism and heterogeneity that are present in grids and clouds, calculating the best match between computational tasks and resources in an effectively characterised infrastructure is, by definition, an NP-complete problem, and only sub-optimal solutions (schedules) can be found for these environments. Nevertheless, the characterisation of the resources of both kinds of infrastructures is far from being achieved. The available information systems do not provide accurate data about the status of the resources that can allow the advanced scheduling required by the different needs of distributed applications. The issue was not solved during the last decade for grids, and the cloud infrastructures recently established have the same problem. In this framework, brokers can only improve the throughput of very long calculations, but do not provide estimations of their duration. Complex scheduling was traditionally tackled by other tools such as workflow managers, self-schedulers and the production management systems of certain research communities. Nevertheless, the low performance achieved by these early-binding methods is noticeable. 
Moreover, the diversity of cloud providers and, mainly, their lack of standardised programming interfaces and brokering tools to distribute the workload hinder the massive portability of legacy applications to cloud environments...

    DRIVE: A Distributed Economic Meta-Scheduler for the Federation of Grid and Cloud Systems

    The computational landscape is littered with islands of disjoint resource providers including commercial Clouds, private Clouds, national Grids, institutional Grids, clusters, and data centers. These providers are independent and isolated due to a lack of communication and coordination; they are also often proprietary, without standardised interfaces, protocols, or execution environments. The lack of standardisation and global transparency has the effect of binding consumers to individual providers. With the increasing ubiquity of computation providers there is an opportunity to create federated architectures that span both Grid and Cloud computing providers, effectively creating a global computing infrastructure. In order to realise this vision, secure and scalable mechanisms to coordinate resource access are required. This thesis proposes a generic meta-scheduling architecture to facilitate federated resource allocation in which users can provision resources from a range of heterogeneous (service) providers. Efficient resource allocation is difficult in large-scale distributed environments due to the inherent lack of centralised control. In a Grid model, local resource managers govern access to a pool of resources within a single administrative domain but have only a local view of the Grid and are unable to collaborate when allocating jobs. Meta-schedulers act at a higher level and are able to submit jobs to multiple resource managers; however, they are most often deployed on a per-client basis and are therefore concerned with only their own allocations, essentially competing against one another. In a federated environment the widespread adoption of utility computing models seen in commercial Cloud providers has re-motivated the need for economically aware meta-schedulers. Economies provide a way to represent the different goals and strategies that exist in a competitive distributed environment. 
The use of economic allocation principles effectively creates an open service market that provides efficient allocation and incentives for participation. The major contributions of this thesis are the architecture and prototype implementation of the DRIVE meta-scheduler. DRIVE is a Virtual Organisation (VO) based distributed economic meta-scheduler in which members of the VO collaboratively allocate services or resources. Providers joining the VO contribute obligation services to the VO. These contributed services are in effect membership “dues” and are used in the running of the VO's operations – for example allocation, advertising, and general management. DRIVE is independent from a particular class of provider (Service, Grid, or Cloud) or specific economic protocol. This independence enables allocation in federated environments composed of heterogeneous providers in vastly different scenarios. Protocol independence facilitates the use of arbitrary protocols based on specific requirements and infrastructural availability. For instance, within a single organisation where internal trust exists, users can achieve maximum allocation performance by choosing a simple economic protocol. In a global utility Grid no such trust exists. The same meta-scheduler architecture can be used with a secure protocol which ensures the allocation is carried out fairly in the absence of trust. DRIVE establishes contracts between participants as the result of allocation. A contract describes individual requirements and obligations of each party. A unique two-stage contract negotiation protocol is used to minimise the effect of allocation latency. In addition, due to the co-op nature of the architecture and the use of secure privacy-preserving protocols, DRIVE can be deployed in a distributed environment without requiring large-scale dedicated resources. This thesis presents several other contributions related to meta-scheduling and open service markets. 
To overcome the perceived performance limitations of economic systems, four high utilisation strategies have been developed and evaluated. Each strategy is shown to improve occupancy, utilisation and profit using synthetic workloads based on a production Grid trace. The gRAVI service wrapping toolkit is presented to address the difficulty of web-enabling existing applications. The gRAVI toolkit has been extended for this thesis such that it creates economically aware (DRIVE-enabled) services that can be transparently traded in a DRIVE market without requiring developer input. The final contribution of this thesis is the definition and architecture of a Social Cloud – a dynamic Cloud computing infrastructure composed of virtualised resources contributed by members of a Social network. The Social Cloud prototype is based on DRIVE and highlights the ease with which dynamic DRIVE markets can be created and used in different domains

    3rd EGEE User Forum

    We have organized this book in a sequence of chapters, each chapter associated with an application or technical theme introduced by an overview of the contents, and a summary of the main conclusions coming from the Forum for the chapter topic. The first chapter gathers all the plenary session keynote addresses, and following this there is a sequence of chapters covering the application flavoured sessions. These are followed by chapters with the flavour of Computer Science and Grid Technology. The final chapter covers the important number of practical demonstrations and posters exhibited at the Forum. Much of the work presented has a direct link to specific areas of Science, and so we have created a Science Index, presented below. In addition, at the end of this book, we provide a complete list of the institutes and countries involved in the User Forum

    Optimización de la factorización de matrices no negativas en Bioinformática

    In recent years, the scientific community's interest in Non-negative Matrix Factorization (NMF) has grown. This method transforms a high-dimensional data set into a small collection of elements that carry their own meaning in the context of the analysis. In Bioinformatics, NMF is commonly used as the basis of some data clustering methods, which employ a statistical model to determine the most favourable number of classes. This model requires a large number of NMF runs with different input parameters, which represents an enormous computational workload. Most NMF implementations have become obsolete in the face of the constant growth of the data that the scientific community seeks to analyse, either because computing times grow until they become unfeasible, or because the size of the data exceeds the system's resources. For this reason, this doctoral thesis focuses on the optimisation and parallelisation of NMF, not only at the theoretical level, but with the aim of providing the scientific community with a new tool for the analysis of data of biological origin. NMF exposes a high degree of data-level parallelism, of variable granularity, while the clustering methods mentioned above exhibit parallelism at the computation level, since the various NMF instances that are run are independent. Therefore, from a global point of view, a layered optimisation model is proposed in which different high-performance technologies are employed..