
    Distributed Late-binding Micro-scheduling and Data Caching for Data-Intensive Workflows

    Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadores y Automática, defended on 06-07-2015.

    Today's world is awash in enormous amounts of digital information coming from very diverse sources, and every indication is that this trend will sharpen in the future. Neither industry, nor society at large, nor, very particularly, science remain indifferent to this fact. On the contrary, they strive to get the most out of this information, which means they must capture, transfer, store and process it promptly and efficiently, using a wide range of computational resources. This task is not always simple. A representative example of the challenges posed by the management and processing of large quantities of data is that of the particle physics experiments of the Large Hadron Collider (LHC) in Geneva, which must handle tens of petabytes of data every year. Building on the experience of one of these collaborations, we have studied the main problems concerning the management of massive data volumes and the execution of the vast workflows that need to consume them.

    In this context, we have developed a general-purpose architecture for the scheduling and execution of workflows with significant data requirements, which we have called Task Queue. This new system takes advantage of the agent-based late-binding model that has helped the LHC experiments overcome the problems associated with the heterogeneity and complexity of large grid computing infrastructures. Our proposal offers several improvements over existing systems. The execution agents of the Task Queue architecture share a Distributed Hash Table (DHT) and carry out task assignment cooperatively. This avoids the scalability problems of centralized assignment algorithms and improves execution times. This scalability allows fine-grained micro-scheduling, which in turn enables new functionality, such as a distributed cache on the execution nodes and the use of data location information in task assignment decisions. This improves the efficiency of data processing and helps relieve the usually congested grid storage services. Moreover, our system is more robust against problems in the interaction with the central task queue and behaves better in situations with demanding data access patterns or in the absence of local storage services. All of this has been demonstrated in an extensive series of evaluation tests.

    Since our distributed task scheduling procedure requires broadcast messages, we have also carried out an in-depth study of the possible approaches to implementing this operation on top of the Kademlia DHT, which is used for the shared data cache. Kademlia provides routing to individual nodes but includes no broadcast primitive. Our work exposes the peculiarities of this system, particularly its XOR-based metric, and analytically studies which broadcast techniques can be used with it. We have also developed a model that estimates node coverage as a function of the probability that each individual message reaches its destination correctly. As validation, the algorithms have been implemented and exhaustively evaluated. In addition, we propose several techniques to improve the protocols under adverse conditions, for example when the system exhibits high node churn or the message delivery error rate is not negligible. These techniques include redundancy, resubmission and flooding, as well as combinations of them. We present an analysis of the strengths and weaknesses of the different algorithms and of these complementary techniques.
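
    As an illustration of the kind of coverage model studied here, the following Python sketch (my own idealized construction, not the thesis code) simulates a bucket-style broadcast over a Kademlia-like ID space: the initiator halves its ID range per step, delegating each half with one message, and a node ends up covered only if every delegation above it succeeded, so the per-message delivery probability p compounds with tree depth.

    # Monte Carlo sketch of node coverage for a bucket-based broadcast over a
    # Kademlia-like ID space.  Hypothetical model: each forwarded message is
    # lost independently with probability 1 - p.
    import random

    def broadcast(ids, p, rng):
        """Return the set of ids reached when broadcasting into `ids`."""
        covered = set()

        def recurse(segment, delivered):
            if not segment:
                return
            if delivered:                      # this node got the delegation
                covered.add(segment[0])
            rest = segment[1:]                 # ids this node must still cover
            half = (len(rest) + 1) // 2        # binary split of the sub-range
            for i in range(0, len(rest), half or 1):
                sub = rest[i:i + half]
                # One forwarded message per sub-range; it may be lost.
                recurse(sub, delivered and rng.random() < p)

        recurse(ids, True)
        return covered

    rng = random.Random(42)
    n, p, trials = 1024, 0.95, 200
    mean = sum(len(broadcast(list(range(n)), p, rng)) for _ in range(trials)) / trials
    print(f"mean node coverage at p={p}: {mean / n:.3f}")

    With 1024 nodes the delegation tree is about ten levels deep, so the simulated coverage sits well below the per-message p itself, which is the kind of effect such a coverage model quantifies.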

    Distributed scheduling and data sharing in late-binding overlays

    Pull-based late-binding overlays are used in some of today’s largest computational grids. Job agents are submitted to resources with the duty of retrieving real workload from a central queue at runtime. This helps overcome the problems of these very complex environments, namely heterogeneity, imprecise status information and relatively high failure rates. In addition, the late job assignment allows dynamic adaptation to changes in the grid conditions or user priorities. However, as the scale grows, the central assignment queue may become a bottleneck for the whole system. This article presents a distributed scheduling architecture for late-binding overlays, which addresses these scalability issues. Our system lets execution nodes build a distributed hash table and delegates job matching and assignment to them. This reduces the load on the central server and makes the system much more scalable and robust. Moreover, scalability makes fine-grained scheduling possible, and enables new functionalities like the implementation of a distributed data cache on the execution nodes, which helps alleviate the commonly congested grid storage services.
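
    A minimal sketch of the data-aware matching idea follows; the names and structure are illustrative assumptions, not the article's implementation. Each execution agent ranks the queued tasks by how many of their input files it already holds in its local cache, so assignment is decided at the node rather than at the central server.

    # Illustrative sketch (not the article's code) of pull-based late binding
    # with data-aware matching: the agent pulls the queued task whose inputs
    # best overlap its local disk cache.
    from dataclasses import dataclass, field

    @dataclass
    class Task:
        task_id: str
        inputs: frozenset                          # logical file names the task reads

    @dataclass
    class Agent:
        cache: set = field(default_factory=set)   # files already on local disk

        def pick(self, queue):
            """Late binding: choose work at the node, at run time."""
            if not queue:
                return None
            best = max(queue, key=lambda t: len(t.inputs & self.cache))
            queue.remove(best)
            self.cache |= best.inputs              # staged inputs stay cached
            return best

    queue = [Task("t1", frozenset({"a", "b"})),
             Task("t2", frozenset({"c"})),
             Task("t3", frozenset({"a", "c"}))]
    agent = Agent(cache={"a", "c"})
    print(agent.pick(queue).task_id)               # -> t3: both inputs already cached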

    Data availability in challenging networking environments in presence of failures

    This doctoral thesis presents research on improving data availability in challenging networking environments where failures frequently occur. The thesis discusses data retrieval and transfer mechanisms in challenging networks such as the Grid and delay-tolerant networking (DTN). The Grid concept has gained adoption as a solution to the high-performance computing challenges faced in international research collaborations. Challenging networking is a novel research area in communications. The first part of the thesis introduces the challenges of data availability in environments where resources are scarce. The focus is especially on the challenges faced in the Grid and in challenging networking scenarios. A literature overview is given to explain the most important research findings and the state of standardization work in the field. The experimental part of the thesis consists of eight scientific publications and explains how they contribute to research in the field. The focus is on explaining how data transfer mechanisms have been improved from the application- and network-layer points of view. The experimental methods for the Grid scenarios comprise running a newly developed storage application on the existing research infrastructure. A network simulator is extended for experimentation with challenging networking mechanisms in a network formed by mobile users. The simulator makes it possible to investigate network behavior with a large number of nodes and under conditions that are difficult to re-instantiate. As a result, recommendations are given for data retrieval and transfer design for the Grid and mobile networks. These recommendations can guide both system architects and application developers in their work. In the case of the Grid research, the results give first indications of the applicability of erasure-correcting codes for data storage and retrieval with the existing Grid data storage tools. In the case of challenging networks, the results show how an application-aware communication approach can be used to improve data retrieval and communications. Recommendations are presented to enable efficient transfer and management of data items that are large compared to the available resources.
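
    The availability argument for erasure-correcting codes can be made concrete with a small calculation; the sketch below is my own back-of-the-envelope model assuming independent server failures, not taken from the thesis. With a k-of-n code the object is retrievable when at least k of the n fragments are reachable, so availability is a binomial tail; at the same 3x redundancy, 4-of-12 coding beats 3-way replication.

    # Back-of-the-envelope model (assumptions mine): independent servers, each
    # up with probability p; a k-of-n erasure code needs any k fragments.
    from math import comb

    def availability(n, k, p):
        """P(at least k of n fragments reachable) -- binomial tail."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

    p = 0.90                                   # assumed per-server availability
    print(availability(3, 1, p))               # 3-way replication: ~0.99900
    print(availability(12, 4, p))              # 4-of-12 coding:    ~0.9999998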

    Disk Image Storage, Distribution and Caching for Edge Cloud Infrastructures

    Depto. de Arquitectura de Computadores y Automática, Fac. de Informática. Funded by the Ministerio de Ciencia e Innovación (MICINN) and the Comunidad de Madrid.

    High-Performance Modelling and Simulation for Big Data Applications

    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to afford better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. Seamless interaction between High Performance Computing and Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.

    Prefetching techniques for client server object-oriented database systems

    The performance of many object-oriented database applications suffers from page fetch latency, which is determined by the expense of disk access. In this work we suggest several prefetching techniques to avoid, or at least reduce, page fetch latency. In practice no prediction technique is perfect and no prefetching technique can entirely eliminate the delay due to page fetch latency. We are therefore interested in the trade-off between the level of accuracy required for obtaining good results in terms of elapsed time reduction and the processing overhead needed to achieve that level of accuracy. If prefetching accuracy is high, the total elapsed time of an application can be reduced significantly; if prefetching accuracy is low, many incorrect pages are prefetched and the extra load on the client, network, server and disks degrades whole-system performance. Access patterns of object-oriented databases are often complex and usually hard to predict accurately. The ..
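
    To make the accuracy-versus-overhead trade-off tangible, here is a small, hypothetical first-order Markov prefetcher (an illustration of the trade-off discussed above, not the paper's technique): it records page-to-page transitions and issues a prefetch only when the predicted successor's estimated accuracy clears a confidence threshold.

    # Hypothetical first-order Markov prefetcher: count page-to-page
    # transitions and prefetch the most likely successor only when its
    # estimated accuracy clears a confidence threshold.
    from collections import Counter, defaultdict

    class MarkovPrefetcher:
        def __init__(self, threshold=0.6):
            self.threshold = threshold               # min predicted accuracy
            self.transitions = defaultdict(Counter)  # page -> successor counts
            self.prev = None

        def access(self, page):
            """Record an access; return a page worth prefetching, or None."""
            if self.prev is not None:
                self.transitions[self.prev][page] += 1
            self.prev = page
            followers = self.transitions[page]
            if not followers:
                return None
            candidate, hits = followers.most_common(1)[0]
            if hits / sum(followers.values()) >= self.threshold:
                return candidate               # confident enough to prefetch
            return None                        # too risky: skip the prefetch

    pf = MarkovPrefetcher()
    for page in [1, 2, 3, 1, 2, 3, 1, 2]:
        hint = pf.access(page)
    print(f"prefetch hint after '... 1, 2': {hint}")   # -> 3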

    Quality of Service improvements for real time multimedia applications using next generation network architectures and blockchain in Internet Service Provider cooperative scenario

    Real time communications are becoming part of our daily life, with strict requirements that must be met if end users are to enjoy them in harmony. These requirements are governed by the Quality of Service (QoS) parameters of the users' Internet connections. Achieving a satisfactory QoS level for real time communications depends on parameters that are strongly influenced by the quality of the network connections among the Internet Service Providers (ISPs) located in the path between final users and the Over The Top (OTT) service providers supplying them with real time services. Final users can be business people holding real time videoconferences or adopting cryptocurrencies in their exchanges, videogamers playing online games together with others residing in other countries, migrants talking with their relatives or watching their children grow up in their home countries, people with disabilities adopting assistive technologies, doctors performing remote surgery, or manufacturers adopting augmented reality devices to perform dangerous tasks. Each of them, in their daily activities, requires specific QoS parameters from their ISPs, which nowadays seem unable to provide a satisfactory QoS level for these kinds of real time services.

    Through the adoption of next generation networks, such as Information Centric Networking (ICN), it would be possible to overcome the QoS problems experienced today. By adopting Blockchain technologies, in several use cases, it would be possible to improve the security aspects related to the tamper-resistance of information and to privacy. I started this thesis by analyzing next generation architectures enabling real time multimedia communications. In Software Defined Networking, Named Data Networking and Community Information Centric Networking, I highlighted potential approaches to solve the QoS problems affecting real time multimedia applications. During my experiments I found that applications able to transmit high quality video, such as 4K or 8K, or to interact directly with AR/VR-enabled devices are missing for both ICN approaches.

    I then proposed a REST interface for enforcing a specific QoS parameter, the round trip time (RTT), considering the use case of a game company connected to the same telecommunication company as the final user. Supposing the proposed REST APIs have been deployed at the game company and at the ISP, when one or more users experience lag, the game company can ask the ISP to reduce the RTT for that specific user or group of users. This request is made by calling a method to which the IP address(es) and the maximum desired RTT are passed. I also proposed other methods through which it would be possible to retrieve information about the QoS parameters and, if necessary, trade surplus in one parameter for another. The proposed REST APIs can also be used in more complex scenarios, where the ISPs along the path are chained together, in order to improve the end-to-end QoS between the Over The Top service provider and the final users.

    To store the information exchanged using the proposed REST APIs, I proposed adopting a permissioned blockchain, analyzing the ISP cooperative use case with Hyperledger Fabric, where I proposed the adoption of the Proof of Authority consensus algorithm to increase throughput in terms of transactions per second. In a specific case that I examined, I propose a combination of Information Centric Networking and Blockchain, in an architecture where ISPs exchange valuable information regarding final users to improve their QoS parameters. I also proposed a smart contract for the gaming delay use case, which can be used to govern the communication among the ISPs along the path between the OTT and the final users. This work can be extended by defining billing costs for the QoS improvements.
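
    The shape such a call could take is sketched below; the endpoint path, field names and helper function are assumptions for illustration, not a published API. A game company would POST the affected IP addresses and the desired RTT cap to the ISP.

    # Sketch of the kind of REST call proposed above; the path and field
    # names are assumptions for illustration, not a published API.
    import json
    from urllib import request

    def request_rtt_cap(isp_base_url, ips, max_rtt_ms):
        """Ask the ISP to cap the round trip time for the given users."""
        payload = json.dumps({"ips": ips, "max_rtt_ms": max_rtt_ms}).encode()
        req = request.Request(
            f"{isp_base_url}/qos/rtt",        # hypothetical resource path
            data=payload,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with request.urlopen(req) as resp:    # raises on HTTP error status
            return json.load(resp)

    # Example (against a hypothetical endpoint):
    # request_rtt_cap("https://api.isp.example", ["203.0.113.7"], 40)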