
    Partitioning workflow applications over federated clouds to meet non-functional requirements

    PhD Thesis. With cloud computing, users can acquire computer resources when they need them on a pay-as-you-go business model. Because of this, many applications are now being deployed in the cloud, and there are many different cloud providers worldwide. Importantly, all these various infrastructure providers offer services with different levels of quality. For example, cloud data centres are governed by the privacy and security policies of the country where the centre is located, while many organisations have created their own internal "private cloud" to meet security needs. With all this variety and uncertainty, application developers who decide to host their system in the cloud face the issue of which cloud to choose to get the best operational conditions in terms of price, reliability and security. The decision becomes even more complicated if their application consists of a number of distributed components, each with slightly different requirements. Rather than trying to identify the single best cloud for an application, this thesis considers an alternative approach: combining different clouds to meet users' non-functional requirements. Cloud federation offers the ability to distribute a single application across two or more clouds, so that the application can benefit from the advantages of each of them. The key challenge for this approach is how to find the distribution (or deployment) of application components that yields the greatest benefits. In this thesis, we tackle this problem and propose a set of algorithms, and a framework, to partition a workflow-based application over federated clouds in order to exploit the strengths of each cloud. The specific goal is to split a distributed application structured as a workflow such that the security and reliability requirements of each component are met, whilst the overall cost of execution is minimised.
To achieve this, we propose and evaluate a cloud broker for partitioning a workflow application over federated clouds. The broker integrates with the e-Science Central cloud platform to automatically deploy a workflow over public and private clouds. We developed a deployment planning algorithm to partition a large workflow application across federated clouds so as to meet security requirements and minimise the monetary cost. A more generic framework is then proposed to model, quantify and guide the partitioning and deployment of workflows over federated clouds. This framework considers the situation where changes in cloud availability (including cloud failure) arise during workflow execution.
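
The partitioning goal stated in this abstract (meet each component's security requirement while minimising overall cost) can be illustrated with a small sketch. This is not the thesis's deployment planning algorithm, which the abstract does not describe. It is a brute-force search under assumed inputs: the task names, security levels, unit costs, and the flat `transfer_cost` per data unit crossing clouds are all hypothetical.

```python
from itertools import product

def cheapest_placement(tasks, edges, clouds, transfer_cost):
    """Exhaustively search task-to-cloud assignments (illustrative only).

    tasks:  {task: {"security": required level, "work": compute units}}
    edges:  [(src_task, dst_task, data_units)] describing the workflow
    clouds: {cloud: {"security": offered level, "unit_cost": price per compute unit}}
    transfer_cost: assumed price per data unit sent between different clouds
    Returns (cost, placement) for the cheapest valid plan, or None if none exists.
    """
    names = list(tasks)
    best = None
    for choice in product(clouds, repeat=len(names)):
        placement = dict(zip(names, choice))
        # Reject plans that put a task on a cloud below its required security level.
        if any(clouds[c]["security"] < tasks[t]["security"]
               for t, c in placement.items()):
            continue
        # Compute cost plus the cost of data that crosses a cloud boundary.
        cost = sum(tasks[t]["work"] * clouds[c]["unit_cost"]
                   for t, c in placement.items())
        cost += sum(transfer_cost * data
                    for src, dst, data in edges
                    if placement[src] != placement[dst])
        if best is None or cost < best[0]:
            best = (cost, placement)
    return best

# Hypothetical three-step workflow over one private and one public cloud.
clouds = {
    "private": {"security": 2, "unit_cost": 5.0},
    "public": {"security": 1, "unit_cost": 1.0},
}
tasks = {
    "ingest": {"security": 2, "work": 1},
    "anonymise": {"security": 2, "work": 2},
    "analyse": {"security": 1, "work": 10},
}
edges = [("ingest", "anonymise", 4), ("anonymise", "analyse", 1)]

best_cost, best_plan = cheapest_placement(tasks, edges, clouds, transfer_cost=0.5)
print(best_cost, best_plan)
```

In this toy instance the two sensitive tasks stay on the private cloud and the compute-heavy, low-sensitivity task moves to the cheaper public cloud. Exhaustive search grows exponentially with the number of tasks, so it only serves to make the objective and constraint concrete.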

    Using Blockchain to support Data & Service Monetization

    Two required features of a data monetization platform are query and retrieval of the metadata of the resources to be monetized. Centralized platforms rely on the maturity of traditional NoSQL database systems to support these features. These databases, for example MongoDB, allow very efficient query and retrieval of the data they store. However, centralized platforms come with a range of security and privacy concerns, which makes them a poor fit for a data monetization platform. On the other hand, most existing decentralized platforms are only partially decentralized. In this research, I developed Cowry, a platform for publishing metadata describing available resources (data or services) and for discovering published metadata, including fast search and filtering. My main contribution is a fully decentralized architecture that combines a blockchain with a traditional distributed database to gain additional features such as efficient query and retrieval of metadata stored on the blockchain.

    Trustworthy Federated Learning: A Survey

    Federated Learning (FL) has emerged as a significant advancement in the field of Artificial Intelligence (AI), enabling collaborative model training across distributed devices while maintaining data privacy. As the importance of FL increases, addressing trustworthiness issues in its various aspects becomes crucial. In this survey, we provide an extensive overview of the current state of Trustworthy FL, exploring existing solutions and well-defined pillars relevant to Trustworthy FL. Despite the growth in literature on trustworthy centralized Machine Learning (ML)/Deep Learning (DL), further efforts are necessary to identify trustworthiness pillars and evaluation metrics specific to FL models, as well as to develop solutions for computing trustworthiness levels. We propose a taxonomy that encompasses three main pillars: Interpretability, Fairness, and Security & Privacy. Each pillar represents a dimension of trust, further broken down into different notions. Our survey covers trustworthiness challenges at every level in FL settings. We present a comprehensive architecture of Trustworthy FL, addressing the fundamental principles underlying the concept, and offer an in-depth analysis of trust assessment mechanisms. In conclusion, we identify key research challenges related to every aspect of Trustworthy FL and suggest future research directions. This comprehensive survey serves as a valuable resource for researchers and practitioners working on the development and implementation of Trustworthy FL systems, contributing to a more secure and reliable AI landscape. Comment: 45 pages, 8 figures, 9 tables

    Architecture for Provenance Systems

    This document covers the logical and process architectures of provenance systems. The logical architecture identifies key roles and their interactions, whereas the process architecture discusses distribution and security. A fundamental aspect of our presentation is its technology-independent nature, which makes it reusable: the principles set out in this document may be applied to different technologies.

    Distributed Management of Grid-based Scientific Workflows

    Grids and service-oriented technologies are emerging as dominant approaches for distributed systems. With the evolution of these technologies, scientific workflows have been introduced as a tool for scientists to assemble highly specialized applications, and to exchange large heterogeneous datasets in order to automate and accelerate the accomplishment of complex scientific tasks. Several Scientific Workflow Management Systems (SWfMS) have already been designed to support the specification, execution, and monitoring of scientific workflows. However, they still face key challenges from two different perspectives: system usability and system efficiency. From the system usability perspective, current SWfMS are not simple enough for scientists who have quite limited IT knowledge. What's more, there is no easy mechanism by which scientists can share and re-use scientific experiments that have already been designed and proved by others. From the perspective of system efficiency, existing SWfMS coordinate and execute workflows in a centralized fashion using a single scheduler and/or workflow enactor. This creates a single point of failure, forms a scalability bottleneck, and enforces centralized fault handling. In addition, they don't consider load balancing while mapping abstract jobs onto several computational nodes. Another important challenge arises from the nature of scientific workflow applications, which need to exchange a huge amount of data during execution. Some available SWfMS use a mediator-based approach for data transfer, where data must first be transferred to a centralized data manager, which is inefficient. Other SWfMS apply a peer-to-peer approach via data references. Even this approach is not sufficient for scientific workflows, as a single complex scientific activity can produce an extensive amount of data.
In this thesis, we introduce the SWIMS (Scientific Workflow Integration and Management System) framework. It employs Web Services technology to build a distributed management system for data-intensive scientific workflows. The purpose of SWIMS is to overcome the previously mentioned challenges through a set of salient features: i) support for distributed execution and management of workflows, ii) reduction of communication traffic, iii) support for smart re-run, iv) distributed fault handling and load balancing, v) ease of use, and vi) extensive sharing of scientific workflows. We discuss the motivation, design, and implementation of the SWIMS framework. Then, we evaluate it through the Montage application from the astronomy domain.

    An Architecture for Provenance Systems

    This document covers the logical and process architectures of provenance systems. The logical architecture identifies key roles and their interactions, whereas the process architecture discusses distribution and security. A fundamental aspect of our presentation is its technology-independent nature, which makes it reusable: the principles set out in this document may be applied to different technologies.

    Managing scientific data with named data networking

    Many scientific domains, such as climate science and High Energy Physics (HEP), have data management requirements that are not well supported by the IP network architecture. Named Data Networking (NDN) is a new network architecture whose service model is better aligned with the needs of data-oriented applications. NDN provides features such as best-location retrieval, caching, load sharing, and transparent failover that would otherwise be painstakingly (re-)implemented by each application using point-to-point semantics in an IP network. We present the first scientific data management application designed and implemented on top of NDN. We use this application to manage climate and HEP data over a dedicated, high-performance testbed. Our application has two main components: a UI for dataset discovery queries and a federation of synchronized name catalogs. We show how NDN primitives can be used to implement common data management operations such as publishing, search, efficient retrieval, and publication access control.