2,030 research outputs found

    A Robust Fault-Tolerant and Scalable Cluster-wide Deduplication for Shared-Nothing Storage Systems

    Full text link
    Deduplication has been largely employed in distributed storage systems to improve space efficiency. Traditional deduplication research ignores the design specifications of shared-nothing distributed storage systems such as no central metadata bottleneck, scalability, and storage rebalancing. Further, deduplication introduces transactional changes, which are prone to errors in the event of a system failure, resulting in inconsistencies in data and deduplication metadata. In this paper, we propose a robust, fault-tolerant and scalable cluster-wide deduplication that can eliminate duplicate copies across the cluster. We design a distributed deduplication metadata shard which guarantees performance scalability while preserving the design constraints of shared- nothing storage systems. The placement of chunks and deduplication metadata is made cluster-wide based on the content fingerprint of chunks. To ensure transactional consistency and garbage identification, we employ a flag-based asynchronous consistency mechanism. We implement the proposed deduplication on Ceph. The evaluation shows high disk-space savings with minimal performance degradation as well as high robustness in the event of sudden server failure.Comment: 6 Pages including reference

    A Taxonomy of Workflow Management Systems for Grid Computing

    Full text link
    With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art in Grid workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure

    Leveraging Secure Multiparty Computation in the Internet of Things

    Full text link
    Centralized systems in the Internet of Things---be it local middleware or cloud-based services---fail to fundamentally address privacy of the collected data. We propose an architecture featuring secure multiparty computation at its core in order to realize data processing systems which already incorporate support for privacy protection in the architecture

    Investigation into Mobile Learning Framework in Cloud Computing Platform

    Get PDF
    Abstract—Cloud computing infrastructure is increasingly used for distributed applications. Mobile learning applications deployed in the cloud are a new research direction. The applications require specific development approaches for effective and reliable communication. This paper proposes an interdisciplinary approach for design and development of mobile applications in the cloud. The approach includes front service toolkit and backend service toolkit. The front service toolkit packages data and sends it to a backend deployed in a cloud computing platform. The backend service toolkit manages rules and workflow, and then transmits required results to the front service toolkit. To further show feasibility of the approach, the paper introduces a case study and shows its performance

    Active Ontology: An Information Integration Approach for Dynamic Information Sources

    Get PDF
    In this paper we describe an ontology-based information integration approach that is suitable for highly dynamic distributed information sources, such as those available in Grid systems. The main challenges addressed are: 1) information changes frequently and information requests have to be answered quickly in order to provide up-to-date information; and 2) the most suitable information sources have to be selected from a set of different distributed ones that can provide the information needed. To deal with the first challenge we use an information cache that works with an update-on-demand policy. To deal with the second we add an information source selection step to the usual architecture used for ontology-based information integration. To illustrate our approach, we have developed an information service that aggregates metadata available in hundreds of information services of the EGEE Grid infrastructure
    corecore