A Robust Fault-Tolerant and Scalable Cluster-wide Deduplication for Shared-Nothing Storage Systems
Deduplication has been widely employed in distributed storage systems to
improve space efficiency. Traditional deduplication research ignores the design
specifications of shared-nothing distributed storage systems such as no central
metadata bottleneck, scalability, and storage rebalancing. Further,
deduplication introduces transactional changes, which are prone to errors in
the event of a system failure, resulting in inconsistencies in data and
deduplication metadata. In this paper, we propose a robust, fault-tolerant and
scalable cluster-wide deduplication that can eliminate duplicate copies across
the cluster. We design a distributed deduplication metadata shard which
guarantees performance scalability while preserving the design constraints of
shared-nothing storage systems. The placement of chunks and deduplication
metadata is made cluster-wide based on the content fingerprint of chunks. To
ensure transactional consistency and garbage identification, we employ a
flag-based asynchronous consistency mechanism. We implement the proposed
deduplication on Ceph. The evaluation shows high disk-space savings with
minimal performance degradation as well as high robustness in the event of
sudden server failure.
Comment: 6 pages including references
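The fingerprint-based placement described in this abstract can be illustrated with a minimal sketch. It is not the authors' Ceph implementation: the fixed chunk size, the SHA-256 fingerprint, the modulo mapping (standing in for a real placement function), and the `Cluster` class are all assumptions made for illustration. The flag-based consistency mechanism is not modeled.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow (assumption)
NUM_NODES = 3   # hypothetical cluster size


def fingerprint(chunk):
    """Content fingerprint of a chunk (SHA-256 chosen for illustration)."""
    return hashlib.sha256(chunk).hexdigest()


def node_for(fp, num_nodes):
    """Map a fingerprint to a node; plain modulo stands in for a real placement function."""
    return int(fp, 16) % num_nodes


class Cluster:
    """Hypothetical shared-nothing cluster: each node owns one metadata shard."""

    def __init__(self, num_nodes=NUM_NODES):
        # Per-node shard: fingerprint -> [chunk bytes, reference count].
        self.nodes = [dict() for _ in range(num_nodes)]

    def write(self, data):
        """Store data chunk by chunk; return the ordered fingerprints (the object's recipe)."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = fingerprint(chunk)
            shard = self.nodes[node_for(fp, len(self.nodes))]
            if fp in shard:
                shard[fp][1] += 1       # duplicate: only bump the reference count
            else:
                shard[fp] = [chunk, 1]  # first copy: store the chunk itself
            recipe.append(fp)
        return recipe

    def read(self, recipe):
        """Reassemble an object by looking up each fingerprint on its owning node."""
        return b"".join(
            self.nodes[node_for(fp, len(self.nodes))][fp][0] for fp in recipe
        )

    def stored_chunks(self):
        """Number of unique chunks held across all nodes."""
        return sum(len(shard) for shard in self.nodes)
```

Because the owning node is derived from the chunk's content alone, identical chunks written by different clients land on the same shard, so duplicates are detected without a central metadata server.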
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art Grid
workflow systems, but also identifies the areas that need further research.
Comment: 29 pages, 15 figures
Leveraging Secure Multiparty Computation in the Internet of Things
Centralized systems in the Internet of Things, be it local middleware or
cloud-based services, fail to fundamentally address the privacy of the collected
data. We propose an architecture featuring secure multiparty computation at its
core in order to realize data processing systems which already incorporate
support for privacy protection in the architecture.
Investigation into Mobile Learning Framework in Cloud Computing Platform
Cloud computing infrastructure is increasingly
used for distributed applications. Mobile learning
applications deployed in the cloud are a new research
direction. The applications require specific development
approaches for effective and reliable communication. This
paper proposes an interdisciplinary approach for the design and
development of mobile applications in the cloud. The
approach includes a front service toolkit and a backend service
toolkit. The front service toolkit packages data and sends it
to a backend deployed in a cloud computing platform. The
backend service toolkit manages rules and workflow, and
then transmits required results to the front service toolkit.
To demonstrate the feasibility of the approach, the paper
presents a case study and reports its performance.
Active Ontology: An Information Integration Approach for Dynamic Information Sources
In this paper we describe an ontology-based information integration approach that is suitable for highly dynamic distributed information sources, such as those available in Grid systems. The main challenges addressed are: 1) information changes frequently and information requests have to be answered quickly in order to provide up-to-date information; and 2) the most suitable information sources have to be selected from a set of different distributed ones that can provide the information needed. To deal with the first challenge we use an information cache that works with an update-on-demand policy. To deal with the second we add an information source selection step to the usual architecture used for ontology-based information integration. To illustrate our approach, we have developed an information service that aggregates metadata available in hundreds of information services of the EGEE Grid infrastructure.
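One plausible reading of the update-on-demand cache policy is sketched below: an entry is refreshed only when it is requested and found to be older than a maximum age, rather than being refreshed periodically in the background. The abstract does not spell out the policy, so this interpretation, the `OnDemandCache` class, and the `fetch`, `max_age`, and `clock` parameters are all assumptions.

```python
import time


class OnDemandCache:
    """Hypothetical cache that refreshes an entry only when a request finds it stale."""

    def __init__(self, fetch, max_age, clock=time.monotonic):
        self.fetch = fetch      # callable that queries the information source for a key
        self.max_age = max_age  # seconds an entry is considered up to date
        self.clock = clock      # injectable time source, useful for testing
        self.entries = {}       # key -> (value, time the value was fetched)

    def get(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now - entry[1] < self.max_age:
            return entry[0]     # fresh enough: answer from the cache
        value = self.fetch(key)  # missing or stale: update on demand
        self.entries[key] = (value, now)
        return value
```

With this policy, sources that nobody asks about are never polled, while frequently requested entries are answered from the cache until they exceed `max_age`.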