
    Incremental Consistency Guarantees for Replicated Objects

    Programming with replicated objects is difficult. Developers must face the fundamental trade-off between consistency and performance head on, while struggling with the complexity of distributed storage stacks. We introduce Correctables, a novel abstraction that hides most of this complexity, allowing developers to focus on the task of balancing consistency and performance. To aid developers with this task, Correctables provide incremental consistency guarantees, which capture successive refinements on the result of an ongoing operation on a replicated object. In short, applications receive both a preliminary---fast, possibly inconsistent---result, as well as a final---consistent---result that arrives later. We show how to leverage incremental consistency guarantees by speculating on preliminary values, trading throughput and bandwidth for improved latency. We experiment with two popular storage systems (Cassandra and ZooKeeper) and three applications: a Twissandra-based microblogging service, an ad serving system, and a ticket selling system. Our evaluation on the Amazon EC2 platform with YCSB workloads A, B, and C shows that we can reduce the latency of strongly consistent operations by up to 40% (from 100ms to 60ms) at little cost (10% bandwidth increase, 6% throughput drop) in the ad system. Even if the preliminary result is frequently inconsistent (25% of accesses), incremental consistency incurs a bandwidth overhead of only 27%. Comment: 16 total pages, 12 figures. OSDI'16 (to appear)
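    As a rough sketch of the idea (illustrative names only; this is not the paper's Correctables API): a read returns an object that first delivers a fast, possibly stale preliminary value and later a strongly consistent final one, so the application can speculate on the former and reconcile once the latter arrives.

        import threading

        class IncrementalResult:
            """Holds successive refinements of one read: preliminary first, final later."""
            def __init__(self):
                self._prelim_ready = threading.Event()
                self._final_ready = threading.Event()
                self._preliminary = None
                self._final = None

            def set_preliminary(self, value):
                self._preliminary = value
                self._prelim_ready.set()

            def set_final(self, value):
                self._final = value
                self._final_ready.set()

            def preliminary(self, timeout=None):
                """Fast, possibly inconsistent result."""
                self._prelim_ready.wait(timeout)
                return self._preliminary

            def final(self, timeout=None):
                """Consistent result, arrives later."""
                self._final_ready.wait(timeout)
                return self._final

        def incremental_read(key, weak_read, strong_read):
            """Issue a weak and a strong read concurrently and expose both refinements."""
            result = IncrementalResult()
            threading.Thread(target=lambda: result.set_preliminary(weak_read(key)), daemon=True).start()
            threading.Thread(target=lambda: result.set_final(strong_read(key)), daemon=True).start()
            return result

        # Hypothetical usage: act on the preliminary value, then reconcile with the final one.
        # res = incremental_read("ad:42", read_one_replica, read_quorum)
        # show(res.preliminary())              # optimistic, low-latency path
        # if res.final() != res.preliminary():
        #     show(res.final())                # correct the speculation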

    A Robust Fault-Tolerant and Scalable Cluster-wide Deduplication for Shared-Nothing Storage Systems

    Deduplication has been largely employed in distributed storage systems to improve space efficiency. Traditional deduplication research ignores the design specifications of shared-nothing distributed storage systems such as no central metadata bottleneck, scalability, and storage rebalancing. Further, deduplication introduces transactional changes, which are prone to errors in the event of a system failure, resulting in inconsistencies in data and deduplication metadata. In this paper, we propose a robust, fault-tolerant and scalable cluster-wide deduplication that can eliminate duplicate copies across the cluster. We design a distributed deduplication metadata shard which guarantees performance scalability while preserving the design constraints of shared-nothing storage systems. The placement of chunks and deduplication metadata is made cluster-wide based on the content fingerprint of chunks. To ensure transactional consistency and garbage identification, we employ a flag-based asynchronous consistency mechanism. We implement the proposed deduplication on Ceph. The evaluation shows high disk-space savings with minimal performance degradation as well as high robustness in the event of sudden server failure. Comment: 6 pages including references
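    A minimal sketch of fingerprint-based placement, assuming a hypothetical shard interface rather than Ceph's actual code: the chunk's content hash determines both its identity for deduplication and the metadata shard that owns its reference count, so no central metadata server is involved.

        import hashlib

        NUM_SHARDS = 64  # assumed number of deduplication-metadata shards in the cluster

        def fingerprint(chunk):
            """Content fingerprint of a chunk (SHA-256 here; the paper's choice may differ)."""
            return hashlib.sha256(chunk).hexdigest()

        def dedup_shard(fp, num_shards=NUM_SHARDS):
            """Map a fingerprint to the shard holding its deduplication metadata."""
            return int(fp, 16) % num_shards

        def write_chunk(chunk, shards):
            """Deduplicated write: store the chunk only if its fingerprint is new.

            `shards` is a hypothetical list of shard clients exposing has(),
            add_reference() and store_chunk(); the real system's interface differs.
            """
            fp = fingerprint(chunk)
            shard = shards[dedup_shard(fp)]
            if shard.has(fp):                 # duplicate: only bump the reference count
                shard.add_reference(fp)
            else:                             # new content: store chunk, then its metadata;
                shard.store_chunk(fp, chunk)  # a flag-based, asynchronous scheme would order
                shard.add_reference(fp)       # these steps to survive a crash in between
            return fp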

    An approach to rollback recovery of collaborating mobile agents

    Fault-tolerance is one of the main problems that must be resolved to improve the adoption of the agent computing paradigm. In this paper, we analyse the execution model of agent platforms and the significance of the faults affecting their constituent components on the reliable execution of agent-based applications, in order to develop a pragmatic framework for agent systems fault-tolerance. The developed framework deploys a communication-pairs independent checkpointing strategy to offer a low-cost, application-transparent model for reliable agent-based computing that covers all possible faults that might invalidate reliable agent execution, migration and communication, and maintains the exactly-once execution property.
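    The checkpoint/rollback primitive underlying such a framework might look like the following minimal sketch (the path, pickle serialisation and atomic-rename scheme are assumptions; the communication-pairs independent strategy itself is not modelled here):

        import os
        import pickle

        CHECKPOINT_DIR = "/var/agent-checkpoints"   # assumed checkpoint store

        def checkpoint(agent_id, state):
            """Persist the agent's state atomically (write to a temp file, then rename)."""
            path = os.path.join(CHECKPOINT_DIR, f"{agent_id}.ckpt")
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump(state, f)
            os.replace(tmp, path)               # atomic rename: old checkpoint stays valid until now

        def recover(agent_id):
            """Roll back to the last committed checkpoint after a failure, if one exists."""
            path = os.path.join(CHECKPOINT_DIR, f"{agent_id}.ckpt")
            if not os.path.exists(path):
                return None
            with open(path, "rb") as f:
                return pickle.load(f)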

    Automatic test cases generation from software specifications modules

    A new technique is proposed in this paper to extend the Integrated Classification Tree Methodology (ICTM) developed by Chen et al. [13]. This software assists testers in constructing test cases from functional specifications. A Unified Modelling Language (UML) class diagram and the Object Constraint Language (OCL) are used in this paper to represent the software specifications. Each classification and associated class in the software specification is represented by classes and attributes in the class diagram. Software specification relationships are represented by association and hierarchical relationships in the class diagram. To ensure that relationships are consistent, an automatic methodology is proposed to capture and control the class relationships in a systematic way. This helps to reduce duplicate and illegitimate test cases, which improves testing efficiency and reduces the time and cost of testing. The methodology introduced in this paper extracts only the legitimate test cases, removing those that are duplicated or incompatible with the software specifications. Executing all of the test cases would require a large amount of time; therefore, the methodology also selects a best testing path, which guarantees the highest coverage of system units while avoiding the use of all generated test cases, further reducing the time and cost of testing.
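    The generate-and-filter step can be illustrated as follows; the classifications and the OCL-derived constraint are invented for the example and are not taken from the paper:

        from itertools import product

        # Invented example classifications (each would map to a class in the UML diagram).
        classifications = {
            "payment": ["cash", "card"],
            "membership": ["guest", "registered"],
            "ticket_count": ["one", "many"],
        }

        def legitimate(case):
            """Example constraint (stand-in for an OCL rule): guests may not pay by card."""
            return not (case["membership"] == "guest" and case["payment"] == "card")

        def generate_test_cases(classifications, constraint):
            """Cartesian product of the classes, minus duplicates and illegitimate combinations."""
            names = list(classifications)
            seen, cases = set(), []
            for combo in product(*(classifications[n] for n in names)):
                case = dict(zip(names, combo))
                key = tuple(sorted(case.items()))
                if key not in seen and constraint(case):
                    seen.add(key)
                    cases.append(case)
            return cases

        # print(generate_test_cases(classifications, legitimate))  # 6 of the 8 raw combinations remain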

    A pragmatic approach for measuring data quality in primary care databases

    There is currently no widely recognised methodology for undertaking data quality assessment in electronic health records used for research. In an attempt to address this, we have developed a protocol for measuring and monitoring data quality in primary care research databases, whereby practice-based data quality measures are tailored to the intended use of the data. Our approach was informed by an in-depth investigation of aspects of data quality in the Clinical Practice Research Datalink Gold database and presentations of the results to data users. Although based on a primary care database, much of our proposed approach would be equally applicable to other health care databases.

    Design and implementation of a workflow for quality improvement of the metadata of scientific publications

    In this paper, a detailed workflow for analyzing and improving the quality of metadata of scientific publications is presented and tested. The workflow was developed based on approaches from the literature. Frequently occurring types of errors from the literature were compiled and mapped to the data-quality dimensions most relevant for publication data – completeness, correctness, and consistency – and made measurable. Based on the identified data errors, a process for improving data quality was developed. This process includes parsing hidden data, correcting incorrectly formatted attribute values, enriching with external data, carrying out deduplication, and filtering erroneous records. The effectiveness of the workflow was confirmed in an exemplary application to publication data from Open Researcher and Contributor ID (ORCID), with 56% of the identified data errors corrected. The workflow will be applied to publication data from other source systems in the future to further increase its performance.
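    Three of the steps named above (normalising malformed values, de-duplication, and filtering erroneous records) might be sketched as follows; parsing hidden data and external enrichment are omitted, and the field names and rules are illustrative rather than the exact ORCID workflow:

        import re

        def normalise(record):
            """Fix incorrectly formatted attribute values, e.g. reduce a DOI URL to a bare DOI."""
            doi = record.get("doi", "").strip().lower()
            doi = re.sub(r"^https?://(dx\.)?doi\.org/", "", doi)
            return {**record, "doi": doi, "title": record.get("title", "").strip()}

        def is_erroneous(record):
            """Filter records failing basic completeness/correctness checks (illustrative rules)."""
            year = record.get("year")
            return not record["title"] or not (isinstance(year, int) and 1800 <= year <= 2100)

        def clean(records):
            """Normalise, de-duplicate (by DOI, else title) and filter a list of publication records."""
            seen, kept = set(), []
            for rec in map(normalise, records):
                key = rec["doi"] or rec["title"].casefold()
                if key in seen or is_erroneous(rec):
                    continue
                seen.add(key)
                kept.append(rec)
            return kept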