24,110 research outputs found

Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency

Author: Lu Youyou
Mutlu Onur
Shu Jiwu
Sun Long
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/05/2017

Persistent memory provides high-performance data persistence at main memory. Memory writes need to be performed in strict order to satisfy storage consistency requirements and enable correct recovery from system crashes. Unfortunately, adhering to such a strict order significantly degrades system performance and persistent memory endurance. This paper introduces a new mechanism, Loose-Ordering Consistency (LOC), that satisfies the ordering requirements at significantly lower performance and endurance loss. LOC consists of two key techniques. First, Eager Commit eliminates the need to perform a persistent commit record write within a transaction. We do so by ensuring that we can determine the status of all committed transactions during recovery by storing necessary metadata information statically with blocks of data written to memory. Second, Speculative Persistence relaxes the write ordering between transactions by allowing writes to be speculatively written to persistent memory. A speculative write is made visible to software only after its associated transaction commits. To enable this, our mechanism supports the tracking of committed transaction ID and multi-versioning in the CPU cache. Our evaluations show that LOC reduces the average performance overhead of memory persistence from 66.9% to 34.9% and the memory write traffic overhead from 17.1% to 3.4% on a variety of workloads. Comment: This paper has been accepted by IEEE Transactions on Parallel and Distributed System
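
The recovery idea behind Eager Commit (deciding commit status from metadata stored with the data blocks, not from a separate commit record) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `LogBlock` layout, the `total_blocks` field, and the rule "a transaction counts as committed once all of its blocks are found" are hypothetical and are not taken from the abstract.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogBlock:
    """Hypothetical persistent log block; field names are assumptions."""
    tx_id: int          # transaction this block belongs to
    total_blocks: int   # metadata stored with the block: blocks in the transaction
    payload: bytes


def committed_transactions(log):
    """Return the IDs of transactions whose blocks all reached the log.

    No commit record is read: the status is derived purely from the
    per-block metadata, which is the point the sketch tries to show.
    """
    seen = defaultdict(int)   # tx_id -> number of blocks found in the log
    expected = {}             # tx_id -> number of blocks the transaction wrote
    for block in log:
        seen[block.tx_id] += 1
        expected[block.tx_id] = block.total_blocks
    return {tx for tx, count in seen.items() if count == expected[tx]}


# Transaction 1 wrote both of its blocks; transaction 2 crashed after 1 of 3.
log = [
    LogBlock(1, 2, b"a"),
    LogBlock(1, 2, b"b"),
    LogBlock(2, 3, b"c"),
]
recovered = committed_transactions(log)
```

In this toy log only transaction 1 is treated as committed; the blocks of transaction 2 would be discarded during recovery.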

arXiv.org e-Print Archive

Crossref

Pedestal and Er profile evolution during an edge localized mode cycle at ASDEX Upgrade

Author: Burckhart A.
Cavedon M.
Dunne M.G.
Fischer R.
Laggner F.M.
Lebschy A.
Mink F.
Pütterich T.
Stroth U.
Viezzer Eleonora
Willensdorfer M.
Wolfrum E.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2017

The upgrade of the edge charge exchange recombination spectroscopy diagnostic at ASDEX Upgrade has enabled highly spatially resolved measurements of the impurity ion dynamics during an edge-localized mode cycle (ELM) with unprecedented temporal resolution, i.e. 65 μs. The increase of transport during an ELM induces a relaxation of the ion, electron edge gradients in impurity density and flows. Detailed characterization of the recovery of the edge temperature gradients reveals a difference in the ion and electron channel: the maximum ion temperature gradient Ti is re-established on similar timescales as ne, which is faster than the recovery of Te. After the clamping of the maximum gradient, Ti and Te at the pedestal top continue to rise up to the next ELM while ne stays constant, which means that the temperature pedestal and the resulting pedestal pressure widen until the next ELM. The edge radial electric field Er at the ELM crash is found to reduce to typical L-mode values and its maximum recovers to its pre-ELM conditions on a similar time scale as for ne and Ti. Within the uncertainties, the measurements of Er align with their neoclassical predictions Er,neo for most of the ELM cycle, thus indicating that Er is dominated by collisional processes. However, between 2 and 4 ms after the ELM crash, other contributions to E×B flow, e.g. zonal flows or ion orbit effects, could not be excluded within the uncertainties. European Commission (EUROfusion 633053

MPG.PuRe

idUS. Depósito de Investigación Universidad de Sevilla

Flexible Rollback Recovery in Dynamic Heterogeneous Grid Computing

Author: Axel Krings
Samir Jafar
Thierry Gautier

Abstract: Large applications executing on Grid or cluster architectures consisting of hundreds or thousands of computational nodes create problems with respect to reliability. The sources of the problems are node failures and the need for dynamic configuration over extensive runtime. This paper presents two fault-tolerance mechanisms called Theft-Induced Checkpointing and Systematic Event Logging. These are transparent protocols capable of overcoming problems associated with both benign faults, i.e., crash faults, and node or subnet volatility. Specifically, the protocols base the state of the execution on a dataflow graph, allowing for efficient recovery in dynamic heterogeneous systems as well as multithreaded applications. By allowing recovery even under different numbers of processors, the approaches are especially suitable for applications with a need for adaptive or reactionary configuration control. The low-cost protocols offer the capability of controlling or bounding the overhead. A formal cost model is presented, followed by an experimental evaluation. It is shown that the overhead of the protocol is very small, and the maximum work lost by a crashed process is small and bounded. Index Terms: Grid computing, rollback recovery, checkpointing, event logging.

CiteSeerX

A Reliable Instant Messenger in Erlang: Design and Evaluation

Author: Chechina Natalia
Hernandez Mario Moro
Trinder Phil
Publication venue: Glasgow University
Publication date: 17/12/2015

This document describes the design and evaluation of two Erlang-based instant messenger systems using Distributed Erlang (D-Erlang) and Scalable Distributed Erlang (SD-Erlang). The purpose of these systems is to serve as real-world benchmarks to test the performance of the SD-Erlang library.

Enlighten

Fault tolerance via diversity for off-the-shelf products: A study with SQL database servers

Author: Gashi I.
Popov P. T.
Strigini L.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2007

If an off-the-shelf software product exhibits poor dependability due to design faults, then software fault tolerance is often the only way available to users and system integrators to alleviate the problem. Thanks to low acquisition costs, even using multiple versions of software in a parallel architecture, which is a scheme formerly reserved for few and highly critical applications, may become viable for many applications. We have studied the potential dependability gains from these solutions for off-the-shelf database servers. We based the study on the bug reports available for four off-the-shelf SQL servers plus later releases of two of them. We found that many of these faults cause systematic noncrash failures, which is a category ignored by most studies and standard implementations of fault tolerance for databases. Our observations suggest that diverse redundancy would be effective for tolerating design faults in this category of products. Only in very few cases would demands that triggered a bug in one server cause failures in another one, and there were no coincident failures in more than two of the servers. Use of different releases of the same product would also tolerate a significant fraction of the faults. We report our results and discuss their implications, the architectural options available for exploiting them, and the difficulties that they may present
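
One way to picture how diverse redundancy could catch a noncrash failure is a small adjudicator that compares the answers of several servers to the same query. This is a minimal sketch under assumptions: the majority-vote rule, the function name `adjudicate`, and the server names are illustrative and are not described in the abstract.

```python
from collections import Counter


def adjudicate(results):
    """Compare results returned by diverse servers for one query.

    `results` maps a (hypothetical) server name to its result, given as a
    hashable value such as a tuple of rows. Returns the majority result and
    the sorted names of servers that disagreed with it. Raises RuntimeError
    when no strict majority exists, since the failure can then only be
    detected, not masked.
    """
    if not results:
        raise ValueError("no results to adjudicate")
    counts = Counter(results.values())
    value, votes = counts.most_common(1)[0]
    if votes * 2 <= len(results):
        raise RuntimeError("no majority among server results")
    dissenters = sorted(name for name, res in results.items() if res != value)
    return value, dissenters


# Three servers answer the same query; one silently returns a wrong row set.
answer, suspects = adjudicate({
    "server_a": ((1, "x"), (2, "y")),
    "server_b": ((1, "x"), (2, "y")),
    "server_c": ((1, "x"),),
})
```

With only two servers a disagreement can be detected but not resolved, which is why the sketch raises an error when no strict majority exists.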

City Research Online

Crossref

Implementing Performance Competitive Logical Recovery

Author: Lomet David
Tzoumas Kostas
Zwilling Michael
Publication date: 01/01/2010

New hardware platforms, e.g. cloud, multi-core, etc., have led to a reconsideration of database system architecture. Our Deuteronomy project separates transactional functionality from data management functionality, enabling a flexible response to exploiting new platforms. This separation requires, however, that recovery is described logically. In this paper, we extend current recovery methods to work in this logical setting. While this is straightforward in principle, performance is an issue. We show how ARIES-style recovery optimizations can work for logical recovery where page information is not captured on the log. In side-by-side performance experiments using a common log, we compare logical recovery with a state-of-the-art ARIES-style recovery implementation and show that logical redo performance can be competitive. Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX

VBN