    Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency

    Persistent memory provides high-performance data persistence at main memory. Memory writes need to be performed in strict order to satisfy storage consistency requirements and enable correct recovery from system crashes. Unfortunately, adhering to such a strict order significantly degrades system performance and persistent memory endurance. This paper introduces a new mechanism, Loose-Ordering Consistency (LOC), that satisfies the ordering requirements at significantly lower performance and endurance loss. LOC consists of two key techniques. First, Eager Commit eliminates the need to perform a persistent commit record write within a transaction. We do so by ensuring that we can determine the status of all committed transactions during recovery by storing necessary metadata information statically with blocks of data written to memory. Second, Speculative Persistence relaxes the write ordering between transactions by allowing writes to be speculatively written to persistent memory. A speculative write is made visible to software only after its associated transaction commits. To enable this, our mechanism supports the tracking of committed transaction IDs and multi-versioning in the CPU cache. Our evaluations show that LOC reduces the average performance overhead of memory persistence from 66.9% to 34.9% and the memory write traffic overhead from 17.1% to 3.4% on a variety of workloads. Comment: This paper has been accepted by IEEE Transactions on Parallel and Distributed Systems.
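
The Eager Commit idea above (deciding at recovery time which transactions committed, using only metadata stored with the data blocks) can be illustrated with a minimal sketch. The abstract does not specify the metadata layout, so the per-block fields `txn_id` and `txn_blocks` and the rule "a transaction is committed once all of its blocks are found" are assumptions made for illustration, not the paper's design.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Set


class Block(NamedTuple):
    """A persisted data block with hypothetical per-block metadata."""
    txn_id: int      # transaction this block belongs to (assumed field)
    txn_blocks: int  # total blocks the transaction writes (assumed field)
    payload: bytes


def committed_transactions(persisted: Iterable[Block]) -> Set[int]:
    """Return IDs of transactions whose blocks all reached persistent memory.

    No separate commit record is read: the status is derived purely from
    the metadata carried by the blocks found during the recovery scan.
    """
    seen = defaultdict(int)   # txn_id -> number of blocks found
    expected = {}             # txn_id -> number of blocks announced
    for block in persisted:
        seen[block.txn_id] += 1
        expected[block.txn_id] = block.txn_blocks
    return {txn for txn, count in seen.items() if count == expected[txn]}
```

In this sketch a transaction that crashed after persisting only some of its blocks is simply absent from the returned set, so recovery can discard its partial writes.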

    Objcache: An Elastic Filesystem over External Persistent Storage for Container Clusters

    Container virtualization enables emerging AI workloads such as model serving, highly parallelized training, machine learning pipelines, and so on, to be easily scaled on demand on the elastic cloud infrastructure. In particular, AI workloads require persistent storage to store data such as training inputs, models, and checkpoints. An external storage system like cloud object storage is a common choice because of its elasticity and scalability. To mitigate access latency to external storage, caching at a local filesystem is an essential technique. However, building local caches on scaling clusters must cope with explosive disk usage, redundant networking, and unexpected failures. We propose objcache, an elastic filesystem over external storage. Objcache introduces an internal transaction protocol over Raft logging to enable atomic updates of distributed persistent states with consistent hashing. The proposed transaction protocol can also manage inode dirtiness by maintaining the consistency between the local cache and external storage. Objcache supports scaling down to zero by automatically evicting dirty files to external storage. Our evaluation reports that objcache sped up model serving startup by 98.9% compared to direct copies via S3 interfaces. Scaling up with dirty files completed in 2 to 14 seconds with 1024 dirty files. Comment: 13 pages
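
The abstract mentions consistent hashing for placing distributed state. As a rough illustration of that general technique (not objcache's actual implementation), the sketch below builds a small hash ring. The class name `HashRing`, the node names, and the virtual-node count are all hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring; names and vnode count are illustrative."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        """Drop every point belonging to this node."""
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def owner(self, key):
        """Return the node at the first ring point clockwise from the key."""
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

The property that matters for an elastic cluster is that removing a node only reassigns the keys that node owned; keys on the remaining nodes stay where they were.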

    Attributes of fault-tolerant distributed file systems

    Fault tolerance in distributed file systems will be investigated by analyzing recovery techniques and concepts implemented within the following models of distributed systems: the pool-processor model and the user-server model. The research presented provides an overview of fault tolerance characteristics and mechanisms within current implementations and summarizes future directions for fault-tolerant distributed file systems.

    A support architecture for reliable distributed computing systems

    The Clouds kernel design has gone through several design phases and is nearly complete. The object manager, the process manager, the storage manager, the communications manager, and the actions manager are examined.

    SAFIUS - A secure and accountable filesystem over untrusted storage

    We describe SAFIUS, a secure accountable file system that resides over untrusted storage. SAFIUS provides strong security guarantees like confidentiality, integrity, protection from rollback attacks, and accountability. SAFIUS also enables read/write sharing of data and provides the standard UNIX-like interface for applications. To achieve accountability with good performance, it uses asynchronous signatures; to reduce the space required for storing these signatures, a novel signature pruning mechanism is used. SAFIUS has been implemented on a GNU/Linux based system modifying OpenGFS. Preliminary performance studies show that SAFIUS has a tolerable overhead for providing secure storage: while it has an overhead of about 50% of OpenGFS in data intensive workloads (due to the overhead of performing encryption/decryption in software), it is comparable (or better in some cases) to OpenGFS in metadata intensive workloads. Comment: 11pt, 12 pages, 16 figures

    Mechanisms for improving ZooKeeper Atomic Broadcast performance

    PhD Thesis. Coordination services are essential for building higher-level primitives that are often used in today’s data-center infrastructures, as they greatly facilitate the operation of distributed client applications. Examples of typical functionalities offered by coordination services include the provision of group membership, support for leader election, distributed synchronization, as well as reliable low-volume storage and naming. To provide reliable services to the client applications, coordination services in general are replicated for fault tolerance and should deliver high performance to ensure that they do not become bottlenecks for dependent applications. Apache ZooKeeper, for example, is a well-known coordination service and applies a primary-backup approach in which the leader server processes all state-modifying requests and then forwards the corresponding state updates to a set of follower servers using an atomic broadcast protocol called Zab. Having analyzed state-of-the-art coordination services, we identified two main limitations that prevent existing systems such as Apache ZooKeeper from achieving a higher write performance. First, while this approach prevents the data stored by client applications from being lost as a result of server crashes, it also comes at the cost of a performance penalty. In particular, the fact that it relies on a leader-based protocol means that its performance becomes bottlenecked when the leader server has to handle increased message traffic as the number of client requests and replicas increases. Second, Zab requires significant communication between instances (as it entails three communication steps). This can potentially lead to performance overhead and uses up more computer resources, resulting in fewer guarantees for users, who must then build more complex applications to handle these issues. To this end, the work makes four contributions. First, we implement ZooKeeper atomic broadcast, extracting it from ZooKeeper in order to make it easier for other developers to build their applications on top of Zab without the complexity of integrating the entire ZooKeeper codebase. Second, we propose three variations of Zab, which are all capable of reaching an agreement in fewer communication steps than Zab. The variations are built with the restrictive assumptions that server crashes are independent and a server quorum remains operative at all times. The first variation offers excellent performance but can only be used for 3-server systems; the other two are built without this limitation. Then, we redesigned the latter two Zab variations to operate under the least-restricted Zab fault assumptions. Third, we design and implement a ZooKeeper coin-tossing protocol, called ZabCT, which addresses the above concerns by having the other, non-leader server replicas toss a coin and broadcast their acknowledgment of a leader’s proposal only if the toss results in an outcome of Head. We model the ZabCT process and derive analytical expressions for estimating the coin-tossing probability of Head for a given arrival rate of service requests such that the dual objectives of performance gains and traffic reduction can be accomplished. If the coin-tossing protocol, ZabCT, is judged not to offer performance benefits over Zab, processes should be able to switch autonomously to Zab. We design protocol switching by letting processes switch between ZabCT and Zab without stopping message delivery. Finally, an extensive performance evaluation is provided for Zab and Zab-variant protocols.
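
To make the coin-tossing trade-off concrete, the sketch below computes two simple quantities under an assumed model: each follower independently broadcasts an acknowledgment with probability `p_head`. The function names and the `needed` threshold are hypothetical; the abstract does not give the thesis's analytical expressions or say how ZabCT handles a round with too few Heads, so this is only a binomial illustration of the traffic-versus-acknowledgment tension.

```python
from math import comb


def expected_acks(followers: int, p_head: float) -> float:
    """Expected number of acknowledgment broadcasts per proposal."""
    return followers * p_head


def prob_at_least(followers: int, needed: int, p_head: float) -> float:
    """Probability that at least `needed` followers toss Head.

    Assumes independent tosses, so the count of Heads is binomial.
    """
    return sum(
        comb(followers, k) * p_head**k * (1 - p_head) ** (followers - k)
        for k in range(needed, followers + 1)
    )
```

Lowering `p_head` reduces acknowledgment traffic linearly, while the chance of collecting a given number of acknowledgments in one round falls off according to the binomial tail.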

    Review of Some Transaction Models used in Mobile Databases

    Mobile computing is presently experiencing a period of unprecedented growth with the convergence of communication and computing capabilities of mobile phones and personal digital assistants. However, mobile computing presents many inherent problems that lead to poor network connectivity. To overcome poor connectivity and reduce cost, mobile clients are forced to operate in disconnected and partially connected modes. One of the main goals of mobile data access is to reach the ubiquity inherent to mobile systems: to access information regardless of time and place. Because of mobile system restrictions such as limited memory and narrow bandwidth, it is only natural that researchers expend effort to mitigate such issues. This work addresses cache management in mobile databases, with emphasis on techniques to reduce cache faults while the mobile device is connected, limited to a narrow bandwidth, or fully disconnected. This is expected to improve data availability during a disconnection. In this paper, we describe various mobile transaction models, focusing on versatile data sharing mechanisms in volatile mobile environments.