Comparing Distributed Replicated and Striped Replicated File Replication in GlusterFS on a Computer Cluster
ABSTRACT: File availability on servers in a network is a requirement that must always be met. Users want to be able to access their files at any time, yet when a failure occurs, such as a server going down, the files become inaccessible. File replication is therefore needed to solve this problem: when a server goes down, users can still access the files they need. Files on one server are replicated to other servers, all of which are joined together into a cluster, called a computer Cluster. One of the file systems that supports file replication on a computer Cluster is GlusterFS. GlusterFS offers several replication methods, among them Distributed Replicated and Striped Replicated. In Distributed Replicated, whole files are distributed across the servers in a Cluster. Striped Replicated adds one further step: before a file is distributed to the servers in the Cluster, it is first split into pieces. This research analyzes the effectiveness and efficiency of the two methods during file replication. The results show that both methods are equally effective at replicating files, while in terms of efficiency, overall, Distributed Replicated is more efficient than Striped Replicated.
Keywords: GlusterFS, replication, Distributed Replicated, Striped Replicated, Cluster
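The difference between the two layouts can be sketched in a few lines of Python. This is an illustrative simulation, not GlusterFS code: the server names, replica-set pairing, and stripe size are assumptions chosen for the example.

```python
# Sketch (not GlusterFS code): contrast the two replication layouts.
# Server names, replica-set pairing, and stripe size are illustrative.

REPLICA_SETS = [["server1", "server2"], ["server3", "server4"]]
STRIPE_SIZE = 4  # bytes per stripe chunk, tiny for demonstration

def distributed_replicated(filename, data):
    """Distributed Replicated: the whole file goes to one replica set,
    chosen here by hashing the file name."""
    replica_set = REPLICA_SETS[hash(filename) % len(REPLICA_SETS)]
    return {server: data for server in replica_set}

def striped_replicated(filename, data):
    """Striped Replicated: the file is split into chunks first;
    each chunk then goes to a replica set."""
    placement = {}
    for i in range(0, len(data), STRIPE_SIZE):
        chunk = data[i:i + STRIPE_SIZE]
        replica_set = REPLICA_SETS[(i // STRIPE_SIZE) % len(REPLICA_SETS)]
        for server in replica_set:
            placement.setdefault(server, []).append(chunk)
    return placement

whole = distributed_replicated("report.txt", b"ABCDEFGH")
striped = striped_replicated("report.txt", b"ABCDEFGH")
```

Under this sketch, the extra splitting step of Striped Replicated is visible directly: every replica set ends up holding only fragments, while Distributed Replicated stores full copies on one set.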
Building Distributed Systems with Non-Volatile Main Memories and RDMA Networks
High-performance, byte-addressable non-volatile main memories (NVMMs) allow application developers to combine storage and memory into a single layer. These high-performance storage systems would be especially useful in large-scale data center environments where data is distributed and replicated across multiple servers. Unfortunately, existing approaches to providing remote storage access rest on the assumption that storage is slow, so the cost of the software and protocols is acceptable. This assumption no longer holds for fast NVMMs. As a result, taking full advantage of NVMMs' potential will require changes in system software and networking protocols. This thesis focuses on accessing remote NVMM efficiently using remote direct memory access (RDMA) networking. RDMA enables a client to directly access memory on a remote machine without involving the remote machine's CPU.

This thesis first presents Mojim, a system that provides replicated, reliable, and highly available NVMM as an operating system service. Applications can access data in Mojim using normal load and store instructions while controlling when and how updates propagate to replicas using system calls. Our evaluation shows that Mojim adds little overhead to the un-replicated system and provides 0.4x to 2.7x the throughput of the un-replicated system.

This thesis then presents Orion, a distributed file system designed from scratch for NVMM and RDMA networks. Traditional distributed file systems are designed for slower hard drives, and these slower media incentivize complex optimizations (e.g., queuing, striping, and batching) around disk accesses. Orion combines file system functions and network operations into a single layer. It provides low-latency metadata access and outperforms existing distributed file systems by a large margin.

Finally, an NVMM application can map files backed by an NVMM file system into its address space and access them using CPU instructions. In this case, RDMA and NVMM file systems introduce duplication of effort on permissions, naming, and address translation. We introduce two changes to the existing RDMA protocol: the file memory region (FileMR) and range-based address translation. By eliminating redundant translations, FileMR minimizes the number of translations done at the NIC, reducing the load on the NIC's translation cache and improving application performance by 1.8x to 2.0x.
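The "map a file and access it with CPU instructions" pattern the thesis describes can be illustrated with Python's standard mmap module. This is a stand-in sketch: a real NVMM application would map a file on a DAX-mounted NVMM file system, while here an ordinary temporary file plays that role.

```python
import mmap
import os
import tempfile

# Sketch: map a file into the address space and update it with direct
# byte accesses, the access pattern NVMM file systems expose to
# applications. An ordinary temp file stands in for NVMM-backed storage.

path = os.path.join(tempfile.mkdtemp(), "record.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 64)  # pre-size the file; mmap cannot grow it

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 64) as region:
        region[0:5] = b"hello"   # plain stores, no read()/write() syscalls
        region.flush()           # analogous to flushing CPU caches to media

with open(path, "rb") as f:
    contents = f.read()
```

On real persistent memory the flush step corresponds to cache-line flush and fence instructions rather than an msync-style call, which is exactly the kind of fast path the thesis optimizes.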
ISIS and META projects
The ISIS project has developed a new methodology, virtual synchrony, for writing robust distributed software. High-performance multicast, large-scale applications, and wide area networks are the focus of interest. Several interesting applications that exploit the strengths of ISIS, including an NFS-compatible replicated file system, are being developed. The META project addresses distributed control in a soft real-time environment incorporating feedback. This domain encompasses examples as diverse as monitoring inventory and consumption on a factory floor, and performing load balancing on a distributed computing system. One of the first uses of META is for distributed application management: the tasks of configuring a distributed program, dynamically adapting to failures, and monitoring its performance. Recent progress and current plans are reported.
High Performance Fault-Tolerant Hadoop Distributed File System
The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. Huge amounts of data are generated from many sources daily, and maintaining such data is a challenging task. One proposed solution is Hadoop: building on a solution published by Google, Doug Cutting and his team developed an open-source project called Hadoop. Hadoop is a framework written in Java for running applications on large clusters of commodity hardware. HDFS is designed to be a scalable, fault-tolerant, distributed storage system. Hadoop's HDFS is a highly fault-tolerant distributed file system and, like Hadoop in general, designed to be deployed on low-cost hardware. HDFS stores file system metadata and application data separately: metadata lives on a dedicated server called the NameNode, while application data is stored on separate servers called DataNodes. The file system is accessed via HDFS clients, which first contact the NameNode for data locations and then transfer data to (write) or from (read) the specified DataNodes. A file-download request chooses only one of the servers to download from; the other replica servers are not used, and as the file size increases, the download time increases. In this paper we study three policies for block selection: first, random, and load-based. The results show that "first" downloads more slowly than "random", and "random" more slowly than "load-based".
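The three selection policies the paper compares can be sketched as small Python functions. This is an illustrative sketch, not HDFS code: a replica is modeled as a (server name, current load) pair, where real HDFS would obtain this information from the NameNode.

```python
import random

# Sketch of the three replica-selection policies the paper compares.
# A replica is modeled as (server_name, current_load); in real HDFS
# the NameNode supplies the list of DataNodes holding a block.

def select_first(replicas):
    """'first': always take the first replica in the NameNode's list."""
    return replicas[0]

def select_random(replicas, rng=random):
    """'random': spread requests by picking a replica uniformly."""
    return rng.choice(replicas)

def select_loadbased(replicas):
    """'load-based': pick the replica currently serving the fewest requests."""
    return min(replicas, key=lambda r: r[1])

replicas = [("dn1", 7), ("dn2", 2), ("dn3", 5)]
```

The ordering the paper reports (first slower than random, random slower than load-based) follows intuitively: "first" concentrates all downloads on one DataNode, "random" spreads them blindly, and "load-based" actively avoids the busiest replicas.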
File system on CRDT
In this report we show how to manage a distributed hierarchical structure representing a file system. This structure is optimistically replicated: each user works on a local replica, and updates are sent to the other replicas. The different replicas eventually observe the same view of the file system. At this stage, conflicts between updates are very common. We claim that conflict resolution should rely as little as possible on users. In this report we propose a simple and modular solution to resolve these conflicts and maintain data consistency.
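One standard CRDT-style way to resolve concurrent updates without asking the user is a last-writer-wins merge; the following Python sketch assumes that semantic (it is an illustration of the general idea, not the report's exact algorithm), modeling a replica as a map from path to (timestamp, content).

```python
# Sketch (assumed LWW semantics, not the report's exact algorithm):
# replicas keep a last-writer-wins timestamp per path, so concurrent
# updates merge deterministically without user intervention.

def lww_merge(replica_a, replica_b):
    """Merge two replica states of the form path -> (timestamp, content).
    The higher timestamp wins; ties break on content to stay deterministic."""
    merged = dict(replica_a)
    for path, entry in replica_b.items():
        if path not in merged or entry > merged[path]:
            merged[path] = entry
    return merged

a = {"/docs/a.txt": (3, "from A"), "/docs/b.txt": (1, "old")}
b = {"/docs/b.txt": (2, "from B"), "/docs/c.txt": (1, "new file")}
state = lww_merge(a, b)
```

The key CRDT property is that the merge is commutative and idempotent, so every replica converges to the same state regardless of the order in which it receives updates.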
What's in Unison? A Formal Specification and Reference Implementation of a File Synchronizer
A file synchronizer is a tool that reconciles disconnected modifications to a replicated directory structure. Trustworthy synchronizers are difficult to build, since they must deal correctly with both the semantic complexities of file systems and the unpredictable failure modes arising from distributed operation. On the other hand, synchronizers are often packaged as stand-alone, user-level utilities, whose intended behavior is relatively easy to isolate from the other functions of the system. This combination of subtlety and isolability makes file synchronizers attractive candidates for precise mathematical specification.
We present here a detailed specification of a particular file synchronizer called Unison, sketch an idealized reference implementation of our specification, and discuss the relation between our idealized implementation and the actual code base.
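The core reconciliation rule such a synchronizer formalizes can be sketched briefly: compare both replicas against the state recorded at the last synchronization (the archive); one-sided changes propagate, two-sided changes are reported as conflicts. This Python sketch is an illustration of that rule under simplified assumptions (flat dicts of path to content stand in for real directory trees), not Unison's actual specification.

```python
# Sketch of three-way reconciliation against a last-sync "archive".
# Dicts of path -> content stand in for real directory trees.

def reconcile(archive, replica_a, replica_b):
    actions, conflicts = {}, []
    for path in set(archive) | set(replica_a) | set(replica_b):
        base = archive.get(path)
        va, vb = replica_a.get(path), replica_b.get(path)
        if va == vb:
            continue                      # replicas already agree
        elif va == base:
            actions[path] = ("b->a", vb)  # only B changed: copy B's version
        elif vb == base:
            actions[path] = ("a->b", va)  # only A changed: copy A's version
        else:
            conflicts.append(path)        # both changed: defer to the user
    return actions, conflicts

archive = {"f": 1, "g": 1, "h": 1}
actions, conflicts = reconcile(archive,
                               {"f": 2, "g": 1, "h": 3},   # replica A
                               {"f": 1, "g": 2, "h": 4})   # replica B
```

The subtleties the paper highlights (deletions, directory structure, partial failures) are exactly what this simple rule glosses over, which is why a precise specification is worth writing.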
Detecting corrupted pages in M replicated large files
A file in a distributed database system is replicated on M sites and may contain corrupted pages. Abdel-Ghaffar and El Abbadi gave a detection scheme assuming that the number of corrupted pages f < M/2. We replace this assumption by a much weaker one: that, for each page, the majority of copies are correct. Our schemes are based on the structure of the Reed-Solomon code, as proposed by Abdel-Ghaffar and El Abbadi for M = 2. © 1997 IEEE.
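The weaker assumption can be made concrete with a naive baseline: if a majority of the M copies of each page are correct, a per-page majority vote recovers the true content. This Python sketch shows only that baseline; the paper's actual schemes use the structure of Reed-Solomon codes to locate faulty copies with far less communication than shipping every page.

```python
from collections import Counter

# Naive baseline for the paper's assumption: for every page, a majority
# of the M copies are correct, so a per-page vote recovers the content.

def majority_page(copies):
    """Return the value held by most of the M copies of one page."""
    value, count = Counter(copies).most_common(1)[0]
    assert count > len(copies) // 2, "no majority: assumption violated"
    return value

def recover_file(sites):
    """sites: M lists of pages, one list per replica site."""
    return [majority_page(page_copies) for page_copies in zip(*sites)]

sites = [
    [b"p0", b"p1", b"p2"],   # site 1: all pages correct
    [b"p0", b"XX", b"p2"],   # site 2: page 1 corrupted
    [b"p0", b"p1", b"YY"],   # site 3: page 2 corrupted
]
recovered = recover_file(sites)
```

Note that this example violates the original f < M/2 bound in aggregate (two of the three sites hold a corrupted page) yet still recovers every page, because no single page has a corrupted majority.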
Multicast communications in distributed systems
PhD Thesis

One of the numerous results of recent developments in communication networks and distributed systems has been an increased interest in the study of applications and protocols for communications between multiple, as opposed to single, entities such as processes and computers. For example, in replicated file storage, a process attempts to store a file on several file servers, rather than one. Multiple-entity communications, which allow one-to-many and many-to-one communications, are known as multicast communications.
This thesis examines some of the ways in which the architectures of computer networks and distributed systems can affect the design and development of multicast communication applications and protocols. To assist in this examination, the thesis presents three contributions. First, a set of classification schemes is developed for use in the description and analysis of various multicast communication strategies. Second, a general set of multicast communication primitives is presented, unrelated to any specific network or distributed system, yet efficiently implementable on a variety of networks. Third, the primitives are used to obtain experimental results for a study of intranetwork and internetwork multicast communications.

Funding: Postgraduate Scholarship, the Natural Sciences and Engineering Research Council of Canada; Overseas Research Student Award, the Committee of Vice-Chancellors and Principals of the Universities of the United Kingdom.
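A one-to-many primitive of the kind the thesis classifies can be sketched in a few lines. This is a network-free simulation (in-process queues stand in for the transport, and all names are illustrative), intended only to show the shape of the primitive: a single send that delivers to every group member.

```python
from queue import Queue

# Sketch of a one-to-many multicast primitive, simulated with
# in-process queues rather than a real network transport.

class Group:
    """A multicast group: sending once delivers to every member."""
    def __init__(self):
        self.members = {}

    def join(self, name):
        self.members[name] = Queue()

    def multicast(self, sender, message):
        for inbox in self.members.values():
            inbox.put((sender, message))

group = Group()
for node in ("fs1", "fs2", "fs3"):       # e.g. three replica file servers
    group.join(node)
group.multicast("client", "store file.txt")
received = {name: inbox.get() for name, inbox in group.members.items()}
```

The replicated-file-storage example from the abstract maps directly onto this shape: one store request, delivered to every file server in the group, rather than one send per server.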
Assise: Performance and Availability via NVM Colocation in a Distributed File System
The adoption of very low latency persistent memory modules (PMMs) upends the long-established model of disaggregated file system access. Instead, by colocating computation and PMM storage, we can provide applications much higher I/O performance, sub-second application failover, and strong consistency. To demonstrate this, we built the Assise distributed file system, based on a persistent, replicated coherence protocol for managing a set of server-colocated PMMs as a fast, crash-recoverable cache between applications and slower disaggregated storage, such as SSDs. Unlike disaggregated file systems, Assise maximizes locality for all file IO by carrying out IO on colocated PMM whenever possible and minimizes coherence overhead by maintaining consistency at IO operation granularity, rather than at fixed block sizes.

We compare Assise to Ceph/Bluestore, NFS, and Octopus on a cluster with Intel Optane DC PMMs and SSDs for common cloud applications and benchmarks, such as LevelDB, Postfix, and FileBench. We find that Assise improves write latency up to 22x, throughput up to 56x, fail-over time up to 103x, and scales up to 6x better than its counterparts, while providing stronger consistency semantics. Assise promises to beat the MinuteSort world record by 1.5x.
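The "colocated PMM as a cache in front of slower disaggregated storage" arrangement can be sketched as follows. This is an illustrative simplification, not Assise's protocol: it uses a naive write-through dictionary cache, whereas Assise maintains a persistent, replicated, crash-recoverable cache with a coherence protocol.

```python
# Illustrative sketch (not Assise's protocol): serve IO from a colocated
# fast store when possible, falling back to slower disaggregated storage.

class CachedStore:
    def __init__(self, slow_storage):
        self.local_pmm = {}          # stands in for colocated PMM
        self.slow = slow_storage     # stands in for SSD-backed storage

    def write(self, path, data):
        self.local_pmm[path] = data  # low-latency local write
        self.slow[path] = data       # naive write-through for durability

    def read(self, path):
        if path in self.local_pmm:   # maximize locality: hit colocated PMM
            return self.local_pmm[path]
        data = self.slow[path]       # miss: fetch from disaggregated storage
        self.local_pmm[path] = data  # cache for future local IO
        return data

ssd = {"cold.txt": b"archived"}
store = CachedStore(ssd)
store.write("hot.txt", b"fresh")
```

Even this toy version shows why locality matters: after the first access, every read is served from the colocated fast store instead of crossing to the slow tier.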