Content-access QoS in peer-to-peer networks using a fast MDS erasure code
This paper describes an enhancement of content-access Quality of Service (QoS) in peer-to-peer (P2P) networks. The main idea is to use an erasure code to distribute the information over the peers. This distribution widens the users' choice among the disseminated encoded data and therefore statistically enhances the overall throughput of the transfer. A performance evaluation is presented, based on an original model that uses the results of a measurement campaign of sequential and parallel downloads in a real P2P network over the Internet. Based on a bandwidth distribution, statistical content-access QoS is guaranteed as a function of both the content replication level in the network and the file dissemination strategy. A simple application in the context of media streaming is proposed. Finally, the constraints that the proposed system places on the erasure code are analysed, and a new fast MDS erasure code is proposed, implemented and evaluated.
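As a rough illustration of the erasure-coding idea in the abstract above, the sketch below uses the simplest MDS code, a single XOR parity block: k data blocks become k + 1 encoded blocks, and any k of them are enough to rebuild the content. This is not the fast MDS code the paper proposes; the function names `xor_blocks`, `encode` and `decode` are assumptions made for this example only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the k data blocks followed by one XOR parity block (n = k + 1)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def decode(received, k):
    """Rebuild the k data blocks from any k of the k + 1 encoded blocks.

    `received` maps a block index (0..k, where index k is the parity block)
    to the bytes of that block.
    """
    if len(received) < k:
        raise ValueError("need at least k blocks to decode")
    missing = [i for i in range(k) if i not in received]
    if missing:
        # With k blocks in hand, at most one data block can be missing, and
        # the XOR of everything received (parity included) restores it.
        received = dict(received)
        received[missing[0]] = xor_blocks(list(received.values()))
    return [received[i] for i in range(k)]


blocks = [b"abcd", b"efgh", b"ijkl"]
encoded = encode(blocks)
# A peer holding block 1 is unavailable; the other three blocks suffice.
recovered = decode({0: encoded[0], 2: encoded[2], 3: encoded[3]}, k=3)
```

In a P2P setting, a downloader could fetch whichever k blocks arrive first, which is the kind of added choice the abstract credits for the throughput gain.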
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also help to provide an easy way for new practitioners to understand
this complex area of research.
Comment: 46 pages, 16 figures, Technical Report
FADI: a fault-tolerant environment for open distributed computing
FADI is a complete programming environment that supports the reliable execution of distributed application programs. FADI encompasses all aspects of modern fault-tolerant distributed computing. The built-in user-transparent error detection mechanism covers processor node crashes and hardware transient failures. The mechanism also integrates user-assisted error checks into the system failure model. The nucleus non-blocking checkpointing mechanism, combined with a novel selective message logging technique, delivers an efficient, low-overhead backup and recovery mechanism for distributed processes. FADI also provides means for remote automatic process allocation on the distributed system nodes.
A Study of Implementation Methodologies for Distributed Real Time Collaboration
Collaboration drives our world and is almost unavoidable in the programming industry. From higher education to the top technological companies, people are working together to drive discovery and innovation. Software engineers must work with their peers to accomplish goals daily in their workplace. When working with others there are a variety of tools to choose from, such as Google Docs, Google Colab and Overleaf. Each of the aforementioned collaborative tools utilizes the Operational Transform (OT) technique to implement its real-time collaboration functionality. Operational transform is the technique seen amongst most if not all major collaborative tools in our industry today. However, there is another way of implementing real-time collaboration, through a data structure called the Conflict-free Replicated Data Type (CRDT), which has been claimed to be superior to OT. Previous studies have focused on comparing the theory behind OT and CRDTs, but as far as we know, there have not been studies which compare real-time collaboration performance using an OT implementation versus a CRDT implementation in a popularly used product such as Google Docs or Overleaf.
Our work will focus on comparing the real-time collaborative performance of OT and CRDTs in Overleaf, an academic authorship tool which allows for easy collaboration on academic and professional papers. Overleaf's current published version implements real-time collaboration using operational transform. This thesis will contribute an analysis of the current real-time collaboration performance of operational transform in Overleaf, an implementation of CRDTs for real-time collaboration in Overleaf, and an analysis of the performance of real-time collaboration through the CRDT implementation in Overleaf. This thesis describes the main advantages and disadvantages of OT versus CRDTs, as well as, to our knowledge, the first results of a non-theoretical attempt at implementing CRDTs for handling document edits in a collaborative environment which was originally operating using an OT implementation.
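To make the CRDT idea in the abstract above more concrete, here is a minimal sketch of a state-based text CRDT. It is not the implementation the thesis built for Overleaf, and the class `TextReplica` with all its methods is hypothetical. Each character gets a unique, ordered key; deletions are kept as tombstones; merging is a plain union, so replicas converge regardless of the order in which they exchange state.

```python
from fractions import Fraction


class TextReplica:
    """One replica of a shared text (illustrative sketch only)."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counter = 0       # makes every key created here unique
        self.chars = {}        # key (position, replica_id, counter) -> char
        self.deleted = set()   # tombstones: keys of deleted characters

    def _visible(self):
        return sorted(k for k in self.chars if k not in self.deleted)

    def insert(self, index, ch):
        keys = self._visible()
        left = keys[index - 1][0] if index > 0 else Fraction(0)
        right = keys[index][0] if index < len(keys) else Fraction(1)
        self.counter += 1
        # Midpoint between the neighbours. Simplification: if both neighbours
        # share one position (concurrent inserts), the new character may not
        # land exactly between them, although replicas still converge.
        self.chars[((left + right) / 2, self.replica_id, self.counter)] = ch

    def delete(self, index):
        self.deleted.add(self._visible()[index])

    def merge(self, other):
        # Set union is commutative, associative and idempotent.
        self.chars.update(other.chars)
        self.deleted |= other.deleted

    def text(self):
        return "".join(self.chars[k] for k in self._visible())


a, b = TextReplica("a"), TextReplica("b")
a.insert(0, "H")
a.insert(1, "i")
b.merge(a)
a.insert(2, "!")   # concurrent edit on replica a
b.delete(0)        # concurrent edit on replica b
a.merge(b)
b.merge(a)
```

After the two merges both replicas hold the same text without any central server having ordered or transformed the operations, which is the property that distinguishes this approach from OT.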
FMKe: A realistic benchmark for key-value stores
Standard benchmarks are essential tools to evaluate and compare database management
systems in terms of relevant semantic properties and performance. They provide the
means to evaluate a system with workloads that mimic real applications. Although a number
of realistic benchmarks already exist for relational database systems, the same cannot
be said for NoSQL databases. This latter class of data storage systems has become increasingly
relevant for geo-distributed systems, and this has led developers and researchers to
either rely on benchmarks that do not model realistic workloads or to adapt the aforementioned
benchmarks for relational databases to work for NoSQL databases, in a somewhat
ad-hoc fashion. Since these benchmarks assume an isolation and transactional model in
the database, they are inherently inadequate to evaluate NoSQL databases.
In this thesis, we propose a new benchmark that addresses the lack of realistic evaluation
tools for distributed key-value stores. We consider a workload that is based on
information we have acquired about a real world deployment of a large-scale application
that operates over a distributed key-value store and is responsible for managing
patient prescriptions at a nation-wide level in Denmark. We design our benchmark to
be extensible to a wide range of distributed key-value storage systems and some relational
database systems with minimal effort for programmers, who only need to design
and implement specific data storage drivers to benchmark different alternatives. We further
present a study on the performance of multiple database management systems in
different deployment scenarios.
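The driver-based extensibility described in the abstract above can be pictured roughly as follows. This is a generic sketch and not FMKe's actual driver API: the names `KVDriver`, `InMemoryDriver` and `run_workload`, as well as the sample keys, are hypothetical. The point is only that the workload runner depends on a small interface, so each store needs just its own driver.

```python
import time
from abc import ABC, abstractmethod


class KVDriver(ABC):
    """Hypothetical interface a storage driver would implement."""

    @abstractmethod
    def put(self, key, value):
        ...

    @abstractmethod
    def get(self, key):
        ...


class InMemoryDriver(KVDriver):
    """Stand-in driver backed by a dict, useful for trying out the runner."""

    def __init__(self):
        self.store = {}

    def put(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)


def run_workload(driver, operations):
    """Run (op, key, value) tuples; return per-operation latencies in seconds."""
    latencies = []
    for op, key, value in operations:
        start = time.perf_counter()
        if op == "put":
            driver.put(key, value)
        elif op == "get":
            driver.get(key)
        else:
            raise ValueError(f"unknown operation: {op}")
        latencies.append(time.perf_counter() - start)
    return latencies


driver = InMemoryDriver()
latencies = run_workload(
    driver,
    [("put", "prescription:1", "example"), ("get", "prescription:1", None)],
)
```

Benchmarking another store would then mean writing one more `KVDriver` subclass and passing it to the same runner.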