16 research outputs found
An empirical comparison of the security and performance characteristics of topology formation algorithms for Bitcoin networks
There is an increasing demand for digital crypto-currencies to be more secure and robust in order to meet two business requirements: (1) low transaction fees and (2) the privacy of users. Bitcoin is gaining traction and wide adoption, and many well-known businesses have begun accepting bitcoins as a means of payment. However, the susceptibility of Bitcoin networks to information propagation delay increases the network's vulnerability to attack and decreases its throughput. This paper introduces and critically analyses new network clustering methods, named Locality Based Clustering (LBC), Ping Time Based Approach (PTBC), Super Node Based Clustering (SNBA), and Master Node Based Clustering (MNBC). The proposed methods aim to decrease the chances of a successful double spending attack by reducing the information propagation delay of Bitcoin. They embody proximity-aware extensions to the standard Bitcoin protocol, where proximity is measured geographically and in terms of latency. We validate the proposed methods through a set of simulation experiments, whose numerical results demonstrate how the extensions run and their impact in optimising the transaction propagation delay. Furthermore, the methods are evaluated from the perspective of the Bitcoin network's resistance to partitioning attacks. We draw on these findings to suggest promising future research directions for the optimisation of transaction propagation delays.
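As an illustration of what latency-based proximity clustering can look like, the following is a minimal sketch and not the paper's algorithm. The function name `cluster_by_ping`, the greedy join rule, the threshold value and the latency table are all assumptions made for this example.

```python
from typing import Dict, List


def cluster_by_ping(latency_ms: Dict[str, Dict[str, float]],
                    threshold_ms: float) -> List[List[str]]:
    """Greedy clustering (assumed rule): a node joins the first cluster
    whose every member is within threshold_ms of it, else starts a new one."""
    clusters: List[List[str]] = []
    for node in sorted(latency_ms):
        for cluster in clusters:
            if all(latency_ms[node][member] <= threshold_ms for member in cluster):
                cluster.append(node)
                break
        else:
            # No existing cluster is close enough to this node.
            clusters.append([node])
    return clusters


# Hypothetical symmetric round-trip times in milliseconds.
latency_ms = {
    "a": {"b": 10, "c": 12, "d": 90},
    "b": {"a": 10, "c": 15, "d": 95},
    "c": {"a": 12, "b": 15, "d": 85},
    "d": {"a": 90, "b": 95, "c": 85},
}
clusters = cluster_by_ping(latency_ms, threshold_ms=25)
print(clusters)
```

Nodes inside one cluster would then prefer each other as peers, which is the intuition behind shortening propagation paths; how the paper selects peers in detail is not stated in the abstract.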
Approximate algorithms for efficient indexing, clustering, and classification in Peer-to-peer networks
[no abstract]
SoS: self-organizing substrates
Large-scale networked systems often exhibit self-organizing properties, whether by design or by chance. Understanding self-organization using tools from cybernetics, particularly by modeling such systems as Markov processes, is a first step towards a formal framework that can be used in (decentralized) systems research and design. Interesting aspects to look for include the time evolution of a system, whether and when it converges to some absorbing state or stabilizes into a dynamic (and stable) equilibrium, and how it performs under such an equilibrium. Such a formal framework brings objectivity to systems research, helping to discern facts from artefacts and providing tools for the quantitative evaluation of such systems. This thesis introduces such a formalism for analyzing and evaluating peer-to-peer (P2P) systems in order to better understand their dynamics, which in turn helps in producing better designs. In particular, this thesis develops and studies the fundamental building blocks for a P2P storage system. In the process, the design and evaluation methodology we pursue illustrates the typical methodological approaches to studying and designing self-organizing systems, and how the analysis methodology influences the design of the algorithms themselves to meet system design goals (preferably with quantifiable guarantees). These goals include efficiency, availability and durability, load balance, high fault tolerance and self-maintenance even under adversarial conditions such as arbitrarily skewed and dynamic load and high membership dynamics (churn), apart, of course, from the specific functionalities that the system is supposed to provide. The functionalities we study here are some of the fundamental building blocks for various P2P applications and systems, including P2P storage systems, and hence we call them substrates or base infrastructure.
These elemental functionalities include: (i) reliable and efficient discovery of resources distributed over the network in a decentralized manner; (ii) communication among participants in an address-independent manner, i.e., even when peers change their physical addresses; (iii) availability and persistence of stored objects in the network, irrespective of the availability or departure of individual participants at any time; and (iv) freshness of the objects/resources (up-to-date replicas). Internet-scale distributed index structures (often termed structured overlays) are used for discovery and access of resources in a decentralized setting. We propose rapid construction from scratch, and maintenance, of the P-Grid overlay network in a self-organized manner so as to provide efficient search of both individual keys and whole ranges of keys, while providing good load-balancing characteristics for diverse kinds of arbitrarily skewed loads: storage and replication, query forwarding and query answering. For fast overlay construction we employ recursive partitioning of the key space so that the resulting partitions are balanced with respect to storage load and replication. The proper algorithmic parameters for such partitioning are derived from a transient analysis of the partitioning process, which has the Markov property. Preservation of ordering information in P-Grid, such that queries other than exact queries (for example range queries) can be handled efficiently and rather trivially, makes P-Grid suitable for data-oriented applications. Fast overlay construction is analogous to building an index on a new set of keys, making P-Grid suitable as the underlying indexing mechanism for peer-to-peer information retrieval applications, among other potential applications that may require frequent indexing of new attributes in addition to regular updates to an existing index.
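The idea of recursively partitioning the key space until every partition carries a bounded storage load can be sketched as follows. This is a centralized toy version, assuming fixed-length binary keys and a hypothetical `max_load` bound; the decentralized, randomized procedure used by P-Grid itself is not described in the abstract and is not reproduced here.

```python
from typing import Dict, List


def partition(keys: List[str], max_load: int, prefix: str = "") -> Dict[str, List[str]]:
    """Split fixed-length binary keys by successive bits until each
    partition holds at most max_load keys (or the bits run out)."""
    depth = len(prefix)
    if len(keys) <= max_load or depth == len(keys[0]):
        return {prefix: sorted(keys)}
    result: Dict[str, List[str]] = {}
    for bit in "01":
        subset = [k for k in keys if k[depth] == bit]
        if subset:
            # Dense regions of the key space are split further than sparse ones.
            result.update(partition(subset, max_load, prefix + bit))
    return result


# Hypothetical skewed key set: most keys start with "00".
keys = ["0000", "0001", "0010", "0011", "0100", "1000"]
parts = partition(keys, max_load=2)
for path in sorted(parts):
    print(path, parts[path])
```

Because partitions are identified by key prefixes, sorting the prefixes lists the partitions in key order, which is what makes range queries straightforward in an order-preserving structure of this kind.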
In order to deal with membership dynamics, in particular the changing physical addresses of peers across sessions, the overlay itself is used as a (self-referential) directory service for maintaining the participating peers' physical addresses across sessions. Exploiting this self-referential directory, a family of overlay maintenance schemes has been designed with lower communication overhead than other overlay maintenance strategies. The notion of a dynamic equilibrium study for overlays under continuous churn and repairs, modeled as a Markov process, was introduced in order to evaluate and compare the overlay maintenance schemes. While the self-referential directory was originally invented to realize overlay maintenance schemes with lower overheads than existing ones, it is generic in nature and can be used for various other purposes, e.g., as a decentralized public key infrastructure. Persistence of peer identity across sessions, in spite of changes in physical address, provides logical independence of the overlay network from the underlying physical network. This has many other potential uses, for example efficient maintenance mechanisms for P2P storage systems and P2P trust and reputation management. We specifically look into the dynamics of maintaining redundancy for storage systems and design a novel lazy maintenance strategy. This strategy is algorithmically a simple variant of existing maintenance strategies that adapts to the system dynamics. This randomized lazy maintenance strategy thus explores the cost-performance trade-offs of the storage maintenance operations in a self-organizing manner. We model the storage system (redundancy), under churn and maintenance, as a Markov process. We perform an equilibrium study to show that the system operates in a more stable dynamic equilibrium with our strategy than with the existing maintenance scheme for comparable overheads.
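To make the phrase "model the storage system under churn and maintenance as a Markov process" concrete, here is a minimal sketch of such a chain. The state is the number of live replicas of one object; the per-step failure probability, the repair probability and the "repair only at or below a threshold" rule are illustrative assumptions, not the thesis's randomized lazy strategy or its parameters.

```python
from typing import List


def transition_matrix(n: int, fail: float, repair: float, threshold: int) -> List[List[float]]:
    """Row-stochastic matrix over states 0..n (number of live replicas)."""
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    P[0][0] = 1.0  # no replica left: the object is lost (absorbing state)
    for s in range(1, n + 1):
        # Assumed lazy rule: repair only once redundancy has dropped to the threshold.
        up = repair if (s <= threshold and s < n) else 0.0
        P[s][s - 1] = fail
        if up > 0.0:
            P[s][s + 1] = up
        P[s][s] = 1.0 - fail - up
    return P


def step(dist: List[float], P: List[List[float]]) -> List[float]:
    """Advance the state distribution by one time step."""
    size = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(size)) for j in range(size)]


P = transition_matrix(n=4, fail=0.1, repair=0.5, threshold=2)
dist = [0.0, 0.0, 0.0, 0.0, 1.0]  # start fully replicated
for _ in range(50):
    dist = step(dist, P)
loss_probability = dist[0]
print(dist)
```

Iterating `step` shows the transient evolution of the replica count; comparing different `threshold` and `repair` settings is one way to explore the cost versus resilience trade-off that the abstract refers to.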
In particular, we show that our maintenance scheme provides substantial performance gains in terms of maintenance overhead and the system's resilience in the presence of churn and correlated failures. Finally, we propose a gossip mechanism that works with lower communication overhead than existing approaches for communication among a relatively large set of unreliable peers, without assuming any specific structure for their mutual connectivity. We use this communication primitive for propagating replica updates in P2P systems, facilitating the management of mutable content. The peer population affected by a gossip can be modeled as a Markov process. Studying the transient spread of gossips helps in choosing proper algorithm parameters to reduce communication overhead while guaranteeing coverage of online peers. Each of these substrates was developed to find practical solutions for real problems. Put together, they can be used in other applications, including a P2P storage system with support for efficient lookups and inserts, membership dynamics, content mutation and updates, persistence and availability. Many of the ideas have already been implemented in real systems, and several others are on the way to being integrated into the implementations. There are two principal contributions of this dissertation. First, it provides designs of P2P systems that are useful for end-users as well as for other application developers who can build upon these existing systems. Second, it adapts and introduces the methodology of analyzing a system's time evolution (tools typically used in diverse domains including physics and cybernetics) to study the long-run behavior of P2P systems, and uses this methodology to (re-)design appropriate algorithms and evaluate them. We observed that studying P2P systems from the perspective of complex systems reveals their inner dynamics and hence ways to exploit such dynamics for suitable or better algorithms.
In other words, the analysis methodology in itself strongly influences and inspires the way we design such systems. We believe that such an approach of orchestrating self-organization in internet-scale systems, where the algorithms and the analysis methodology have strong mutual influence, will significantly change the way such systems are developed and evaluated in the future. We envision that such an approach will particularly serve as an important tool for the nascent but fast-moving P2P systems research and development community.
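The gossip analysis mentioned in this abstract (studying the transient spread to choose parameters that limit messages while still covering the online peers) can be illustrated with a simple expected-value recurrence. This is a generic push-gossip approximation and not the thesis's mechanism; the function name, the uniform random target choice and the fan-out values are assumptions.

```python
from typing import Tuple


def rounds_to_cover(n: int, fanout: int, target: float,
                    max_rounds: int = 1000) -> Tuple[int, float]:
    """Expected-value model of push gossip among n peers.

    Each informed peer pushes to `fanout` uniformly random peers per round.
    Returns (rounds needed to inform target * n peers, expected messages sent).
    """
    informed = 1.0
    messages = 0.0
    rounds = 0
    while informed < target * n and rounds < max_rounds:
        pushes = fanout * informed
        messages += pushes
        # Probability that a given uninformed peer is missed by every push.
        missed = (1.0 - 1.0 / n) ** pushes
        informed += (n - informed) * (1.0 - missed)
        rounds += 1
    return rounds, messages


rounds_f2, messages_f2 = rounds_to_cover(n=1000, fanout=2, target=0.99)
rounds_f4, messages_f4 = rounds_to_cover(n=1000, fanout=4, target=0.99)
print(rounds_f2, messages_f2)
print(rounds_f4, messages_f4)
```

Running the recurrence for several fan-out values shows the trade-off between the number of rounds and the number of messages, which is the kind of parameter choice the transient analysis is meant to support.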
Trade-off among timeliness, messages and accuracy for large-scale information management
The increasing amount of data and the number of nodes in large-scale environments
require new techniques for information management. Examples of such environments
are the decentralized infrastructures of Computational Grid and Computational
Cloud applications. These large-scale applications need different kinds
of aggregated information such as resource monitoring, resource discovery or economic
information. The challenge of providing timely and accurate information
in large-scale environments arises from the distribution of the information. Reasons
for delays in distributed information systems are long information transmission
times due to the distribution, churn and failures.
A problem of large applications such as peer-to-peer (P2P) systems is the increasing
retrieval time of the information due to the decentralization of the data
and the failure proneness. However, many applications need timely information
provision. Another problem is an increasing network consumption when the application
scales to millions of users and data. Using approximation techniques allows
reducing the retrieval time and the network consumption. However, the usage of
approximation techniques decreases the accuracy of the results. Thus, the remaining
problem is to offer a trade-off in order to solve the conflicting requirements of
fast information retrieval, accurate results and low messaging cost.
Our goal is to reach a self-adaptive decision mechanism to offer a trade-off
among the retrieval time, the network consumption and the accuracy of the result.
Self-adaptation enables distributed software to modify its behavior based on
changes in the operating environment. In large-scale information systems that use
hierarchical data aggregation, we apply self-adaptation to control the approximation
used for the information retrieval and to reduce the network consumption and
the retrieval time. The hypothesis of the thesis is that approximation techniques
can reduce the retrieval time and the network consumption while guaranteeing an
accuracy of the results that respects user-defined priorities.
First, the presented research addresses the problem of a trade-off among a
timely information retrieval, accurate results and low messaging cost by proposing
a summarization algorithm for resource discovery in P2P-content networks.
After identifying how summarization can improve the discovery process, we propose
an algorithm which uses a precision-recall metric to compare the accuracy
and to offer a user-driven trade-off. Second, we propose an algorithm that applies
a self-adaptive decision making on each node. The decision is about the pruning
of the query and returning the result instead of continuing the query. The pruning
reduces the retrieval time and the network consumption at the cost of a lower accuracy
in contrast to continuing the query. The algorithm uses an analytic hierarchy
process to assess the user’s priorities and to propose a trade-off in order to satisfy
the accuracy requirements with a low message cost and a short delay.
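The analytic hierarchy process mentioned above turns pairwise importance judgements into priority weights. The sketch below uses the common column-normalization approximation of AHP; the criteria names follow the abstract, but the judgement values and the function name `ahp_weights` are hypothetical, and how the thesis feeds the weights into its pruning decision is not stated here.

```python
from typing import Dict, List


def ahp_weights(criteria: List[str], pairwise: List[List[float]]) -> Dict[str, float]:
    """Approximate AHP priorities: normalize each column of the pairwise
    comparison matrix, then average across each row."""
    size = len(criteria)
    col_sums = [sum(pairwise[i][j] for i in range(size)) for j in range(size)]
    weights = {}
    for i, name in enumerate(criteria):
        weights[name] = sum(pairwise[i][j] / col_sums[j] for j in range(size)) / size
    return weights


criteria = ["time", "messages", "accuracy"]
# Hypothetical user judgements: pairwise[i][j] states how much more
# important criterion i is than criterion j.
pairwise = [
    [1.0, 2.0, 0.5],   # time
    [0.5, 1.0, 0.25],  # messages
    [2.0, 4.0, 1.0],   # accuracy
]
weights = ahp_weights(criteria, pairwise)
print(weights)
```

For a perfectly consistent judgement matrix such as this one, the approximation coincides with the exact priorities; a node could then compare a weighted score for "prune and answer now" against "continue the query", which is left out of this sketch.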
A quantitative analysis evaluates our presented algorithms with a simulator,
which is fed with real data of a network topology and the nodes’ attributes. The
usage of a simulator instead of the prototype allows the evaluation in a large scale
of several thousands of nodes. The algorithm for content summarization is evaluated
with half a million resources and with different query types. The self-adaptive
algorithm is evaluated with a simulator of several thousands of nodes
that are created from real data. A qualitative analysis addresses the integration
of the simulator’s components in existing market frameworks for Computational
Grid and Cloud applications.
The proposed content summarization algorithm reduces the information retrieval
time from a logarithmic increase to a constant factor. Furthermore, the
message size is reduced significantly by applying the summarization technique.
For the user, a precision-recall metric allows defining the relation between the retrieval
time and the accuracy. The self-adaptive algorithm reduces the number of
messages needed from an exponential increase to a constant factor. At the same
time, the retrieval time is reduced to a constant factor under an increasing number
of nodes. Finally, the algorithm delivers the data with the required accuracy
adjusting the depth of the query according to the network conditions.

Information management requires new techniques that cope with the growing
amount of data and number of nodes in large-scale environments. Examples of such
environments are the decentralized infrastructures of Computational Grid and Cloud.
Large-scale applications need different kinds of aggregated information, such as
resource monitoring and economic information. The challenge of providing fast
and accurate information in large-scale environments arises from the distribution
of the information. One reason is that the information system has to deal with
the adaptability and failures of these environments.
A problem with very large applications such as peer-to-peer (P2P) systems
is the growing information retrieval time caused by the decentralization
of the data and the proneness to failure. However, many applications need
timely information provision. Moreover, some users and applications accept
inaccurate results if the information is delivered on time. In addition, the
growing network consumption creates another problem for the scalability of the
system. Using approximation techniques allows reducing the retrieval time
and the network consumption. However, the use of approximation techniques
decreases the accuracy of the results. Thus, the remaining problem is to offer a
trade-off that resolves the conflicting requirements of fast information retrieval,
accurate results and low messaging cost.
Our goal is to obtain a fully self-adaptive decision mechanism that offers
a trade-off among retrieval time, network consumption and accuracy of the
result. Self-adaptation allows distributed software to modify its behavior
according to changes in the operating environment. In large-scale information
systems that use hierarchical data aggregation, self-adaptation makes it possible
to control the approximation used for information retrieval and to reduce the
network consumption and the retrieval time. The main hypothesis of this thesis is
that approximation techniques can reduce the retrieval time and the network
consumption while guaranteeing an adequate accuracy defined by the user.
The presented research introduces a content summarization algorithm
for resource discovery in P2P content networks. After identifying how
summarization can improve the discovery process, we propose a metric that is
used to compare the accuracy and to offer a user-defined trade-off. Next,
we introduce a new algorithm that applies self-adaptation in order to satisfy
the accuracy requirements with a low message cost and a short delay. Based
on the user's priorities, the algorithm finds a trade-off automatically.
The quantitative analysis evaluates the presented algorithms with a simulator
to allow the evaluation of several thousands of nodes. The simulator is fed with
real data of a network topology and of the nodes' attributes. The content
summarization algorithm is evaluated with half a million resources and with
different query types. The qualitative analysis evaluates the integration of the
simulator's components in existing market frameworks for Computational
Grid and Cloud applications. In this way, the functionality implemented in the
simulator (such as the aggregation process and the query language) is verified by
its integration into prototypes.
The proposed content summarization algorithm reduces the information retrieval
time from a logarithmic increase to a constant factor. In addition, it
significantly reduces the message size. For the user, a
precision-recall metric allows defining the relation between the level of accuracy
and the information retrieval time. At the same time, the retrieval time is reduced
to a constant factor under an increasing number of nodes. Finally, the algorithm
delivers the data with the required accuracy and adjusts the depth of the query
according to the network conditions. The algorithms introduced are promising
candidates for information aggregation in future large-scale information
management systems.
Peer-to-peer update dissemination in browser-based networked virtual environments.
PhD Thesis
Networked Virtual Environments (NVEs) have always imposed strict requirements on
architectures for update dissemination (UD). Clients must maintain views that are as
synchronous and consistent as possible in order to achieve a level of user experience that
is tolerable for the user.
In recent times, the web browser has become a viable platform on which to deploy
these NVEs. Doing so, however, adds another layer of challenges. There is a distinct need
for systems that adapt to these constraints and exploit the characteristics of this new
context to achieve reliably high consistency between users for a range of use cases.
A promising approach is to carry forward the rich body of past research in peer-to-peer
(P2P) networks and apply this to the problem of UD in NVEs under the constraints of a
web browser. Making NVEs scalable through P2P networks is not a new concept; however,
previous work has always been either too specific to a certain kind of NVE, or has made
performance trade-offs that cannot work in a browser context in particular. Furthermore,
in previous work on P2P NVEs, UD has always taken a back seat to object
management and distributed neighbour selection. As a result, the evaluation of these
UD systems has been one-dimensional and overly simplistic.
In this work, we begin by surveying past UD solutions and evaluation methodologies.
We then capture NVE, browser, and network constraints, aided by the analysis of a rich
dataset of NVE network traces that we have collected, and draw out key observations
and challenges to develop the requirements for a feasible UD system. From there, we
illustrate the design and implementation of our P2P UD system for NVEs in great detail,
augmenting our system with novel architectural insights from the Software-Defined
Networking (SDN) space. Finally, we evaluate our system under a range of workloads,
test environments, and performance metrics to demonstrate that we have overcome these
challenges, as well as compare our method to other existing methods, which we have also
implemented and tested.
We hope that our contributions in research and resources (such as our taxonomies,
NVE analysis, UD system, browser library, workload datasets, and a benchmarking framework)
bring more structure as well as research and development opportunities to a relatively
niche sub-field.
Solving key design issues for massively multiplayer online games on peer-to-peer architectures
Massively Multiplayer Online Games (MMOGs) are increasing in both popularity and
scale on the Internet and are predominantly implemented by Client/Server architectures.
While such a classical approach to distributed system design offers many benefits, it suffers
from significant technical and commercial drawbacks, primarily reliability and scalability
costs. This realisation has sparked recent research interest in adapting MMOGs
to Peer-to-Peer (P2P) architectures.
This thesis identifies six key design issues to be addressed by P2P MMOGs, namely
interest management, event dissemination, task sharing, state persistency, cheating mitigation,
and incentive mechanisms. Design alternatives for each issue are systematically
compared, and their interrelationships discussed. How well representative P2P MMOG
architectures fulfil the design criteria is also evaluated. It is argued that although P2P
MMOG architectures are developing rapidly, their support for task sharing and incentive
mechanisms still needs to be improved.
The design of a novel framework for P2P MMOGs, Mediator, is presented. It employs a
self-organising super-peer network over a P2P overlay infrastructure, and addresses the
six design issues in an integrated system. The Mediator framework is extensible, as it
supports flexible policy plug-ins and can accommodate the introduction of new super-peer
roles. Key components of this framework have been implemented and evaluated
with a simulated P2P MMOG.
As the Mediator framework relies on super-peers for computational and administrative
tasks, membership management is crucial, e.g. to allow the system to recover from
super-peer failures. A new technology for this, namely Membership-Aware Multicast
with Bushiness Optimisation (MAMBO), has been designed, implemented and evaluated.
It reuses the communication structure of a tree-based application-level multicast
to track group membership efficiently. Evaluation of a demonstration application shows
that MAMBO is able to quickly detect and handle peers joining and leaving. Compared
to a conventional supervision architecture, MAMBO is more scalable, and yet incurs
lower communication overhead. Besides MMOGs, MAMBO is suitable for other P2P
applications, such as collaborative computing and multimedia streaming.
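The idea of reusing a multicast tree to track group membership can be sketched as below: each node's view is the set of peers in its subtree, so the root's view is the whole group. This is only an illustration under stated assumptions. The dictionary layout, the function names and the "re-attach orphans to the parent" repair rule are hypothetical and are not taken from MAMBO, and removal of the root is not handled.

```python
from typing import Dict, List, Set


def subtree_members(children: Dict[str, List[str]], node: str) -> Set[str]:
    """Membership known at `node`: itself plus everything reported by its children."""
    members = {node}
    for child in children.get(node, []):
        members |= subtree_members(children, child)
    return members


def remove_peer(children: Dict[str, List[str]], node: str) -> Dict[str, List[str]]:
    """Return a new tree in which a departed (non-root) peer's children
    are re-attached to its former parent (assumed repair rule)."""
    rebuilt: Dict[str, List[str]] = {}
    for parent, kids in children.items():
        if parent == node:
            continue
        new_kids: List[str] = []
        for kid in kids:
            if kid == node:
                new_kids.extend(children.get(node, []))
            else:
                new_kids.append(kid)
        rebuilt[parent] = new_kids
    return rebuilt


# Hypothetical multicast tree: parent -> list of children.
tree = {"root": ["a", "b"], "a": ["c", "d"], "b": ["e"]}
before = subtree_members(tree, "root")
tree_after = remove_peer(tree, "a")
after = subtree_members(tree_after, "root")
print(sorted(before), sorted(after))
```

Because membership information travels along links that the multicast already maintains, no separate supervision structure is needed in this sketch, which is the intuition behind the lower overhead claimed in the abstract.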
This thesis also presents the design, implementation and evaluation of a novel task
mapping infrastructure for heterogeneous P2P environments, Deadline-Driven Auctions
(DDA). DDA is primarily designed to support NPC host allocation in P2P MMOGs, and
specifically in the Mediator framework. However, it can also support the sharing of computational
and interactive tasks with various deadlines in general P2P applications. Experimental
and analytical results demonstrate that DDA efficiently allocates computing
resources for large numbers of real-time NPC tasks in a simulated P2P MMOG with approximately
1000 players. Furthermore, DDA supports gaming interactivity by keeping
the communication latency among NPC hosts and ordinary players low. It also supports
flexible matchmaking policies, and can motivate application participants to contribute
resources to the system.
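A deadline-aware auction for mapping tasks to heterogeneous hosts can be illustrated as follows. This is a minimal sketch and not the DDA protocol: the bid (a host's estimated finish time), the earliest-deadline-first auction order, and the `Host` and `Task` fields are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Host:
    name: str
    speed: float              # work units processed per time unit
    busy_until: float = 0.0   # time at which the host becomes free


@dataclass
class Task:
    name: str
    work: float
    deadline: float


def run_auctions(tasks: List[Task], hosts: List[Host]) -> Dict[str, Optional[str]]:
    """Auction tasks in deadline order; each host bids its estimated finish
    time and the earliest bid that meets the deadline wins."""
    assignment: Dict[str, Optional[str]] = {}
    for task in sorted(tasks, key=lambda t: t.deadline):
        bids = [(h.busy_until + task.work / h.speed, h) for h in hosts]
        feasible = [bid for bid in bids if bid[0] <= task.deadline]
        if not feasible:
            assignment[task.name] = None  # no host can meet the deadline
            continue
        finish, winner = min(feasible, key=lambda bid: bid[0])
        winner.busy_until = finish
        assignment[task.name] = winner.name
    return assignment


hosts = [Host("fast", speed=2.0), Host("slow", speed=1.0)]
tasks = [
    Task("t1", work=4.0, deadline=3.0),
    Task("t2", work=2.0, deadline=3.0),
    Task("t3", work=4.0, deadline=4.0),
    Task("t4", work=10.0, deadline=5.0),
]
assignment = run_auctions(tasks, hosts)
print(assignment)
```

In this toy run the slower host still wins a task once the faster one is busy, which shows how bids that include current load spread work across heterogeneous peers; incentives and latency-aware matchmaking, which the abstract also mentions, are outside this sketch.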