18,098 research outputs found
DOH: A Content Delivery Peer-to-Peer Network
Many SMEs and non-pro¯t organizations su®er when their Web
servers become unavailable due to °ash crowd e®ects when their web site
becomes popular. One of the solutions to the °ash-crowd problem is to place
the web site on a scalable CDN (Content Delivery Network) that replicates
the content and distributes the load in order to improve its response time.
In this paper, we present our approach to building a scalable Web Hosting
environment as a CDN on top of a structured peer-to-peer system of collaborative
web-servers integrated to share the load and to improve the overall
system performance, scalability, availability and robustness. Unlike clusterbased
solutions, it can run on heterogeneous hardware, over geographically
dispersed areas. To validate and evaluate our approach, we have developed a
system prototype called DOH (DKS Organized Hosting) that is a CDN implemented
on top of the DKS (Distributed K-nary Search) structured P2P
system with DHT (Distributed Hash table) functionality [9]. The prototype
is implemented in Java, using the DKS middleware, the Jetty web-server, and
a modi¯ed JavaFTP server. The proposed design of CDN has been evaluated
by simulation and by evaluation experiments on the prototype
The essence of P2P: A reference architecture for overlay networks
The success of the P2P idea has created a huge diversity
of approaches, among which overlay networks, for example,
Gnutella, Kazaa, Chord, Pastry, Tapestry, P-Grid, or DKS,
have received specific attention from both developers and
researchers. A wide variety of algorithms, data structures,
and architectures have been proposed. The terminologies
and abstractions used, however, have become quite inconsistent since the P2P paradigm has attracted people from many different communities, e.g., networking, databases, distributed systems, graph theory, complexity theory, biology, etc. In this paper we propose a reference model for overlay networks which is capable of modeling different approaches in this domain in a generic manner. It is intended to allow researchers and users to assess the properties of concrete systems, to establish a common vocabulary for scientific discussion, to facilitate the qualitative comparison of the systems, and to serve as the basis for defining a standardized API to make overlay networks interoperable
Building a P2P RDF Store for Edge Devices
The Semantic Web technologies have been used in the Internet of Things (IoT)
to facilitate data interoperability and address data heterogeneity issues. The
Resource Description Framework (RDF) model is employed in the integration of
IoT data, with RDF engines serving as gateways for semantic integration.
However, storing and querying RDF data obtained from distributed sources across
a dynamic network of edge devices presents a challenging task. The distributed
nature of the edge shares similarities with Peer-to-Peer (P2P) systems. These
similarities include attributes like node heterogeneity, limited availability,
and resources. The nodes primarily undertake tasks related to data storage and
processing. Therefore, the P2P models appear to present an attractive approach
for constructing distributed RDF stores. Based on P-Grid, a data indexing
mechanism for load balancing and range query processing in P2P systems, this
paper proposes a design for storing and sharing RDF data on P2P networks of
low-cost edge devices. Our design aims to integrate both P-Grid and an
edge-based RDF storage solution, RDF4Led for building an P2P RDF engine. This
integration can maintain RDF data access and query processing while scaling
with increasing data and network size. We demonstrated the scaling behavior of
our implementation on a P2P network, involving up to 16 nodes of Raspberry Pi 4
devices.Comment: Accepted to IoT Conference 202
Dynamic load balancing for the distributed mining of molecular structures
In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of
methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the
past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially
render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to
discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no
reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic
partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiverinitiated
load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer
Institute’s HIV-screening data set, where we were able to show close-to linear speedup in a network of workstations. The proposed
approach also allows for dynamic resource aggregation in a non dedicated computational environment. These features make it suitable
for large-scale, multi-domain, heterogeneous environments, such as computational grids
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also helps to provide an easy way for new practitioners to understand
this complex area of research.Comment: 46 pages, 16 figures, Technical Repor
- …