High performance subgraph mining in molecular compounds
Structured data represented in the form of graphs arises in several fields of
science, and the growing amount of available data makes distributed graph
mining techniques particularly relevant. In this paper, we present a
distributed approach to the frequent subgraph mining problem to discover
interesting patterns in molecular compounds. The problem is characterized by a
highly irregular search tree, for which no reliable workload prediction is
available. We describe the three main aspects of the proposed distributed
algorithm, namely a dynamic partitioning of the search space, a distribution
process based on a peer-to-peer communication framework, and a novel
receiver-initiated load-balancing algorithm. The effectiveness of the
distributed method has been evaluated on the well-known National Cancer
Institute's HIV-screening dataset, where the approach attains close-to-linear
speedup in a network of workstations.
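As a rough illustration of the receiver-initiated idea mentioned in the abstract above, the following Python sketch simulates idle workers asking a busy peer for part of its pending search-tree nodes. It is not the paper's algorithm: the tree shape (`children`), the donor-selection rule (pick the most loaded peer), and the half-split policy are all assumptions made for this example.

```python
from collections import deque


def children(node):
    """Hypothetical irregular search tree; a node is a (depth, key) pair."""
    depth, key = node
    if depth >= 6:
        return []
    # The branching factor varies between 0 and 3, so subtree sizes are uneven.
    fanout = (key * 7 + depth) % 4
    return [(depth + 1, key * 4 + i + 1) for i in range(fanout)]


def count_nodes(node):
    """Sequential reference: size of the subtree rooted at `node`."""
    return 1 + sum(count_nodes(c) for c in children(node))


def run(num_workers, root=(0, 1)):
    """Simulate workers in lock-step; return (nodes processed per worker, transfers)."""
    queues = [deque() for _ in range(num_workers)]
    queues[0].append(root)  # all work starts on worker 0
    processed = [0] * num_workers
    transfers = 0
    while any(queues):
        for w, q in enumerate(queues):
            if not q:
                # Receiver-initiated: the idle worker is the one that asks for work.
                donor = max(range(num_workers), key=lambda i: len(queues[i]))
                if len(queues[donor]) > 1:
                    # Assumed policy: the donor hands over half of its oldest nodes.
                    for _ in range(len(queues[donor]) // 2):
                        q.append(queues[donor].popleft())
                    transfers += 1
            if q:
                node = q.pop()  # expand the newest node first (depth-first locally)
                processed[w] += 1
                q.extend(children(node))
    return processed, transfers
```

Whatever the number of workers, the per-worker counts returned by `run` sum to `count_nodes((0, 1))`, since every node is expanded exactly once; only the split of work between workers changes.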
Pando: Personal Volunteer Computing in Browsers
The large penetration and continued growth in ownership of personal
electronic devices represent a freely available and largely untapped source of
computing power. To leverage it, we present Pando, a new volunteer computing
tool based on a declarative concurrent programming model and implemented using
JavaScript, WebRTC, and WebSockets. This tool enables a dynamically varying
number of failure-prone personal devices contributed by volunteers to
parallelize the application of a function on a stream of values, by using the
devices' browsers. We show that Pando can provide throughput improvements
compared to a single personal device, on a variety of compute-bound
applications including animation rendering and image processing. We also show
the flexibility of our approach by deploying Pando on personal devices
connected over a local network, on Grid5000, a France-wide computing grid in a
virtual private network, and on seven PlanetLab nodes distributed in a wide
area network over Europe.
Comment: 14 pages, 12 figures, 2 tables
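The programming model described in the Pando abstract above, applying one function to a stream of values on failure-prone workers, can be sketched in Python as follows. This is not Pando's API (Pando itself is implemented in JavaScript); `parallel_map`, `unreliable`, and the retry policy are hypothetical names and choices for illustration, and threads stand in for volunteer devices.

```python
from concurrent.futures import ThreadPoolExecutor


def unreliable(fn, fail_on):
    """Wrap fn so the first attempt on each input in `fail_on` raises,
    imitating a volunteer device that disconnects mid-task."""
    already_failed = set()

    def wrapped(x):
        if x in fail_on and x not in already_failed:
            already_failed.add(x)
            raise ConnectionError(f"worker lost while processing {x!r}")
        return fn(x)

    return wrapped


def parallel_map(fn, values, workers=4, max_retries=3):
    """Yield fn(v) for each v in `values`, in input order, retrying failures."""

    def attempt(x):
        for _ in range(max_retries):
            try:
                return fn(x)
            except ConnectionError:
                continue  # resubmit the same value, as if to another device
        raise RuntimeError(f"giving up on {x!r} after {max_retries} attempts")

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the order of the inputs.
        yield from pool.map(attempt, values)


squares = list(parallel_map(unreliable(lambda x: x * x, {2, 5}), range(8)))
```

Results come back in input order even though two of the inputs fail on their first attempt, which is the property a streaming map over unreliable workers needs to preserve.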
Storage and Search in Dynamic Peer-to-Peer Networks
We study robust and efficient distributed algorithms for searching, storing,
and maintaining data in dynamic Peer-to-Peer (P2P) networks. P2P networks are
highly dynamic networks that experience heavy node churn (i.e., nodes join and
leave the network continuously over time). Our goal is to guarantee, despite
high node churn rate, that a large number of nodes in the network can store,
retrieve, and maintain a large number of data items. Our main contributions are
fast randomized distributed algorithms that guarantee the above with high
probability (whp) even under high adversarial churn:
1. A randomized distributed search algorithm that (whp) guarantees that
searches from as many as $n - o(n)$ nodes ($n$ is the stable network size)
succeed in $O(\log n)$ rounds despite $O(n/\log^{1+\delta} n)$ churn per
round, for any small constant $\delta > 0$. We assume that the churn is
controlled by an oblivious adversary (that has complete knowledge and control
of what nodes join and leave and at what time, but is oblivious to the random
choices made by the algorithm).
2. A storage and maintenance algorithm that guarantees (whp) data items can
be efficiently stored (with only $\Theta(\log n)$ copies of each data item)
and maintained in a dynamic P2P network with churn rate up to
$O(n/\log^{1+\delta} n)$ per round. Our search algorithm together with our
storage and maintenance algorithm guarantees that as many as $n - o(n)$ nodes
can efficiently store, maintain, and search even under
$O(n/\log^{1+\delta} n)$ churn per round. Our algorithms require only
polylogarithmic in $n$ bits to be processed and sent (per round) by each node.
To the best of our knowledge, our algorithms are the first-known,
fully-distributed storage and search algorithms that provably work under highly
dynamic settings (i.e., high churn rates per step).
Comment: to appear at SPAA 201
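To give a feel for why replication plus maintenance helps under churn, here is a toy Python simulation. It is not the algorithm from the paper: the uniform-random churn model, the re-replication rule, and every parameter value are assumptions chosen for illustration only.

```python
import random


def survives(n, copies, churn, rounds, maintain, seed=0):
    """Return True if at least one copy of a data item is still held after `rounds`.

    n        -- number of node slots (the network size stays fixed)
    copies   -- number of replicas placed initially
    churn    -- number of slots whose node is replaced in each round
    maintain -- if True, surviving holders restore the replica count every round
    """
    rng = random.Random(seed)
    holders = set(rng.sample(range(n), copies))
    for _ in range(rounds):
        # Churn is chosen without inspecting `holders`; a node that leaves
        # takes its copy with it, and its replacement starts out empty.
        replaced = set(rng.sample(range(n), churn))
        holders -= replaced
        if not holders:
            return False  # every replica was lost in the same round
        if maintain:
            # Assumed maintenance rule: re-replicate onto random nodes.
            while len(holders) < copies:
                holders.add(rng.randrange(n))
    return True


kept = survives(n=1000, copies=20, churn=50, rounds=100, maintain=True)
lost = not survives(n=1000, copies=1, churn=50, rounds=500, maintain=False)
```

With 20 maintained replicas, losing the item requires all of them to be hit in a single round, which is vanishingly unlikely at this churn level; a single unmaintained copy, by contrast, is almost certain to disappear over 500 rounds.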