Efficient and Flexible Search in Large Scale Distributed Systems
Peer-to-peer (P2P) technology has triggered a wide range of
distributed systems beyond simple file-sharing. Distributed XML
databases, distributed computing, server-less web publishing, and
networked resource/service sharing are only a few examples. Despite
the diversity in applications, these systems share a common
problem regarding searching and discovery of information. This
commonality stems from the transitory node population and
volatile information content in the participating nodes. In such
a dynamic environment, users are not expected to have exact
information about the available objects in the system. Rather,
queries are based on partial information, which requires the
search mechanism to be flexible. On the other hand, to scale with
network size, the search mechanism must be bandwidth
efficient.
Since the advent of P2P technology, experts from industry and
academia have proposed a number of search techniques, none of
which provides a satisfactory solution to the conflicting
requirements of search efficiency and flexibility. Structured
search techniques, mostly Distributed Hash Table (DHT)-based, are
bandwidth efficient, while semi-structured and unstructured
techniques are flexible. Neither achieves both ends.
This thesis defines the Distributed Pattern Matching (DPM)
problem. The DPM problem is to discover a pattern (i.e., a bit-vector)
using any subset of its 1-bits, under the assumption that the
patterns are distributed across a large population of networked
nodes. The search problem in many distributed systems can be reduced
to the DPM problem.
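The matching rule in the DPM definition can be illustrated with a minimal
Python sketch. It assumes patterns are stored as integers used as bit-vectors;
the names matches, search, and nodes are hypothetical and do not come from the
thesis. The search function is a brute-force scan of every node, shown only as
a reference for what counts as a match, not as the DPMS or Plexus mechanism.

```python
def matches(pattern: int, query: int) -> bool:
    """Return True when every 1-bit of the query is also set in the pattern."""
    return (pattern & query) == query


def search(nodes: dict[str, list[int]], query: int) -> list[tuple[str, int]]:
    """Brute-force reference: visit every node and test every stored pattern."""
    return [
        (node, pattern)
        for node, patterns in nodes.items()
        for pattern in patterns
        if matches(pattern, query)
    ]


# Hypothetical data: two nodes, each holding two 6-bit patterns.
nodes = {
    "node_a": [0b101101, 0b010010],
    "node_b": [0b111000, 0b001101],
}

# The query sets bits 0 and 3, a subset of the 1-bits of two stored patterns.
hits = search(nodes, 0b001001)
```

Here hits contains the patterns 0b101101 on node_a and 0b001101 on node_b,
since both have bits 0 and 3 set. A distributed search mechanism aims to find
the same matches without contacting every node.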
This thesis also presents two distinct search mechanisms, named
Distributed Pattern Matching System (DPMS) and Plexus, for solving
the DPM problem. DPMS is a semi-structured, hierarchical
architecture aiming to discover a predefined number of matches by
visiting a small number of nodes. Plexus, on the other hand, is a
structured search mechanism based on the theory of Error
Correcting Codes (ECC). The design goal behind Plexus is to
discover all the matches by visiting a reasonable number of nodes.
Gunrock: GPU Graph Analytics
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs, have presented two
significant challenges to developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high-performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We characterize the performance of
various optimization strategies and evaluate Gunrock's overall performance
across different GPU architectures and a wide range of graph primitives, from
traversal-based and ranking algorithms to triangle counting and
bipartite-graph-based algorithms. The results show that on a single GPU,
Gunrock has on average at least an order of magnitude speedup over Boost and
PowerGraph, comparable performance to the fastest GPU hardwired primitives and
CPU shared-memory graph libraries such as Ligra and Galois, and better
performance than any other GPU high-level graph library.
Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing
(TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance
Graph Processing Library on the GPU"
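The frontier-centric, bulk-synchronous style described in this abstract can be
sketched with a breadth-first search. This is a sequential CPU illustration in
Python, not Gunrock code: the function names advance, filter_unvisited, and
bfs, and the adjacency-list graph, are assumptions made for this example. Each
loop iteration is one bulk step that turns the current vertex frontier into the
next one.

```python
def advance(graph, frontier):
    """Expand the frontier: gather every neighbor of every frontier vertex."""
    return [v for u in frontier for v in graph[u]]


def filter_unvisited(candidates, depth, level):
    """Keep each unvisited vertex once and record the level it was reached at."""
    next_frontier = []
    for v in candidates:
        if v not in depth:
            depth[v] = level
            next_frontier.append(v)
    return next_frontier


def bfs(graph, source):
    """Run BFS as a sequence of frontier-to-frontier steps."""
    depth = {source: 0}
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        frontier = filter_unvisited(advance(graph, frontier), depth, level)
    return depth


# Hypothetical directed graph as an adjacency list.
graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
depths = bfs(graph, 0)
```

On a GPU, the work inside each step would run in parallel over the frontier,
with a synchronization point between steps; the sketch keeps only the
data-centric structure of the computation.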
Gunrock: A High-Performance Graph Processing Library on the GPU
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs have been two
significant challenges for developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high-performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We evaluate Gunrock on five key graph
primitives and show that Gunrock has on average at least an order of magnitude
speedup over Boost and PowerGraph, comparable performance to the fastest GPU
hardwired primitives, and better performance than any other GPU high-level
graph library.
Comment: 14 pages, accepted by PPoPP'16 (removed the text repetition in the
previous version v5)