5,644 research outputs found
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also helps to provide an easy way for new practitioners to understand
this complex area of research.Comment: 46 pages, 16 figures, Technical Repor
A Multilevel Approach to Topology-Aware Collective Operations in Computational Grids
The efficient implementation of collective communiction operations has
received much attention. Initial efforts produced "optimal" trees based on
network communication models that assumed equal point-to-point latencies
between any two processes. This assumption is violated in most practical
settings, however, particularly in heterogeneous systems such as clusters of
SMPs and wide-area "computational Grids," with the result that collective
operations perform suboptimally. In response, more recent work has focused on
creating topology-aware trees for collective operations that minimize
communication across slower channels (e.g., a wide-area network). While these
efforts have significant communication benefits, they all limit their view of
the network to only two layers. We present a strategy based upon a multilayer
view of the network. By creating multilevel topology-aware trees we take
advantage of communication cost differences at every level in the network. We
used this strategy to implement topology-aware versions of several MPI
collective operations in MPICH-G2, the Globus Toolkit[tm]-enabled version of
the popular MPICH implementation of the MPI standard. Using information about
topology provided by MPICH-G2, we construct these multilevel topology-aware
trees automatically during execution. We present results demonstrating the
advantages of our multilevel approach by comparing it to the default
(topology-unaware) implementation provided by MPICH and a topology-aware
two-layer implementation.Comment: 16 pages, 8 figure
Embedding of Virtual Network Requests over Static Wireless Multihop Networks
Network virtualization is a technology of running multiple heterogeneous
network architecture on a shared substrate network. One of the crucial
components in network virtualization is virtual network embedding, which
provides a way to allocate physical network resources (CPU and link bandwidth)
to virtual network requests. Despite significant research efforts on virtual
network embedding in wired and cellular networks, little attention has been
paid to that in wireless multi-hop networks, which is becoming more important
due to its rapid growth and the need to share these networks among different
business sectors and users. In this paper, we first study the root causes of
new challenges of virtual network embedding in wireless multi-hop networks, and
propose a new embedding algorithm that efficiently uses the resources of the
physical substrate network. We examine our algorithm's performance through
extensive simulations under various scenarios. Due to lack of competitive
algorithms, we compare the proposed algorithm to five other algorithms, mainly
borrowed from wired embedding or artificially made by us, partially with or
without the key algorithmic ideas to assess their impacts.Comment: 22 page
QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment
Previous studies have reported that common dense linear algebra operations do
not achieve speed up by using multiple geographical sites of a computational
grid. Because such operations are the building blocks of most scientific
applications, conventional supercomputers are still strongly predominant in
high-performance computing and the use of grids for speeding up large-scale
scientific problems is limited to applications exhibiting parallelism at a
higher level. We have identified two performance bottlenecks in the distributed
memory algorithms implemented in ScaLAPACK, a state-of-the-art dense linear
algebra library. First, because ScaLAPACK assumes a homogeneous communication
network, the implementations of ScaLAPACK algorithms lack locality in their
communication pattern. Second, the number of messages sent in the ScaLAPACK
algorithms is significantly greater than other algorithms that trade flops for
communication. In this paper, we present a new approach for computing a QR
factorization -- one of the main dense linear algebra kernels -- of tall and
skinny matrices in a grid computing environment that overcomes these two
bottlenecks. Our contribution is to articulate a recently proposed algorithm
(Communication Avoiding QR) with a topology-aware middleware (QCG-OMPI) in
order to confine intensive communications (ScaLAPACK calls) within the
different geographical sites. An experimental study conducted on the Grid'5000
platform shows that the resulting performance increases linearly with the
number of geographical sites on large-scale problems (and is in particular
consistently higher than ScaLAPACK's).Comment: Accepted at IPDPS10. (IEEE International Parallel & Distributed
Processing Symposium 2010 in Atlanta, GA, USA.
Multi-Path Alpha-Fair Resource Allocation at Scale in Distributed Software Defined Networks
The performance of computer networks relies on how bandwidth is shared among
different flows. Fair resource allocation is a challenging problem particularly
when the flows evolve over time. To address this issue, bandwidth sharing
techniques that quickly react to the traffic fluctuations are of interest,
especially in large scale settings with hundreds of nodes and thousands of
flows. In this context, we propose a distributed algorithm based on the
Alternating Direction Method of Multipliers (ADMM) that tackles the multi-path
fair resource allocation problem in a distributed SDN control architecture. Our
ADMM-based algorithm continuously generates a sequence of resource allocation
solutions converging to the fair allocation while always remaining feasible, a
property that standard primal-dual decomposition methods often lack. Thanks to
the distribution of all computer intensive operations, we demonstrate that we
can handle large instances at scale
- …