DART-MPI: An MPI-based Implementation of a PGAS Runtime System
A Partitioned Global Address Space (PGAS) approach treats a distributed
system as if its memory were shared at a global level. Given such a global view
of memory, the user may program applications much as they would for a
shared-memory system. This greatly simplifies the task of developing parallel
applications, because no explicit communication has to be specified in the
program for data exchange between different computing nodes. In this paper we
present DART, a runtime environment that implements the PGAS paradigm on
large-scale high-performance computing clusters. A specific feature of our
implementation is the use of the one-sided communication of the Message Passing
Interface (MPI) version 3 (i.e., MPI-3) as the underlying communication
substrate. We evaluated the performance of the implementation with several
low-level kernels in order to determine overheads and limitations in comparison
to the underlying MPI-3.
Comment: 11 pages, International Conference on Partitioned Global Address Space Programming Models (PGAS14)
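To make the substrate concrete, here is a minimal sketch assuming only standard MPI-3 (this is not DART's own API): each process exposes a window of memory, and a neighbor reads it with a one-sided MPI_Get under passive-target synchronization, with no matching receive on the target.

    /* Minimal MPI-3 one-sided sketch: the kind of primitive a PGAS
       runtime such as DART can build on (illustrative only). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int local = rank;                 /* memory exposed to all ranks */
        MPI_Win win;
        MPI_Win_create(&local, sizeof(int), sizeof(int),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        int right = (rank + 1) % size, remote = -1;
        MPI_Win_lock(MPI_LOCK_SHARED, right, 0, win);
        MPI_Get(&remote, 1, MPI_INT, right, 0, 1, MPI_INT, win);
        MPI_Win_unlock(right, win);       /* one-sided: target takes no part */

        printf("rank %d read %d from rank %d\n", rank, remote, right);
        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }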
An OpenSHMEM Implementation for the Adapteva Epiphany Coprocessor
This paper reports the implementation and performance evaluation of the
OpenSHMEM 1.3 specification for the Adapteva Epiphany architecture within the
Parallella single-board computer. The Epiphany architecture exhibits massive
many-core scalability with a physically compact 2D array of RISC CPU cores and
a fast network-on-chip (NoC). While fully capable of MPMD execution, the
physical topology and memory-mapped capabilities of the core and network
translate well to Partitioned Global Address Space (PGAS) programming models
and SPMD execution with SHMEM.
Comment: 14 pages, 9 figures, OpenSHMEM 2016: Third workshop on OpenSHMEM and Related Technologies
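For comparison with the MPI sketch above, here is a minimal OpenSHMEM 1.3 program in the same SPMD style (illustrative, not code from the paper): every PE writes its ID into a symmetric variable on its right neighbor with a one-sided put.

    /* Minimal OpenSHMEM 1.3 SPMD sketch (illustrative only). */
    #include <shmem.h>
    #include <stdio.h>

    int main(void)
    {
        static int from_left = -1;    /* symmetric: same address on every PE */

        shmem_init();
        int me = shmem_my_pe();
        int npes = shmem_n_pes();

        /* one-sided put into the right neighbor's copy of from_left */
        shmem_int_put(&from_left, &me, 1, (me + 1) % npes);
        shmem_barrier_all();          /* ensure all puts are visible */

        printf("PE %d received %d\n", me, from_left);
        shmem_finalize();
        return 0;
    }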
Exploring Scientific Application Performance Using Large Scale Object Storage
One of the major performance and scalability bottlenecks in large scientific
applications is parallel reading from and writing to supercomputer I/O systems.
The use of parallel file systems, and the POSIX consistency requirements that
all the traditional HPC parallel I/O interfaces adhere to, limit the
scalability of scientific applications. Object storage is widely used in cloud
computing and is increasingly proposed for HPC workloads to improve the current
scalability and performance of I/O in scientific applications. While object
storage is a promising technology, it is still unclear how scientific
applications will use it and what the main performance benefits will be. This
work addresses these questions by emulating an object store used by a
traditional scientific application and evaluating the potential performance
benefits. We show that scientific applications can benefit from the use of
object storage at large scale.
Comment: Preprint submitted to the WOPSSS workshop at ISC 201
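As a rough illustration of the interface shift involved (the names here are hypothetical, not the emulation used in the paper): an object store replaces POSIX streams and their byte-range consistency with whole-object put/get by key, which can be emulated on top of ordinary files.

    /* Hypothetical put/get-style object interface (names are illustrative):
       whole objects are written and read by key, with no POSIX byte-range
       consistency to maintain. Emulated here with one file per object. */
    #include <stdio.h>
    #include <string.h>

    static int obj_put(const char *key, const void *buf, size_t len)
    {
        FILE *f = fopen(key, "wb");          /* one file per object key */
        if (!f) return -1;
        size_t n = fwrite(buf, 1, len, f);
        fclose(f);
        return n == len ? 0 : -1;
    }

    static long obj_get(const char *key, void *buf, size_t maxlen)
    {
        FILE *f = fopen(key, "rb");
        if (!f) return -1;
        long n = (long)fread(buf, 1, maxlen, f);
        fclose(f);
        return n;                            /* bytes read, or -1 */
    }

    int main(void)
    {
        const char *data = "checkpoint, step 42";
        char back[64] = {0};

        if (obj_put("rank0.step42", data, strlen(data)) != 0) return 1;
        if (obj_get("rank0.step42", back, sizeof back - 1) < 0) return 1;
        printf("read back: %s\n", back);
        return 0;
    }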
Free and Open Source Software in Municipal Procurement: The Challenges and Benefits of Cooperation
The use of free and open source software by municipal governments is the exception rather than the rule. This is due to a variety of factors, including the failure of many municipal procurement policies to take into account the benefits of free software, free software vendors' second-to-market status, and a lack of established free and open source software vendors in niche markets. With feasible policy shifts to improve city operations, including building upon open standards and engaging with free software communities, municipalities may be able to better leverage free and open source software and realize fully the advantages that stem from open software development.
Memory-built-in quantum teleportation with photonic and atomic qubits
The combination of quantum teleportation and quantum memory of photonic
qubits is essential for future implementations of large-scale quantum
communication and measurement-based quantum computation. Both steps have been
achieved separately in many proof-of-principle experiments, but the
demonstration of memory-built-in teleportation of photonic qubits remains an
experimental challenge. Here, we demonstrate teleportation between photonic
(flying) and atomic (stationary) qubits. In our experiment, an unknown
polarization state of a single photon is teleported over 7 m onto a remote
atomic qubit that also serves as a quantum memory. The teleported state can be
stored and successfully read out for up to 8 microseconds. Besides being of
fundamental interest, teleportation between photonic and atomic qubits with the
direct inclusion of a readable quantum memory represents a step towards an
efficient and scalable quantum network.
Comment: 19 pages, 3 figures, 1 table
SKIRT: hybrid parallelization of radiative transfer simulations
We describe the design, implementation and performance of the new hybrid
parallelization scheme in our Monte Carlo radiative transfer code SKIRT, which
has been used extensively for modeling the continuum radiation of dusty
astrophysical systems including late-type galaxies and dusty tori. The hybrid
scheme combines distributed memory parallelization, using the standard Message
Passing Interface (MPI) to communicate between processes, and shared memory
parallelization, providing multiple execution threads within each process to
avoid duplication of data structures. The synchronization between multiple
threads is accomplished through atomic operations without high-level locking
(also called lock-free programming). This improves the scaling behavior of the
code and substantially simplifies the implementation of the hybrid scheme. The
result is an extremely flexible solution that adjusts to the number of
available nodes, processors and memory, and consequently performs well on a
wide variety of computing architectures.
Comment: 21 pages, 20 figures
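A minimal sketch of this hybrid pattern in C (SKIRT itself is written in C++; this is not its code): MPI processes span nodes, threads within each process share data structures, and a C11 atomic counter distributes work units lock-free, with no mutex.

    /* Hybrid MPI + threads sketch with a lock-free work counter. */
    #include <mpi.h>
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define NPACKETS 1000000L

    static atomic_long next_packet;          /* shared among threads */

    static void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            /* lock-free claim of the next work unit via atomic add */
            long i = atomic_fetch_add(&next_packet, 1);
            if (i >= NPACKETS) break;
            /* ... process work unit i (e.g. trace a photon packet) ... */
        }
        return NULL;
    }

    int main(int argc, char **argv)
    {
        int provided;
        /* only the main thread calls MPI here, so FUNNELED suffices */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

        pthread_t tid[NTHREADS];
        for (int t = 0; t < NTHREADS; t++)
            pthread_create(&tid[t], NULL, worker, NULL);
        for (int t = 0; t < NTHREADS; t++)
            pthread_join(tid[t], NULL);

        MPI_Finalize();
        return 0;
    }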
Design and Evaluation of a Collective IO Model for Loosely Coupled Petascale Programming
Loosely coupled programming is a powerful paradigm for rapidly creating
higher-level applications from scientific programs on petascale systems,
typically using scripting languages. This paradigm is a form of many-task
computing (MTC) which focuses on the passing of data between programs as
ordinary files rather than messages. While it has the significant benefits of
decoupling producer and consumer and allowing existing application programs to
be executed in parallel with no recoding, its typical implementation using
shared file systems places a high performance burden on the overall system and
on the user who will analyze and consume the downstream data. Previous efforts
have achieved great speedups with loosely coupled programs, but have done so
with careful manual tuning of all shared file system access. In this work, we
evaluate a prototype collective IO model for file-based MTC. The model enables
efficient and easy distribution of input data files to computing nodes and
gathering of output results from them. It eliminates the need for such manual
tuning and makes the programming of large-scale clusters using a loosely
coupled model easier. Our approach, inspired by in-memory approaches to
collective operations for parallel programming, builds on fast local file
systems to provide high-speed local file caches for parallel scripts, uses a
broadcast approach to handle distribution of common input data, and uses
efficient scatter/gather and caching techniques for input and output. We
describe the design of the prototype model, its implementation on the Blue
Gene/P supercomputer, and present preliminary measurements of its performance
on synthetic benchmarks and on a large-scale molecular dynamics application.
Comment: IEEE Many-Task Computing on Grids and Supercomputers (MTAGS08), 2008
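The broadcast part of the model can be sketched as follows (the paths and structure are assumptions for illustration, not the Blue Gene/P implementation): rank 0 touches the shared file system once, the common input is fanned out with MPI_Bcast, and every rank stages a copy into its fast node-local cache for its tasks to open.

    /* Illustrative sketch of broadcast-based input staging. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int len = 0;                    /* assume input fits an int count */
        char *buf = NULL;

        if (rank == 0) {                /* single touch of the shared FS */
            FILE *f = fopen("/shared/input.dat", "rb");
            if (!f) MPI_Abort(MPI_COMM_WORLD, 1);
            fseek(f, 0, SEEK_END);
            len = (int)ftell(f);
            rewind(f);
            buf = malloc(len);
            if (fread(buf, 1, len, f) != (size_t)len)
                MPI_Abort(MPI_COMM_WORLD, 1);
            fclose(f);
        }

        MPI_Bcast(&len, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (rank != 0) buf = malloc(len);
        MPI_Bcast(buf, len, MPI_BYTE, 0, MPI_COMM_WORLD);

        FILE *c = fopen("/tmp/input.dat", "wb");   /* node-local cache */
        if (c) { fwrite(buf, 1, len, c); fclose(c); }

        free(buf);
        MPI_Finalize();
        return 0;
    }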
An Overview of Collective Bargaining in the United States
[Excerpt] American history reflects a long cycle of trade union decline and growth. Analysts routinely predict the death of the labor movement (Yeselson 2012). Heralds of labor's demise often argue that unions were needed in the past, but modern, enlightened management and the need for economic competitiveness make them obsolete (Troy 1999). But then, workers fed up with employers' exploitation decide to find new ways to defend themselves.
History does not repeat itself, and conditions now are not the same as those spurring the great organizing drives of the 1930s and '40s. Still, American workers have shown deep resourcefulness over long cycles of trade union growth, decline and regeneration. Workers' need for "somebody to back me up" in the face of employer power never disappears. The labor movement built by workers in the United States over the past century is still a strong base for working class advances and the strengthening of collective bargaining in the years to come.