Lock-free Concurrent Data Structures
Concurrent data structures are the data sharing side of parallel programming.
Data structures give the means to the program to store data, but also provide
operations to the program to access and manipulate these data. These operations
are implemented through algorithms that have to be efficient. In the sequential
setting, data structures are crucially important for the performance of the
respective computation. In the parallel programming setting, their importance
grows further because of the increased use of data and resource sharing
for utilizing parallelism.
The first and main goal of this chapter is to provide sufficient background
and intuition to help the interested reader navigate the complex research
area of lock-free data structures. The second goal is to offer the programmer
familiarity with the subject that will allow her to use truly concurrent methods.
Comment: To appear in "Programming Multi-core and Many-core Computing
Systems", eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and
Distributed Computing
Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips
The trend in industry is towards heterogeneous multicore processors (HMCs),
including chips with CPUs and massively-threaded throughput-oriented processors
(MTTOPs) such as GPUs. Although current homogeneous chips tightly couple the
cores with cache-coherent shared virtual memory (CCSVM), this is not the
communication paradigm used by any current HMC. In this paper, we present a
CCSVM design for a CPU/MTTOP chip, as well as an extension of the pthreads
programming model, called xthreads, for programming this HMC. Our goal is to
evaluate the potential performance benefits of tightly coupling heterogeneous
cores with CCSVM
Building-in quality rather than assessing quality afterwards: a technological solution to ensuring computational accuracy in learning materials
Quality encompasses a very broad range of ideas in learning
materials, yet the accuracy of the content is often overlooked
as a measure of quality. Various aspects of accuracy are briefly
considered, and the issue of computational accuracy is then
considered further. When learning materials are produced
containing the results of mathematical computations, accuracy
is essential: but how can the results of these computations
be known to be correct? A solution is to embed the instructions
for performing the calculations in the materials, and let
the computer calculate the result and place it in the text. In
this way, quality is built into the learning materials by design,
not evaluated after the event. This is all accomplished using
the ideas of literate programming, applied to the learning materials
context. A small example demonstrates how remarkably
easy the ideas are to apply in practice using the appropriate
technology. Given that the technology is available and
is easy to use, it would appear imperative that the approach
discussed is adopted to improve quality in learning materials
containing computational results
Mixing multi-core CPUs and GPUs for scientific simulation software
Recent technological and economic developments have led to widespread availability of
multi-core CPUs and specialist accelerator processors such as graphical processing units
(GPUs). The accelerated computational performance possible from these devices can be very
high for some application paradigms. Software languages and systems such as NVIDIA's
CUDA and Khronos consortium's open compute language (OpenCL) support a number of
individual parallel application programming paradigms. To scale up the performance of some
complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and
very many core GPUs for data parallelism is necessary. We describe our use of hybrid
applications using threading approaches and multi-core CPUs to control independent GPU devices.
We present speed-up data and discuss multi-threading software issues for the applications
level programmer and offer some suggested areas for language development and integration
between coarse-grained and fine-grained multi-thread systems. We discuss results from three
common simulation algorithmic areas including: partial differential equations; graph cluster
metric calculations and random number generation. We report on programming experiences
and selected performance for these algorithms on: single and multiple GPUs; multi-core CPUs;
a CellBE; and using OpenCL. We discuss programmer usability issues and the outlook and
trends in multi-core programming for scientific applications developers
Fast processing of grid maps using graphical multiprocessors
Grid mapping is a very common technique used in mobile robotics to build a continuous 2D representation of the environment useful for navigation purposes. Although its computation is quite simple and fast, this algorithm uses the hypothesis of a known robot pose. In practice, this can require the re-computation of the map when the estimated robot poses change, as when a loop closure is detected. This paper presents a parallelization of a reference implementation of the grid mapping algorithm, which is suitable to be fully run on a graphics card showing huge processing speedups (up to 50×) while fully releasing the main processor, which can be very useful for many Simultaneous Localization and Mapping algorithms
Remote access for NAS: Supercomputing in a university environment
The experiment was designed to assist the Numerical Aerodynamic Simulation (NAS) Project Office in the testing and evaluation of long haul communications for remote users. The objectives of this work were to: (1) use foreign workstations to remotely access the NAS system; (2) provide NAS with a link to a large university-based computing facility which can serve as a model for a regional node of the Long-Haul Communications Subsystem (LHCS); and (3) provide a tail circuit to the University of Colorado at Boulder, thereby simulating the complete communications path from NAS through a regional node to an end-user