31,419 research outputs found
An Efficient Multiway Mergesort for GPU Architectures
Sorting is a primitive operation that is a building block for countless
algorithms. As such, it is important to design sorting algorithms that approach
peak performance on a range of hardware architectures. Graphics Processing
Units (GPUs) are particularly attractive architectures as they provides massive
parallelism and computing power. However, the intricacies of their compute and
memory hierarchies make designing GPU-efficient algorithms challenging. In this
work we present GPU Multiway Mergesort (MMS), a new GPU-efficient multiway
mergesort algorithm. MMS employs a new partitioning technique that exposes the
parallelism needed by modern GPU architectures. To the best of our knowledge,
MMS is the first sorting algorithm for the GPU that is asymptotically optimal
in terms of global memory accesses and that is completely free of shared memory
bank conflicts.
We realize an initial implementation of MMS, evaluate its performance on
three modern GPU architectures, and compare it to competitive implementations
available in state-of-the-art GPU libraries. Despite these implementations
being highly optimized, MMS compares favorably, achieving performance
improvements for most random inputs. Furthermore, unlike MMS, state-of-the-art
algorithms are susceptible to bank conflicts. We find that for certain inputs
that cause these algorithms to incur large numbers of bank conflicts, MMS can
achieve up to a 37.6% speedup over its fastest competitor. Overall, even though
its current implementation is not fully optimized, due to its efficient use of
the memory hierarchy, MMS outperforms the fastest comparison-based sorting
implementations available to date
Energy Saving Techniques for Phase Change Memory (PCM)
In recent years, the energy consumption of computing systems has increased
and a large fraction of this energy is consumed in main memory. Towards this,
researchers have proposed use of non-volatile memory, such as phase change
memory (PCM), which has low read latency and power; and nearly zero leakage
power. However, the write latency and power of PCM are very high and this,
along with limited write endurance of PCM present significant challenges in
enabling wide-spread adoption of PCM. To address this, several
architecture-level techniques have been proposed. In this report, we review
several techniques to manage power consumption of PCM. We also classify these
techniques based on their characteristics to provide insights into them. The
aim of this work is encourage researchers to propose even better techniques for
improving energy efficiency of PCM based main memory.Comment: Survey, phase change RAM (PCRAM
A review of data visualization: opportunities in manufacturing sequence management.
Data visualization now benefits from developments in technologies that offer innovative ways of presenting complex data. Potentially these have widespread application in communicating the complex information domains typical of manufacturing sequence management environments for global enterprises. In this paper the authors review the visualization functionalities, techniques and applications reported in literature, map these to manufacturing sequence information presentation requirements and identify the opportunities available and likely development paths. Current leading-edge practice in dynamic updating and communication with suppliers is not being exploited in manufacturing sequence management; it could provide significant benefits to manufacturing business. In the context of global manufacturing operations and broad-based user communities with differing needs served by common data sets, tool functionality is generally ahead of user application
- …