A Novel Hybrid Quicksort Algorithm Vectorized using AVX-512 on Intel Skylake
The modern CPU's design, which is composed of hierarchical memory and
SIMD/vectorization capability, governs the potential for algorithms to be
transformed into efficient implementations. The release of AVX-512 changed
things radically, and motivated us to search for an efficient sorting algorithm
that can take advantage of it. In this paper, we describe the best strategy we
have found, which is a novel two-part hybrid sort based on the well-known
Quicksort algorithm. The central partitioning operation is performed by a new
algorithm, and small partitions/arrays are sorted using a branch-free
Bitonic-based sort. This study is also an illustration of how classical
algorithms can be adapted and enhanced by the AVX-512 extension. We evaluate
the performance of our approach on a modern Intel Xeon Skylake and assess the
different layers of our implementation by sorting/partitioning integers,
double-precision floating-point numbers, and key/value pairs of integers. Our
results demonstrate that our approach is faster than two reference libraries:
the GNU C++ sort algorithm by a speedup factor of 4, and the Intel IPP
library by a speedup factor of 1.4.
Comment: 8 pages, research pape
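To make the two-part structure described in the abstract concrete, here is a minimal scalar sketch in Python. It is not the paper's AVX-512 implementation: the cutoff `SMALL`, the function names, and the list-based partition step are all assumptions chosen for illustration. The partition compacts elements to either side of a pivot, loosely in the spirit of a vector compare-and-compress step, and small ranges are handed to a bitonic network built only from min/max compare-exchanges.

```python
import math

SMALL = 16  # hypothetical cutoff below which the bitonic network is used


def bitonic_sort_small(values):
    """Sort a short list of numbers with a bitonic network of min/max steps."""
    n = len(values)
    size = 1
    while size < n:
        size *= 2
    # Pad to a power of two with +inf so the padding ends up at the back.
    a = list(values) + [math.inf] * (size - n)
    k = 2
    while k <= size:
        j = k // 2
        while j >= 1:
            for i in range(size):
                partner = i ^ j
                if partner > i:
                    low = min(a[i], a[partner])
                    high = max(a[i], a[partner])
                    if i & k == 0:   # ascending block
                        a[i], a[partner] = low, high
                    else:            # descending block
                        a[i], a[partner] = high, low
            j //= 2
        k *= 2
    return a[:n]


def hybrid_sort(a, lo=0, hi=None):
    """Sort a[lo:hi] in place: quicksort on top, bitonic network at the leaves."""
    if hi is None:
        hi = len(a)
    if hi - lo <= SMALL:
        a[lo:hi] = bitonic_sort_small(a[lo:hi])
        return
    chunk = a[lo:hi]
    pivot = chunk[len(chunk) // 2]
    # Compact elements to the left or right of the pivot.
    less = [x for x in chunk if x < pivot]
    equal = [x for x in chunk if x == pivot]
    greater = [x for x in chunk if x > pivot]
    a[lo:hi] = less + equal + greater
    hybrid_sort(a, lo, lo + len(less))
    hybrid_sort(a, hi - len(greater), hi)
```

Calling `hybrid_sort(data)` sorts a list of numbers in place. The sketch only mirrors the control flow (partition, recurse, switch to a fixed network for small ranges); any performance benefit reported in the abstract comes from the vectorized implementation, which this code does not attempt to reproduce.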
From Concept to Reality to Vision
I take a brief look at three frontiers of high-energy physics, illustrating
how important parts of our current thinking evolved from earlier explorations
at preceding frontiers.
Comment: 7 pages; Speech in acceptance of EPS prize for high energy physics,
Aachen, August 200
On the average running time of odd-even merge sort
This paper is concerned with the average running time of Batcher's odd-even merge sort when implemented on a collection of processors. We consider the case where …, the size of the input, is an arbitrary multiple of the number of processors used. We show that Batcher's odd-even merge (for two sorted lists of length … each) can be implemented to run in time … on the average, and that odd-even merge sort can be implemented to run in time … on the average. In the case of merging (sorting), the average is taken over all possible outcomes of the merging (all possible permutations of elements). That means that odd-even merge and odd-even merge sort have an optimal average running time if …. The constants involved are also quite small.
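For readers unfamiliar with the algorithm the abstract analyzes, the following Python sketch generates the comparator sequence of Batcher's odd-even merge sort and applies it sequentially. It assumes the input length is a power of two and does not model the multi-processor setting or the average-case analysis of the paper; the function names are illustrative.

```python
def oddeven_merge(lo, hi, r):
    """Yield comparators merging the two sorted halves of indices lo..hi (inclusive)."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)        # even subsequence
        yield from oddeven_merge(lo + r, hi, step)    # odd subsequence
        yield from ((i, i + r) for i in range(lo + r, hi - r, step))
    else:
        yield (lo, lo + r)


def oddeven_merge_sort_range(lo, hi):
    """Yield comparators sorting indices lo..hi (inclusive); length must be a power of two."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort_range(lo, mid)
        yield from oddeven_merge_sort_range(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)


def oddeven_merge_sort(values):
    """Return a sorted copy of values by applying the comparator network in order."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of two in this sketch")
    for i, j in oddeven_merge_sort_range(0, n - 1):
        if a[i] > a[j]:
            a[i], a[j] = a[j], a[i]
    return a
```

Because the comparator sequence depends only on the input length and not on the data, independent comparators can be executed concurrently, which is what makes the network a natural fit for a collection of processors.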
Yang-Mills Theory In, Beyond, and Behind Observed Reality
The character of jets is dominated by the influence of intrinsically
nonabelian gauge dynamics. These proven insights into fundamental physics
ramify in many directions, and are far from being exhausted. I will discuss
three rewarding explorations from my own experience, whose point of departure
is the hard Yang-Mills interaction, and whose end is not yet in sight. Given an
insight so profound and fruitful as Yang and Mills brought us, it is in order
to try to consider its broadest implications, which I attempt at the end.
Comment: Solicited contribution to the volume "Fifty Years of Yang-Mills
Theory" (World Scientific). 12 p