9,669 research outputs found
Orthogonal parallel processing in vector pascal
Despite the widespread adoption of parallel operations in contemporary CPU designs, their use has been restricted by a lack of appropriate programming language abstractions and development environments. To fully exploit the SIMD model of computation such operations offer, programmers depend on CPU specific machine code or implementation dependent libraries. Vector Pascal is a language designed to enable the elegant and efficient expression of SIMD algorithms. It imports into Pascal abstraction mechanisms derived from functional languages, in turn having their origins in APL. In particular, it extends all operators to work on vectors of data. The type system is also extended to handle pixels and dimensional analysis. Code generation is via the ILCG system that allows retargeting to multiple different SIMD instruction sets based on formalised descriptions of the instruction set semantics
Compiling vector pascal to the XeonPhi
Intel's XeonPhi is a highly parallel x86 architecture chip made by Intel. It has a number of novel features which make it a particularly challenging target for the compiler writer. This paper describes the techniques used to port the Glasgow Vector Pascal Compiler to this architecture and assess its performance by comparisons of the XeonPhi with 3 other machines running the same algorithms
Array languages and the N-body problem
This paper is a description of the contributions to the SICSA multicore challenge on many body
planetary simulation made by a compiler group at the University of Glasgow. Our group is part of
the Computer Vision and Graphics research group and we have for some years been developing array
compilers because we think these are a good tool both for expressing graphics algorithms and for
exploiting the parallelism that computer vision applications require.
We shall describe experiments using two languages on two different platforms and we shall compare
the performance of these with reference C implementations running on the same platforms. Finally
we shall draw conclusions both about the viability of the array language approach as compared to
other approaches used in the challenge and also about the strengths and weaknesses of the two, very
different, processor architectures we used
A compiler extension for parallelizing arrays automatically on the cell heterogeneous processor
This paper describes the approaches taken to extend an array
programming language compiler using a Virtual SIMD Machine (VSM)
model for parallelizing array operations on Cell Broadband Engine heterogeneous
machine. This development is part of ongoing work at the
University of Glasgow for developing array compilers that are beneficial
for applications in many areas such as graphics, multimedia, image processing
and scientific computation. Our extended compiler, which is built
upon the VSM interface, eases the parallelization processes by allowing
automatic parallelisation without the need for any annotations or process
directives. The preliminary results demonstrate significant improvement
especially on data-intensive applications
Acceleration of stereo-matching on multi-core CPU and GPU
This paper presents an accelerated version of a
dense stereo-correspondence algorithm for two different parallelism
enabled architectures, multi-core CPU and GPU. The
algorithm is part of the vision system developed for a binocular
robot-head in the context of the CloPeMa 1 research project.
This research project focuses on the conception of a new clothes
folding robot with real-time and high resolution requirements
for the vision system. The performance analysis shows that
the parallelised stereo-matching algorithm has been significantly
accelerated, maintaining 12x and 176x speed-up respectively
for multi-core CPU and GPU, compared with non-SIMD singlethread
CPU. To analyse the origin of the speed-up and gain
deeper understanding about the choice of the optimal hardware,
the algorithm was broken into key sub-tasks and the performance
was tested for four different hardware architectures
Parallel stereo vision algorithm
Integrating a stereo-photogrammetric robot
head into a real-time system requires software
solutions that rapidly resolve the stereo correspondence
problem. The stereo-matcher presented in this
paper uses therefore code parallelisation and was
tested on three different processors with x87 and AVX.
The results show that a 5mega pixels colour image can
be matched in 5,55 seconds or as monochrome in 3,3
seconds
Developing a compiler for the XeonPhi (TR-2014-341)
The XeonPhi is a highly parallel x86 architecture
chip made by Intel. It has a number of novel features which make it
a particularly challenging target for the compiler writer. This paper
describes the techniques used to port the Glasgow Vector Pascal Compiler (VPC)
to this architecture and assess its performance by comparisons of the XeonPhi with
3 other machines running the same algorithms
- …