9,669 research outputs found

    Orthogonal parallel processing in vector pascal

    Get PDF
    Despite the widespread adoption of parallel operations in contemporary CPU designs, their use has been restricted by a lack of appropriate programming language abstractions and development environments. To fully exploit the SIMD model of computation such operations offer, programmers depend on CPU specific machine code or implementation dependent libraries. Vector Pascal is a language designed to enable the elegant and efficient expression of SIMD algorithms. It imports into Pascal abstraction mechanisms derived from functional languages, in turn having their origins in APL. In particular, it extends all operators to work on vectors of data. The type system is also extended to handle pixels and dimensional analysis. Code generation is via the ILCG system that allows retargeting to multiple different SIMD instruction sets based on formalised descriptions of the instruction set semantics

    Compiling vector pascal to the XeonPhi

    Get PDF
    Intel's XeonPhi is a highly parallel x86 architecture chip made by Intel. It has a number of novel features which make it a particularly challenging target for the compiler writer. This paper describes the techniques used to port the Glasgow Vector Pascal Compiler to this architecture and assess its performance by comparisons of the XeonPhi with 3 other machines running the same algorithms

    Array languages and the N-body problem

    Get PDF
    This paper is a description of the contributions to the SICSA multicore challenge on many body planetary simulation made by a compiler group at the University of Glasgow. Our group is part of the Computer Vision and Graphics research group and we have for some years been developing array compilers because we think these are a good tool both for expressing graphics algorithms and for exploiting the parallelism that computer vision applications require. We shall describe experiments using two languages on two different platforms and we shall compare the performance of these with reference C implementations running on the same platforms. Finally we shall draw conclusions both about the viability of the array language approach as compared to other approaches used in the challenge and also about the strengths and weaknesses of the two, very different, processor architectures we used

    A compiler extension for parallelizing arrays automatically on the cell heterogeneous processor

    Get PDF
    This paper describes the approaches taken to extend an array programming language compiler using a Virtual SIMD Machine (VSM) model for parallelizing array operations on Cell Broadband Engine heterogeneous machine. This development is part of ongoing work at the University of Glasgow for developing array compilers that are beneficial for applications in many areas such as graphics, multimedia, image processing and scientific computation. Our extended compiler, which is built upon the VSM interface, eases the parallelization processes by allowing automatic parallelisation without the need for any annotations or process directives. The preliminary results demonstrate significant improvement especially on data-intensive applications

    Acceleration of stereo-matching on multi-core CPU and GPU

    Get PDF
    This paper presents an accelerated version of a dense stereo-correspondence algorithm for two different parallelism enabled architectures, multi-core CPU and GPU. The algorithm is part of the vision system developed for a binocular robot-head in the context of the CloPeMa 1 research project. This research project focuses on the conception of a new clothes folding robot with real-time and high resolution requirements for the vision system. The performance analysis shows that the parallelised stereo-matching algorithm has been significantly accelerated, maintaining 12x and 176x speed-up respectively for multi-core CPU and GPU, compared with non-SIMD singlethread CPU. To analyse the origin of the speed-up and gain deeper understanding about the choice of the optimal hardware, the algorithm was broken into key sub-tasks and the performance was tested for four different hardware architectures

    Parallel stereo vision algorithm

    Get PDF
    Integrating a stereo-photogrammetric robot head into a real-time system requires software solutions that rapidly resolve the stereo correspondence problem. The stereo-matcher presented in this paper uses therefore code parallelisation and was tested on three different processors with x87 and AVX. The results show that a 5mega pixels colour image can be matched in 5,55 seconds or as monochrome in 3,3 seconds

    Developing a compiler for the XeonPhi (TR-2014-341)

    Get PDF
    The XeonPhi is a highly parallel x86 architecture chip made by Intel. It has a number of novel features which make it a particularly challenging target for the compiler writer. This paper describes the techniques used to port the Glasgow Vector Pascal Compiler (VPC) to this architecture and assess its performance by comparisons of the XeonPhi with 3 other machines running the same algorithms
    corecore