4 research outputs found

    Evaluation of Design Tools for Rapid Prototyping of Parallel Signal Processing Algorithms

    Digital signal processing (DSP) has become a popular method for handling not only signal processing but also communications and control system applications. A DSP application of interest to the Air Force is high-speed avionics processing. The real-time computing requirements of avionics processing exceed the capabilities of current single-chip DSP processors, and parallelization across multiple DSP processors is one way to meet them. Designing and implementing a parallel DSP algorithm has traditionally been a lengthy process, often requiring several design tools and extensive programming experience. Integrated software development tools make rapid prototyping possible by simulating algorithms, generating code for workstations or DSP microprocessors, and generating hardware description language code for hardware synthesis. This research examines one such tool, the Signal Processing WorkSystem (SPW) by the Alta Group of Cadence Design Systems, Inc., and how SPW supports the rapid prototyping process from avionics algorithm design through simulation and hardware implementation. Throughout this process, SPW is evaluated as an aid that helps the avionics designer meet design objectives and evaluate trade-offs to find the best blend of efficiency and effectiveness. By designing a two-dimensional fast Fourier transform (2-D FFT) as a specific avionics algorithm and exploring implementation options, SPW is shown to be a viable rapid prototyping solution that lets an avionics designer focus on design trade-offs instead of implementation details while using parallelization to meet real-time application requirements.
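    The 2-D FFT parallelizes naturally through row-column decomposition, which is part of what makes it a good rapid-prototyping case study. The sketch below illustrates that decomposition in Haskell; it is an illustration of the idea only, not SPW-generated code, and the function names are our own:

        import Data.Complex (Complex, cis)
        import Data.List (transpose)
        import Control.Parallel.Strategies (parMap, rdeepseq)

        -- Radix-2 Cooley-Tukey FFT; input length must be a power of two.
        fft :: [Complex Double] -> [Complex Double]
        fft [x] = [x]
        fft xs  = zipWith (+) es ts ++ zipWith (-) es ts
          where
            n          = length xs
            (evs, ods) = deinterleave xs
            es         = fft evs
            ts         = zipWith (\k o -> cis (-2 * pi * fromIntegral k
                                               / fromIntegral n) * o)
                                 [0 ..] (fft ods)
            deinterleave (a:b:rest) = let (as, bs) = deinterleave rest
                                      in (a:as, b:bs)
            deinterleave r          = (r, [])

        -- Row-column decomposition: FFT every row, transpose, FFT again.
        -- Each row transform is independent, hence the parallel map.
        fft2d :: [[Complex Double]] -> [[Complex Double]]
        fft2d = transpose . rows . transpose . rows
          where rows = parMap rdeepseq fft

    The same independence is what a designer exploits when mapping the transform onto multiple DSP processors: each processor takes a share of the rows, and only the transpose step requires communication.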

    Implementation and Evaluation of Algorithmic Skeletons: Parallelisation of Computer Algebra Algorithms

    This thesis presents design and implementation approaches for parallel algorithms of computer algebra. We use algorithmic skeletons as well as further approaches such as data parallel arithmetic and actors. We have implemented skeletons for divide-and-conquer algorithms and for special parallel loops that we call 'repeated computation with a possibility of premature termination'. We also introduce a rational data parallel arithmetic. We focus on parallel symbolic computation algorithms; for these, our arithmetic provides a generic parallelisation approach. The implementation is carried out in Eden, a parallel functional programming language based on Haskell. This choice enables us to encode both the skeletons and the programs in the same language, and it spares us from using two different languages, one for the implementation and one for the interface, in our implementation of computer algebra algorithms.

    Further, this thesis presents methods for the evaluation and estimation of parallel execution times. We partition the parallel execution time into two components. One of them accounts for the quality of the parallelisation; we call it the 'parallel penalty'. The other is the sequential execution time. For the estimation, we predict both components separately, using statistical methods. This yields highly confident estimates while using drastically fewer measurement points than other methods. We have applied both our evaluation and estimation approaches to the parallel programs presented in this thesis, and we have also used existing estimation methods.

    We developed divide-and-conquer skeletons for the implementation of fast parallel multiplication. We have implemented the Karatsuba algorithm, Strassen's matrix multiplication algorithm, and the fast Fourier transform. The latter was used to implement polynomial convolution, which leads to a further fast multiplication algorithm. Specifically for our implementation of Strassen's algorithm, we designed and implemented a divide-and-conquer skeleton based on actors. For the parallel fast Fourier transform we used not only new divide-and-conquer skeletons but also a newly developed map-and-transpose skeleton, which enables good parallelisation of the Fourier transform. The parallelisation of Karatsuba multiplication shows very good performance. We have analysed the parallel penalty of our programs and compared it to the serial fraction, an approach known from the literature. We also performed execution time estimations of our divide-and-conquer programs.

    This thesis further presents a parallel map+reduce skeleton scheme. It allows us to combine the usual parallel map skeletons, such as parMap, farm, and workpool, with a premature termination property. We use this to implement the so-called 'parallel repeated computation', a special form of a speculative parallel loop. We have implemented two probabilistic primality tests, the Rabin–Miller test and the Jacobi sum test, and parallelised both with our approach. We analysed the task distribution and identified fitting configurations for the Jacobi sum test. We have shown formally that the Jacobi sum test can be implemented in parallel. Subsequently, we parallelised it, analysed the load-balancing issues, and produced an optimisation. The latter enabled a good implementation, as verified using the parallel penalty. We have also estimated the performance of the tests for further input sizes and numbers of processing elements.
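    The divide-and-conquer skeletons mentioned above all share one shape: the algorithm supplies predicate, solve, divide, and combine functions, and the skeleton fixes the parallel evaluation once. The following is a minimal sketch in plain Haskell, using GHC's Control.Parallel.Strategies rather than Eden's process primitives, so the names below are illustrative and not the thesis's API:

        import Control.Parallel.Strategies (parList, rdeepseq, using)
        import Control.DeepSeq (NFData)

        -- Generic divide-and-conquer skeleton: the algorithm-specific
        -- parts are parameters; the parallelism is fixed here once.
        divConq :: NFData b
                => (a -> Bool)      -- is the instance trivial?
                -> (a -> b)         -- solve a trivial instance directly
                -> (a -> [a])      -- divide into subproblems
                -> (a -> [b] -> b) -- combine sub-results
                -> a -> b
        divConq trivial solve divide combine = go
          where
            go x
              | trivial x = solve x
              | otherwise =
                  combine x (map go (divide x) `using` parList rdeepseq)

    Karatsuba, Strassen's algorithm, and the FFT then differ only in the four parameters they pass in; a production skeleton would additionally take a depth threshold to stop spawning parallel tasks below a certain problem size, which is omitted here.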
    Parallelisation of the Jacobi sum test and our generic parallelisation scheme for the repeated computation are our original contributions. The data parallel arithmetic was defined not only for integers, which is already known, but also for rationals. We handle common factors between the numerator or denominator of a fraction and the modulus in a novel manner; this is required to obtain a true multiple-residue arithmetic, a novel result of our research. Using these mathematical advances, we have parallelised determinant computation using Gauß elimination. As always, we performed a task distribution analysis and an estimation of the parallel execution time of our implementation. A similar computation in Maple emphasised the potential of our approach. Data parallel arithmetic enables parallelisation of entire classes of computer algebra algorithms. Summarising, this thesis presents and thoroughly evaluates new and existing design decisions for high-level parallelisations of computer algebra algorithms.
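    The integer half of this multiple-residue arithmetic, the part the abstract calls already known, is easy to illustrate: a value is represented by its residues modulo pairwise-coprime primes, and every arithmetic operation acts componentwise, so the residues can be processed in parallel. A sketch with example moduli follows; the thesis's novel rational extension, handling common factors between numerator or denominator and a modulus, is deliberately not reproduced here:

        -- Multiple-residue representation: one residue per modulus.
        -- Arithmetic is componentwise, hence data parallel.
        newtype MultiRes = MultiRes [Integer] deriving Show

        -- Example moduli: the three largest primes below 2^31.
        moduli :: [Integer]
        moduli = [2147483647, 2147483629, 2147483587]

        toRes :: Integer -> MultiRes
        toRes n = MultiRes [n `mod` m | m <- moduli]

        addRes, mulRes :: MultiRes -> MultiRes -> MultiRes
        addRes (MultiRes xs) (MultiRes ys) =
          MultiRes (zipWith3 (\x y m -> (x + y) `mod` m) xs ys moduli)
        mulRes (MultiRes xs) (MultiRes ys) =
          MultiRes (zipWith3 (\x y m -> (x * y) `mod` m) xs ys moduli)

    Converting back from residues to an integer goes through the Chinese remainder theorem; for rationals, a denominator may share a factor with a modulus, and handling that case correctly is precisely the contribution the abstract describes.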

    Software for Exascale Computing - SPPEXA 2016-2019

    This open access book summarizes the research done and results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG), presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer's series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA's first funding phase, and provides an overview of SPPEXA's contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.

    Autonomic visualisation.

    This thesis introduces the concept of autonomic visualisation, which brings principles of autonomic systems to the field of visualisation infrastructure. Problems in visualisation have a specific set of requirements that are not always met by existing systems. The first half of this thesis explores a specific problem for large-scale visualisation: that of data management. Visualisation algorithms have somewhat different requirements from other external-memory problems, because they often require access to all, or a large subset, of the data in a way that is highly dependent on the view. This thesis proposes a knowledge-based approach to pre-fetching in this context, and presents evidence that such an approach yields good performance. The knowledge-based approach is incorporated into a five-layer model, which provides a systematic way of categorising and designing out-of-core, or external-memory, systems. This model is demonstrated with two example implementations, one in the local and one in the remote context. The second half explores autonomic visualisation in the more general case. A simulation tool created for designing autonomic visualisation infrastructure is presented. This tool, SimEAC, facilitates the development of techniques for managing large-scale visualisation systems. The abstract design of the simulation system, as well as details of the implementation, is presented. The architecture of the simulator is explored, and the system is then evaluated in a number of case studies indicating some of the ways in which it can be used. The simulator provides a framework for experimentation and rapid prototyping of large-scale autonomic systems.
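    The knowledge-based prefetching idea can be made concrete with a deliberately toy sketch in Haskell: domain knowledge enters as a scoring function that predicts, from the current view and its motion, which out-of-core blocks will be needed next, and the top-scoring blocks are fetched ahead of demand. All names below are hypothetical; neither the thesis's five-layer model nor SimEAC's API is reproduced:

        import Data.List (sortBy)
        import Data.Ord (Down (..), comparing)

        type BlockId = Int

        -- A camera view and its motion; the predictor extrapolates
        -- from these to guess upcoming data accesses.
        data View = View { position :: (Double, Double, Double)
                         , velocity :: (Double, Double, Double) }

        -- Knowledge-based prefetch: 'score' encodes the domain
        -- knowledge; we fetch the top candidates before the renderer
        -- asks for them.
        prefetchPlan :: (View -> BlockId -> Double) -- knowledge score
                     -> Int                         -- prefetch budget
                     -> View -> [BlockId] -> [BlockId]
        prefetchPlan score budget view blocks =
          take budget (sortBy (comparing (Down . score view)) blocks)

    A full system must also decide where such a predictor sits relative to the cache, the renderer, and the storage layers; the five-layer model above gives a systematic way to organise that decision.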