
    Software Issues and Performance of a Parallel Model for Stock Option Pricing

    The finance industry is beginning to adopt parallel computing for numerical computation, and will soon be in a position to use parallel supercomputers. This paper examines software issues and performance of a stock option pricing model running on the Connection Machine-2 and DECmpp-12000. Pricing models incorporating stochastic volatility with American call (early exercise) are computationally intensive and require substantial communication. Three parallel versions of a stock option pricing model were developed, varying in data distribution, load balancing, and communication. The performance of this set of increasingly refined models ranged from no improvement to 10 times and 100 times faster than a sequential model. A straightforward approach to this problem uses two-dimensional dynamic arrays. When asymmetric arrays are mapped onto the DECmpp-12000, distribution of data to physical processors is inefficient and performance suffers. The regular communication patterns in the model can also be expressed in one-dimensional arrays, improving data distribution. Performance of this version is similar on both parallel machines. Combining one-dimensional parallel and sequential arrays achieves efficient data distribution, reduces interprocessor communication, and further improves performance (100 times faster than a sequential workstation model). The performance improvements possible on parallel supercomputers present new opportunities for pricing entire portfolios, performing large-scale model and market comparisons, and using optimization techniques to improve model price estimates.
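
    The abstract does not spell out the pricing algorithm itself, so the following is only a rough sketch of the "one-dimensional parallel plus sequential" layout it describes: an American call priced on a constant-volatility binomial lattice, keeping the asset-price dimension as a single vectorized (data-parallel) array while time is stepped sequentially. The function, its parameters, and the use of a plain binomial model (rather than the paper's stochastic-volatility model) are all illustrative assumptions.

```python
# Illustrative sketch only: a constant-volatility binomial lattice for an
# American call. The asset-price dimension lives in one flat array (the
# "parallel" dimension); time is stepped sequentially, mirroring the
# 1-D-parallel-plus-sequential layout described in the abstract.
import numpy as np

def american_call_binomial(S0, K, r, sigma, T, steps):
    dt = T / steps
    u = np.exp(sigma * np.sqrt(dt))           # up factor
    d = 1.0 / u                                # down factor
    p = (np.exp(r * dt) - d) / (u - d)         # risk-neutral up probability
    disc = np.exp(-r * dt)

    # Terminal asset prices: one 1-D array over lattice nodes (data-parallel).
    j = np.arange(steps + 1)
    S = S0 * u**j * d**(steps - j)
    V = np.maximum(S - K, 0.0)                 # payoff at maturity

    # Sequential backward induction in time; each step is a vector operation.
    for n in range(steps - 1, -1, -1):
        j = np.arange(n + 1)
        S = S0 * u**j * d**(n - j)
        cont = disc * (p * V[1:n + 2] + (1.0 - p) * V[:n + 1])
        V = np.maximum(cont, S - K)            # early-exercise check
    return V[0]

print(american_call_binomial(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0, steps=500))
```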

    Sublinearly space bounded iterative arrays

    Iterative arrays (IAs) are a parallel computational model with sequential processing of the input. They are one-dimensional arrays of interacting identical deterministic finite automata. In this note, realtime IAs with sublinear space bounds are used to accept formal languages. The existence of a proper hierarchy of space complexity classes between logarithmic and linear space bounds is proved. Furthermore, an optimal space lower bound for non-regular language recognition is shown. Key words: Iterative arrays, cellular automata, space bounded computations, decidability questions, formal languages, theory of computation.
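
    To make the model concrete, here is a minimal toy simulator of the iterative-array idea described above: a one-dimensional row of identical finite-state cells, where only cell 0 (the communication cell) reads the input, one symbol per time step. The transition rule, end-marker, and acceptance condition are placeholders of my own, not anything from the paper.

```python
# Minimal toy simulator of an iterative array (IA): a 1-D row of identical
# finite-state cells; the input is fed symbol by symbol to cell 0 only.
# Rule, end-marker and acceptance condition are illustrative placeholders.

def run_ia(word, cells, transition, accepting, quiescent="q", end_marker="#"):
    states = [quiescent] * cells                   # all cells start quiescent
    stream = list(word) + [end_marker] * cells     # pad input once consumed
    for t in range(len(stream)):
        new_states = []
        for i in range(cells):
            left = states[i - 1] if i > 0 else None
            right = states[i + 1] if i < cells - 1 else None
            symbol = stream[t] if i == 0 else None # only cell 0 reads input
            new_states.append(transition(states[i], left, right, symbol))
        states = new_states
    return states[0] in accepting                  # cell 0 decides acceptance

# Example rule: track the parity of 'a's in the communication cell alone.
def parity(state, left, right, symbol):
    if symbol == "a":
        return "odd" if state in ("q", "even") else "even"
    return "even" if state == "q" else state

print(run_ia("abab", cells=4, transition=parity, accepting={"even"}))  # True
print(run_ia("ab",   cells=4, transition=parity, accepting={"even"}))  # False
```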

    Active data structures on GPGPUs

    Active data structures support operations that may affect a large number of elements of an aggregate data structure. They are well suited to extremely fine-grain parallel systems, including circuit parallelism. General-purpose GPUs were designed to support regular graphics algorithms, but their intermediate level of granularity makes them potentially viable for active data structures as well. We consider the characteristics of active data structures and discuss the feasibility of implementing them on GPGPUs. We describe GPU implementations of two such data structures (ESF arrays and index intervals), assess their performance, and discuss the potential of active data structures as an unconventional programming model that can exploit the capabilities of emerging fine-grain architectures such as GPUs.
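
    As a rough illustration of the general idea (an operation on the structure touches many elements at once, expressed as a single data-parallel update), the sketch below uses NumPy as a stand-in for a GPU kernel. The class, its bulk operations, and their semantics are assumptions for illustration; they are not the paper's ESF arrays or index intervals.

```python
# Illustrative sketch of an "active" aggregate: each logical operation is a
# bulk, data-parallel update over many elements. NumPy stands in for a GPU
# kernel; the structure's semantics here are placeholders.
import numpy as np

class BulkArray:
    def __init__(self, values):
        self.values = np.asarray(values, dtype=np.float64)

    def add_on_interval(self, lo, hi, delta):
        # One logical operation updates every element in [lo, hi) in parallel.
        self.values[lo:hi] += delta

    def compact_nonzero(self):
        # Aggregate operation: drop zero elements via a parallel mask.
        self.values = self.values[self.values != 0.0]

a = BulkArray([0.0, 1.0, 2.0, 0.0, 3.0])
a.add_on_interval(1, 4, 10.0)   # affects three elements in one call
a.compact_nonzero()
print(a.values)                 # [11. 12. 10.  3.]
```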

    Optimisation of a parallel ocean general circulation model

    This paper presents the development of a general-purpose parallel ocean circulation model for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers.
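
    In the spirit of "parallelism as a modular option behind high-level routines", here is a minimal sketch of how such a wrapper could look, assuming mpi4py and a simple 1-D domain decomposition. The routine name, the decomposition, and the serial fallback are illustrative assumptions, not the model's actual code.

```python
# Sketch of a high-level halo-exchange wrapper: the model code calls
# exchange_halos() and never touches MPI directly. When mpi4py is missing
# or only one rank is present, the routine degrades to a no-op.
import numpy as np

try:
    from mpi4py import MPI
    _comm = MPI.COMM_WORLD
except ImportError:              # serial fallback: behave like a single rank
    _comm = None

def exchange_halos(field):
    """Fill the one-row halos of a locally decomposed 2-D field (rows split across ranks)."""
    if _comm is None or _comm.Get_size() == 1:
        return                                   # nothing to exchange serially
    rank, size = _comm.Get_rank(), _comm.Get_size()
    up = rank - 1 if rank > 0 else MPI.PROC_NULL
    down = rank + 1 if rank < size - 1 else MPI.PROC_NULL

    # Send first interior row up, receive into bottom halo, and vice versa.
    _comm.Sendrecv(np.ascontiguousarray(field[1, :]), dest=up,
                   recvbuf=field[-1, :], source=down)
    _comm.Sendrecv(np.ascontiguousarray(field[-2, :]), dest=down,
                   recvbuf=field[0, :], source=up)
```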

    Managing Communication Latency-Hiding at Runtime for Parallel Programming Languages and Libraries

    This work introduces a runtime model for managing communication with support for latency-hiding. The model enables non-computer-science researchers to exploit communication latency-hiding techniques seamlessly. For compiled languages it is often possible to create efficient schedules for communication, but this is not the case for interpreted languages. By maintaining data dependencies between scheduled operations, it is possible to aggressively initiate communication and lazily evaluate tasks, allowing maximal time for the communication to finish before entering a wait state. We implement a heuristic of this model in DistNumPy, an auto-parallelizing version of numerical Python that allows sequential NumPy programs to run on distributed memory architectures. Furthermore, we present performance comparisons for eight benchmarks with and without automatic latency-hiding. The results show that our model reduces the time spent waiting for communication by as much as a factor of 27, from a maximum of 54% to only 2% of the total execution time, in a stencil application.
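
    The pattern described here, initiate communication early, keep computing on local data, and wait only when the dependent value is needed, can be sketched as follows with mpi4py non-blocking calls. This is a hand-written illustration under a two-rank assumption, not DistNumPy's actual dependency-tracking scheduler.

```python
# Minimal sketch of the latency-hiding pattern: start non-blocking
# communication aggressively, overlap it with independent work, and wait
# lazily, just before the dependent operation needs the data.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
other = 1 - rank                      # assumes exactly two ranks

local = np.full(1_000_000, float(rank))
remote = np.empty_like(local)

# 1) Aggressively initiate communication (non-blocking).
reqs = [comm.Isend(local, dest=other),
        comm.Irecv(remote, source=other)]

# 2) Do independent work while the transfer is in flight.
partial = np.sum(local * local)

# 3) Lazily wait only when the dependent value is actually required.
MPI.Request.Waitall(reqs)
total = partial + np.sum(remote)
print(f"rank {rank}: {total}")
```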