2,187 research outputs found

    High-speed detection of emergent market clustering via an unsupervised parallel genetic algorithm

    Full text link
    We implement a master-slave parallel genetic algorithm (PGA) with a bespoke log-likelihood fitness function to identify emergent clusters within price evolutions. We use graphics processing units (GPUs) to implement a PGA and visualise the results using disjoint minimal spanning trees (MSTs). We demonstrate that our GPU PGA, implemented on a commercially available general purpose GPU, is able to recover stock clusters in sub-second speed, based on a subset of stocks in the South African market. This represents a pragmatic choice for low-cost, scalable parallel computing and is significantly faster than a prototype serial implementation in an optimised C-based fourth-generation programming language, although the results are not directly comparable due to compiler differences. Combined with fast online intraday correlation matrix estimation from high frequency data for cluster identification, the proposed implementation offers cost-effective, near-real-time risk assessment for financial practitioners.Comment: 10 pages, 5 figures, 4 tables, More thorough discussion of implementatio

    Analysis and Implementation of Room Assignment Problem and Cannon\u27s Algorithm on General Purpose Programmable Graphical Processing Units with CUDA

    Get PDF
    General-purpose Graphics Processing Units (GP-GPU) has emerged as a popular computing paradigm for high-performance computing over the last few years. The increased interest in GP-GPUs for parallel computing mirrors the trend in general computing with the rise of multi-core processors as an alternative approach to increase processor performance. Many applications that were previously accelerated on distributed processing platforms with MPI or multithreaded techniques such as OpenMP are now being investigated to assess their performance on GP-GPU platforms. Since the GP-GPU platform is designed to give higher performance for parallel problems, applications on other parallel architectures are good candidates for performance studies on GP-GPUs. The first case study in this research is a GP-GPU implementation of a Simulated Annealing-based solution of the Room Assignment problem using CUDA. The Room Assignment problem attempts to arrange N people in N/2 rooms, taking into consideration each person\u27s preference for a roommate. To evaluate the implementation, it was compared against the serial implementation for problem sizes 5000, 10000, 15000 and 20000 people. The GP-GPU implementation achieved as much as 78% higher improvement ratio than the serial version in comparable execution time. The second case study is a GP-GPU implementation of Cannon\u27s Algorithm using CUDA. The GP-GPU implementation is compared with a serial implementation of a conventional matrix multiplication O(n3). The GP-GPU implementation achieved upto 6.2x speedup over the conventional serial multiplication. The results for both applications with varying problem sizes are presented and discussed
    • …
    corecore