2,318 research outputs found

    Analyze Large Multidimensional Datasets Using Algebraic Topology

    This paper presents an efficient algorithm to extract knowledge from high-dimensionality, high-complexity datasets using algebraic topology, namely simplicial complexes. Based on the concept of isomorphism of relations, our method turns a relational table into a geometric object (a simplicial complex is a polyhedron). Conceptually, association rule searching thus becomes a geometric traversal problem. By leveraging the core concepts behind simplicial complexes, we use a technique (new in computer science) that improves performance over existing methods and uses far less memory. It was designed and developed with a strong emphasis on scalability, reliability, and extensibility. This paper also investigates the possibility of Hadoop integration and the challenges that come with the framework.
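
    The abstract above does not spell out its algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: each (column, value) pair of a relational table is treated as a vertex, and any vertex set shared by at least `min_support` rows is treated as a simplex. Because every subset of a frequent set is also frequent, the result is closed under taking faces, which is what makes it a simplicial complex. The function name `frequent_simplices`, the `min_support` parameter, and the level-wise search are illustrative choices, not the paper's method.

```python
from itertools import combinations


def frequent_simplices(rows, min_support):
    """Return {frozenset(vertices): support} for every vertex set that
    occurs together in at least `min_support` rows.

    A vertex is a (column, value) pair. Hypothetical helper for illustration.
    """
    # Each table row becomes the set of vertices it contains.
    row_sets = [frozenset(row.items()) for row in rows]
    # Level 0 candidates: every single vertex seen in the table.
    level = [frozenset([v]) for v in {v for r in row_sets for v in r}]
    found = {}
    while level:
        # Keep the candidates supported by enough rows.
        survivors = {}
        for simplex in level:
            support = sum(1 for r in row_sets if simplex <= r)
            if support >= min_support:
                survivors[simplex] = support
        found.update(survivors)
        # Build candidates one vertex larger; all of their faces must survive.
        keys = list(survivors)
        size = len(keys[0]) + 1 if keys else 0
        candidates = set()
        for a, b in combinations(keys, 2):
            union = a | b
            if len(union) == size and all(
                frozenset(face) in survivors
                for face in combinations(union, size - 1)
            ):
                candidates.add(union)
        level = list(candidates)
    return found


rows = [
    {"color": "red", "size": "L", "shape": "box"},
    {"color": "red", "size": "L", "shape": "ball"},
    {"color": "red", "size": "S", "shape": "box"},
    {"color": "blue", "size": "L", "shape": "box"},
]
simplices = frequent_simplices(rows, min_support=2)
```

    In this toy table the three pairwise combinations of red, L, and box are each supported by two rows, while the full triple appears in only one, so the sketch yields a hollow triangle: three vertices and three edges, with no filled face.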

    A Hybrid Multi-GPU Implementation of Simplex Algorithm with CPU Collaboration

    The simplex algorithm has been successfully used for many years in solving linear programming (LP) problems. Due to the intensive computations required (especially for large LP problems), parallel approaches have also been studied extensively. The computational power of modern GPUs and the rapid development of multicore CPU systems have made the OpenMP and CUDA programming models the preferred choices in recent years. However, efficient collaboration between CPU and GPU through the combined use of these programming models is still considered a hard research problem. In this context, we demonstrate a highly efficient implementation of the standard simplex method that aims to make the best possible concurrent use of all computing resources on a multicore platform with multiple CUDA-enabled GPUs. More concretely, we present a novel hybrid collaboration scheme based on the concurrent execution of suitably distributed CPU-assigned (via multithreading) and GPU-offloaded computations. The experimental results, obtained through the cooperative use of OpenMP and CUDA on a powerful modern hybrid platform (32 cores and two high-spec GPUs, a Titan RTX and an RTX 2080 Ti), show that the hybrid GPU/CPU collaboration scheme presented here is clearly superior to the GPU-only implementation under almost all conditions. The measurements confirm the value of using all resources concurrently, even on a multi-GPU platform. Furthermore, the given implementations are fully comparable (and slightly superior in most cases) to other related attempts in the literature, and clearly superior to the native CPU implementation with 32 cores. Comment: 12 pages
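
    For readers unfamiliar with the method being parallelized, here is a minimal sequential sketch of the standard (tableau) simplex method. It assumes a problem of the form maximize c·x subject to Ax ≤ b, x ≥ 0 with every b[i] ≥ 0, so that the slack variables give a starting basis. The function name `simplex_max`, the use of exact `Fraction` arithmetic, and Bland's anti-cycling rule are illustrative choices; none of this reproduces the hybrid OpenMP/CUDA scheme of the abstract. The row-update loop in the pivot step is the kind of dense, regular work that parallel implementations typically distribute.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, assuming all b[i] >= 0.

    Returns (optimal value, optimal x). Illustrative sketch only.
    """
    m, n = len(A), len(c)
    # Tableau rows are [A | I | b]; the last row is [-c | 0 | 0].
    T = [
        [Fraction(v) for v in A[i]]
        + [Fraction(int(i == j)) for j in range(m)]
        + [Fraction(b[i])]
        for i in range(m)
    ]
    T.append([Fraction(-v) for v in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]  # slack variables start in the basis

    while True:
        # Bland's rule: enter the lowest-index column with a negative cost.
        col = next((j for j in range(n + m) if T[m][j] < 0), None)
        if col is None:
            break  # no improving column, so the tableau is optimal
        # Ratio test; ties are broken by the lowest basic variable index.
        ratios = [
            (T[i][-1] / T[i][col], basis[i], i)
            for i in range(m)
            if T[i][col] > 0
        ]
        if not ratios:
            raise ValueError("problem is unbounded")
        _, _, row = min(ratios)
        # Pivot: normalize the pivot row, then eliminate the column elsewhere.
        pivot = T[row][col]
        T[row] = [v / pivot for v in T[row]]
        for i in range(m + 1):
            if i != row and T[i][col] != 0:
                factor = T[i][col]
                T[i] = [a - factor * p for a, p in zip(T[i], T[row])]
        basis[row] = col

    x = [Fraction(0)] * n
    for i, var in enumerate(basis):
        if var < n:
            x[var] = T[i][-1]
    return T[m][-1], x


# Example: maximize 3x + 5y with x <= 4, 2y <= 12, 3x + 2y <= 18.
value, x = simplex_max([3, 5], [[1, 0], [0, 2], [3, 2]], [4, 12, 18])
```

    Exact rational arithmetic keeps the sketch free of tolerance handling; a performance-oriented implementation would use floating point and would need explicit tolerances in the sign and ratio tests.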

    Solution of the Skyrme-Hartree-Fock-Bogolyubov equations in the Cartesian deformed harmonic-oscillator basis. (VII) HFODD (v2.49t): a new version of the program

    We describe the new version (v2.49t) of the code HFODD which solves the nuclear Skyrme Hartree-Fock (HF) or Skyrme Hartree-Fock-Bogolyubov (HFB) problem by using the Cartesian deformed harmonic-oscillator basis. In the new version, we have implemented the following physics features: (i) the isospin mixing and projection, (ii) the finite temperature formalism for the HFB and HF+BCS methods, (iii) the Lipkin translational energy correction method, (iv) the calculation of the shell correction. A number of specific numerical methods have also been implemented in order to deal with large-scale multi-constraint calculations and hardware limitations: (i) the two-basis method for the HFB method, (ii) the Augmented Lagrangian Method (ALM) for multi-constraint calculations, (iii) the linear constraint method based on the approximation of the RPA matrix for multi-constraint calculations, (iv) an interface with the axial and parity-conserving Skyrme-HFB code HFBTHO, (v) the mixing of the HF or HFB matrix elements instead of the HF fields. Special attention has been paid to using the code on massively parallel leadership-class computers. For this purpose, the following features are now available with this version: (i) the Message Passing Interface (MPI) framework, (ii) scalable input data routines, (iii) multi-threading via OpenMP pragmas, (iv) parallel diagonalization of the HFB matrix in the simplex-breaking case using the ScaLAPACK library. Finally, several minor errors in the previously published version were corrected. Comment: Accepted for publication in Computer Physics Communications. Program files re-submitted to Comp. Phys. Comm. Program Library after correction of several minor bugs

    Scalable Empirical Dynamic Modeling With Parallel Computing and Approximate k-NN Search

    Empirical Dynamic Modeling (EDM) is a mathematical framework for modeling and predicting non-linear time series data. Although EDM is increasingly adopted in various research fields, its application to large-scale data has been limited by its high computational cost. This article presents kEDM, a high-performance implementation of EDM for analyzing large-scale time series datasets. kEDM adopts the Kokkos performance-portable programming model to run efficiently on both CPU and GPU while sharing a single code base. We also conduct hardware-specific optimization of performance-critical kernels. kEDM achieved up to 6.58× speedup in pairwise causal inference of real-world biology datasets compared to an existing EDM implementation. Furthermore, we integrate multiple approximate k-NN search algorithms into EDM to enable the analysis of extremely large datasets that were intractable with conventional EDM based on exhaustive k-NN search. EDM-based time series forecasting enhanced with approximate k-NN search demonstrated up to 790× speedup compared to conventional Simplex projection, with less than a 1% increase in MAPE. Journal article
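
    As a rough illustration of the Simplex projection forecast mentioned above, the following sketch embeds a series in delay coordinates, finds the E + 1 nearest library vectors by exhaustive search, and averages their future values with exponentially decaying weights. It is not kEDM's code: the function names, the parameters `E`, `tau`, and `Tp`, and the handling of zero distances are assumptions made for this example, and the exhaustive neighbor search is exactly the step that the abstract proposes replacing with approximate k-NN.

```python
import math


def delay_embed(series, E, tau=1):
    """Return (time index, vector) pairs with
    vector = (x[t], x[t - tau], ..., x[t - (E - 1) * tau])."""
    start = (E - 1) * tau
    return [
        (t, tuple(series[t - j * tau] for j in range(E)))
        for t in range(start, len(series))
    ]


def simplex_forecast(library, target, E, tau=1, Tp=1):
    """Forecast the value Tp steps after the last point of `target`,
    using neighbors drawn from `library`. Illustrative sketch only."""
    # Library vectors must have an observed value Tp steps ahead.
    lib_points = [
        (t, v) for t, v in delay_embed(library, E, tau) if t + Tp < len(library)
    ]
    query = delay_embed(target, E, tau)[-1][1]
    # Exhaustive k-NN search with k = E + 1 neighbors.
    nearest = sorted((math.dist(query, v), t) for t, v in lib_points)[: E + 1]
    d_min = nearest[0][0]
    if d_min == 0:
        # Assumption: exact matches take all the weight.
        weights = [1.0 if d == 0 else 0.0 for d, _ in nearest]
    else:
        weights = [math.exp(-d / d_min) for d, _ in nearest]
    total = sum(weights)
    return sum(w * library[t + Tp] for w, (_, t) in zip(weights, nearest)) / total


# Example: a series that repeats 0, 1, 2, 3; the value after 0, 1, 2 is forecast.
library = [0, 1, 2, 3] * 10
prediction = simplex_forecast(library, target=[0, 1, 2], E=2)
```

    The cost of the sketch is dominated by sorting distances to every library vector for each query, which is why neighbor search is the natural target for acceleration on large datasets.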

    Limits on Fundamental Limits to Computation

    An indispensable part of our lives, computing has also become essential to industries and governments. Steady improvements in computer hardware have been supported by periodic doubling of transistor densities in integrated circuits over the last fifty years. Such Moore scaling now requires increasingly heroic efforts, stimulating research in alternative hardware and stirring controversy. To help evaluate emerging technologies and enrich our understanding of integrated-circuit scaling, we review fundamental limits to computation: in manufacturing, energy, physical space, design and verification effort, and algorithms. To outline what is achievable in principle and in practice, we recall how some limits were circumvented and compare loose and tight limits. We also point out that engineering difficulties encountered by emerging technologies may indicate yet-unknown limits. Comment: 15 pages, 4 figures, 1 table